WO2023044676A1 - Control method for multiple robots working cooperatively, system and robot - Google Patents


Info

Publication number
WO2023044676A1
WO2023044676A1 (PCT/CN2021/119981; CN2021119981W)
Authority
WO
WIPO (PCT)
Prior art keywords
robot
scene information
robots
neural network
network model
Prior art date
Application number
PCT/CN2021/119981
Other languages
French (fr)
Chinese (zh)
Inventor
Du Feng (杜峰)
Wu Jianqiang (吴剑强)
Li Tao (李韬)
Original Assignee
Siemens Ltd., China (西门子(中国)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Ltd., China (西门子(中国)有限公司)
Priority to PCT/CN2021/119981
Publication of WO2023044676A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00: Programme-controlled manipulators
    • B25J 9/16: Programme controls
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 19/00: Programme-control systems
    • G05B 19/02: Programme-control systems electric

Definitions

  • the embodiments of the present application relate to the technical field of industrial control, and in particular to a control method, system and robot for a plurality of robots working together.
  • for example, when two robots work together to complete a common task, the first robot 1 performs handling operations on the third object 103, the fourth object 104, the fifth object 105, and the sixth object 106, while the second robot 2 carries the first object 101, the second object 102, and the seventh object 107.
  • to avoid a collision between the first robot 1 and the second robot 2 in collision area A, real-time communication between the first robot 1 and the second robot 2 is required, which increases the control cost of multiple robots working together.
  • the embodiment of the present application provides a control scheme for a plurality of robots working together, which can realize a plurality of robots working together to complete a common task without real-time communication between the robots.
  • a method for controlling the operation of multiple robots, including: receiving scene information captured by each robot, the scene information including: local robot scene information and other robot scene information; using a control algorithm to calculate on the scene information to obtain the action command corresponding to each robot; and sending the corresponding action command to each robot, so that each robot executes the corresponding action command and the robots cooperate to complete at least one common task.
  • each robot captures scene information separately, and each robot uses a control algorithm to perform calculations based on the scene information and obtain its corresponding action command, so that each robot executes the corresponding action command and the robots work together to accomplish at least one common task.
  • the robots realize cooperative work through the captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them.
  • the embodiment of the present application reduces the control cost of multiple robots working together, is easy to maintain and upgrade, and is suitable for handling various common tasks.
  • the scene information includes: at least one of robot running image, robot running force, robot running distance, and robot running angle.
  • the action command includes: robot movement rotation angle and/or robot movement torque.
  • the robot can be precisely controlled to complete the task.
  • the scene information is captured by at least one sensor installed on each robot.
  • the scene information is captured by the sensor, which facilitates the analysis of the states of the robots working together, so as to control the robots to work.
  • the use of a control algorithm to calculate the scene information to obtain the action commands corresponding to each robot includes:
  • a control algorithm is used to compute, from the state of the local robot, the states of the other robots, and the time-series-based action set and state set, the action command corresponding to each robot.
  • the embodiment of the present application not only obtains the state of the local robot through the scene information, but also obtains the state of other robots, so as to avoid collisions between the local robot operating under the action command and other robots operating under the action command.
  • control algorithm is learned and obtained according to a deep reinforcement learning neural network model.
  • the deep reinforcement learning neural network model is obtained by training in a virtual environment with various robots; and/or the deep reinforcement learning neural network model is obtained by training multiple robots to perform various common tasks.
  • the deep reinforcement learning neural network model is obtained by training with mixed data of virtual data obtained in a virtual environment combined with actual data tested by a single robot on site.
  • the embodiment of the present application further improves the training effect of the deep reinforcement learning neural network model through various training samples.
  • the deep reinforcement learning neural network model is trained and obtained by using actual data of a plurality of robots tested in the field.
  • the embodiment of the present application further improves the training effect of the deep reinforcement learning neural network model through various training samples.
  • the deep reinforcement learning neural network model is a continuous or discrete function.
  • the deep reinforcement learning neural network model adopts an asynchronous structure during training.
  • an operation control system for multiple robots, including: multiple robots, each of which captures scene information, the scene information including: local robot scene information and other robot scene information; a control algorithm is used to calculate on the scene information to obtain the action command corresponding to each robot; and the robots execute the corresponding action commands and work together to complete at least one common task.
  • control algorithm is learned and obtained according to a deep reinforcement learning neural network model.
  • a robot sends the captured scene information to the controller, the scene information including: local robot scene information and other robot scene information; the controller uses a control algorithm to calculate on the scene information to obtain the action command corresponding to each robot; and the robot executes the corresponding action command and cooperates with other robots to complete at least one common task.
  • control algorithm is learned and obtained according to a deep reinforcement learning neural network model.
  • a computer program product, tangibly stored on a readable medium of a controller and containing computer-executable instructions which, when executed, cause at least one processor to execute any one of the above methods.
  • a computer-readable medium on which computer-executable instructions are stored, and when executed, the computer-executable instructions cause at least one processor to execute any one of the above-mentioned methods.
  • the controller receives the scene information captured by each robot, uses the control algorithm to calculate according to the scene information, and sends the corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task.
  • the robots realize cooperative work through the captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them.
  • the embodiment of the present application reduces the control cost of multiple robots working together, is easy to maintain and upgrade, and is suitable for handling various common tasks.
  • FIG. 1 is a schematic diagram of a system in which a plurality of robots work together according to an embodiment of the present application;
  • FIG. 2 is a flow chart of the steps of a method for a plurality of robots to work together according to an embodiment of the present application
  • FIG. 3 is a flow chart of step S2 of a method for a plurality of robots to work together according to an embodiment of the present application;
  • FIG. 4 is a schematic diagram of a deep reinforcement learning neural network model of an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the actual training situation of the deep reinforcement learning neural network model of the embodiment of the present application.
  • S1 Each robot captures scene information separately, and the scene information includes: local robot scene information and other robot scene information
  • In the field of industrial control, in order to ensure low latency of data transmission, a controller is usually installed for each robot, and the control operation of the robot is realized by each robot's local controller. In some complex industrial control scenarios, multiple robots need to work together to complete common tasks, and the robots need to communicate in real time to avoid damage caused by collisions with each other. In this application, each robot captures scene information separately, and a control algorithm generates the action command corresponding to each robot based on the scene information, controlling multiple robots to work together without real-time communication between the robots.
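The flow just described (each robot captures scene information, a control algorithm turns it into per-robot action commands, and the robots execute them) can be sketched as follows. This is a minimal illustration only; the `Robot`, `SceneInfo`, and `control_algorithm` names and the one-dimensional workspace are assumptions, not the patent's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class SceneInfo:
    # Scene information as described: local robot data plus other robots' data.
    local: dict
    others: list

@dataclass
class Robot:
    name: str
    executed: list = field(default_factory=list)

    def capture_scene(self, all_positions):
        # S1: the robot captures its own state and observes the other robots.
        local = {"robot": self.name, "position": all_positions[self.name]}
        others = [{"robot": n, "position": p}
                  for n, p in all_positions.items() if n != self.name]
        return SceneInfo(local=local, others=others)

    def execute(self, command):
        # S3: the robot executes the action command it was given.
        self.executed.append(command)

def control_algorithm(scene):
    # S2: placeholder policy on a 1-D line; wait when another robot is
    # adjacent, otherwise advance (a stand-in for the learned algorithm).
    pos = scene.local["position"]
    blocked = any(abs(o["position"] - pos) <= 1 for o in scene.others)
    return "wait" if blocked else "move_right"

# One control cycle for two robots sharing a workspace.
robots = [Robot("R1"), Robot("R2")]
positions = {"R1": 0, "R2": 5}
for r in robots:
    command = control_algorithm(r.capture_scene(positions))
    r.execute(command)
```

Because each command is computed from the captured scene alone, the two robots never exchange messages directly, mirroring the no-real-time-communication claim.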
  • the embodiment of the present application provides a control method for multiple robots to work together, including:
  • Step S1: Receive scene information captured by each robot, the scene information including: local robot scene information and other robot scene information.
  • each robot captures scene information through at least one sensor.
  • the sensor may be a camera or a laser sensor, and at least one sensor is installed at any position on the robot that is convenient for capturing scene information.
  • the sensor captures the scene information, which facilitates analysis of the state of each cooperating robot, so that each robot can be controlled to work.
  • the scene information can be captured at preset intervals, or can be captured continuously.
  • the specific capture method can be selected and set according to the needs of the common task completed by the collaborative work.
  • the scene information includes: at least one of robot running image, robot running force, robot running distance, and robot running angle.
  • Step S2: Use the control algorithm to calculate on the scene information and obtain the action command corresponding to each robot.
  • the action command includes: robot movement rotation angle and/or robot movement torque.
  • the embodiment of the present application can accurately control the robot to complete the task by controlling the rotation angle of the robot movement and/or the movement torque of the robot.
  • for example, when the robot completes a carrying task, it needs to carry the object to the target position by adjusting its movement rotation angle and movement torque.
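For illustration, an action command carrying a movement rotation angle and/or a movement torque could be modeled as a small record type; the field names and units here are hypothetical, not the patent's data format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActionCommand:
    """An action command as described: a movement rotation angle and/or a
    movement torque. Field names and units are hypothetical."""
    rotation_angle_deg: Optional[float] = None
    torque_nm: Optional[float] = None

    def is_valid(self) -> bool:
        # At least one of the two quantities must be present.
        return self.rotation_angle_deg is not None or self.torque_nm is not None

# A carrying task might adjust both quantities to move an object to a target.
cmd = ActionCommand(rotation_angle_deg=15.0, torque_nm=2.5)
```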
  • step S2 includes:
  • the common tasks are usually analyzed by the task analysis module of the robot controller to obtain action sets and state sets based on time series, and the time interval of the specific time series is set according to the needs of the tasks.
  • a controller is usually installed for each robot, and the control operation of the robot is realized based on the local controller of each robot.
  • the action set includes: during the time period T1, the robot performs action one, and during the time period T2, the robot performs action two.
  • the state set includes: at time T3, the robot is in state one, and at time T4, the robot is in state two.
  • not only is the state of the local robot obtained through the scene information, but the states of the other robots are also obtained, so that collisions between the local robot operating under its action command and other robots operating under their action commands can be avoided.
  • common tasks are parsed into action sets and state sets based on time series, and then local robot states and other robot states are obtained according to scene information.
  • the local robot state and the other robot states are compared with the time-series-based action set and state set to obtain the action command that the local robot needs to execute. Therefore, in the embodiment of the present application, the action command corresponding to each robot is obtained by combining the local robot state and the other robot states with the time-series-based action set and state set, so that collisions are avoided when the robots operate under the control of the action commands.
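The combination described above, comparing the local robot state and other robot states against the time-series-based action set and state set to pick the next command, might be sketched like this (all data structures, keys, and command names are illustrative assumptions):

```python
# Time-series-based action set: which action each robot performs in which
# time period, as parsed from the common task.
action_set = {
    ("T1", "R1"): "pick_object",
    ("T2", "R1"): "place_object",
    ("T1", "R2"): "wait",
    ("T2", "R2"): "pick_object",
}

# Time-series-based state set: which state each robot is expected to be in.
state_set = {
    ("T1", "R1"): "near_conveyor",
    ("T2", "R1"): "near_target",
}

def next_command(period, robot, local_state, other_states):
    """Combine the local robot state and the other robots' states with the
    time-series action/state sets to pick the next action command."""
    expected = state_set.get((period, robot))
    # Hold if the local robot is not yet in its expected state.
    if expected is not None and local_state != expected:
        return "hold"
    # Hold if another robot occupies the same state (collision risk).
    if local_state in other_states:
        return "hold"
    return action_set.get((period, robot), "hold")

cmd = next_command("T1", "R1", "near_conveyor", other_states=["idle"])
```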
  • control algorithm is learned and obtained according to a deep reinforcement learning neural network model.
  • DRL: Deep Reinforcement Learning.
  • the deep reinforcement learning neural network model M of the embodiment of the present application generates the local robot action command a according to the input state s of the robots (the local robot and other robots); the result of operating the local robot under the action command, that is, the feedback information r, is sent to the deep reinforcement learning neural network model, and the model then generates a new local robot action command a' according to the feedback information r and the new local robot state s'.
  • the deep reinforcement learning neural network model M can generate local robot action commands that are more suitable for common tasks, and can gradually improve the deep reinforcement learning neural network model M according to the selection of training samples.
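The s → a → r → s' cycle above is the standard reinforcement-learning interaction loop. The following toy sketch substitutes a tabular value estimator for the deep neural network M, purely to make the loop concrete; the one-dimensional environment, reward, and action names are invented for illustration.

```python
class ToyModel:
    """Tabular stand-in for the deep reinforcement learning model M, purely
    for illustration: it maps a state s to an action command a and refines
    its estimates from the feedback information r."""
    def __init__(self):
        self.values = {}  # (state, action) -> estimated value

    def act(self, state):
        # Generate the action command a for the input state s.
        return max(["left", "right"],
                   key=lambda a: self.values.get((state, a), 0.0))

    def learn(self, state, action, reward):
        # Move the value estimate toward the observed feedback r.
        old = self.values.get((state, action), 0.0)
        self.values[(state, action)] = old + 0.5 * (reward - old)

def environment_step(state, action):
    # Toy workspace: the robot moves on a line and the target is position 3.
    new_state = state + (1 if action == "right" else -1)
    reward = -abs(3 - new_state)  # feedback r: closer to the target is better
    return new_state, reward

model = ToyModel()
# Training: try each action from each state and feed the result r back.
for _ in range(3):
    for s in range(3):
        for a in ["left", "right"]:
            _, r = environment_step(s, a)
            model.learn(s, a, r)

# Execution: the learned model now drives the s -> a -> r -> s' loop.
s, actions = 0, []
for _ in range(3):
    a = model.act(s)               # M generates command a from state s
    s, r = environment_step(s, a)  # executing yields feedback r and state s'
    actions.append(a)
```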
  • the deep reinforcement learning neural network model is obtained by training various robots in a virtual environment; and/or by training multiple robots to perform various common tasks in a virtual environment.
  • the embodiment of the present application uses data obtained by training various robots in the virtual environment as training samples for the deep reinforcement learning neural network model, and may also use data obtained by having multiple robots perform various common tasks in the virtual environment as training samples for the model.
  • the embodiment of the present application further improves the training effect of the deep reinforcement learning neural network model through various training samples.
  • the deep reinforcement learning neural network model is obtained by training with mixed data of virtual data obtained in a virtual environment combined with actual data tested by a single robot in the field.
  • the embodiment of the present application uses the mixed data obtained in the virtual environment combined with the actual data of the on-site test as the training sample of the deep reinforcement learning neural network model, which further improves the training effect of the deep reinforcement learning neural network model.
  • the deep reinforcement learning neural network model is trained and obtained by using actual data of multiple robots tested in the field.
  • the embodiment of the present application uses the actual data of multiple robots tested on site as the training samples for the deep reinforcement learning neural network model.
  • the embodiment of the present application further improves the training effect of the deep reinforcement learning neural network model through various training samples.
  • in order to expand the application range of the deep reinforcement learning neural network model so that it can improve the performance of more common tasks, the model is a continuous or discrete function.
  • the deep reinforcement learning neural network model adopts an asynchronous structure in training, thereby reducing the complexity of the application of the deep reinforcement learning neural network model, and making the embodiments of the present application easier to implement.
  • the deep reinforcement learning neural network model can first be trained on simple common tasks and then on complex common tasks, gradually increasing the complexity of the training samples, which helps the deep reinforcement learning neural network model of the embodiment of the present application achieve high accuracy on complex common tasks.
  • Step S3 sending corresponding action commands to each robot, so that each robot executes the corresponding action command, and cooperates to complete at least one common task.
  • the robots in the embodiment of the present application realize cooperative work by executing their corresponding action commands, and the cooperative work can be realized without real-time communication between the robots.
  • each robot captures scene information separately, and each robot uses a control algorithm to perform calculations based on the scene information and obtain its corresponding action command, so that each robot executes the corresponding action command and the robots work together to accomplish at least one common task.
  • the robots realize cooperative work through the captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them.
  • the embodiment of the present application reduces the control cost of multiple robots working together, is easy to maintain and upgrade, and is suitable for handling various common tasks.
  • each robot captures the local robot scene information and other robot scene information D1, ..., DN and inputs it into the corresponding deep reinforcement learning neural network model M1, ..., MN.
  • Scene information includes: robot running image, robot running force, robot running distance, and robot running angle.
  • the N deep reinforcement learning neural network models M1, ..., MN corresponding to the N robots R1, ..., RN respectively receive the local robot scene information and other robot scene information D1, ..., DN sent by the N robots R1, ..., RN, and communicate with each other.
  • the common task 501 is parsed by the task parsing module 502 into a time-series-based action set and state set 503 (which includes action sets and state sets), and is sent to the N deep reinforcement learning neural network models M1, ..., MN corresponding to the N robots R1, ..., RN.
  • the N deep reinforcement learning neural network models M1, ..., MN corresponding to the N robots R1, ..., RN conduct end-to-end training with the time-series-based action set and state set 503 to obtain the action commands A1, ..., AN executed by each robot R1, ..., RN.
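The arrangement described, N robots R1, ..., RN each with a corresponding model Mi that receives the shared scene information and emits that robot's command Ai, can be sketched as follows, with toy collision-avoidance policies standing in for the trained models (all names and the 1-D positions are illustrative assumptions):

```python
def make_model(robot_index):
    """Toy stand-in for the trained model M_i of robot R_i: it reads the
    shared scene information D (positions of all robots on a line) and
    outputs the action command A_i for its own robot only."""
    def model(scene):
        my_pos = scene[robot_index]
        other_positions = scene[:robot_index] + scene[robot_index + 1:]
        # Stop if moving forward would enter a cell another robot occupies.
        return "stop" if my_pos + 1 in other_positions else "forward"
    return model

n_robots = 3
models = [make_model(i) for i in range(n_robots)]  # M_1, ..., M_N
scene = [0, 1, 5]                                  # D: positions of R_1, ..., R_N

# Each model independently derives its robot's command from the same scene,
# so no robot-to-robot communication is needed at run time.
commands = [m(scene) for m in models]              # A_1, ..., A_N
```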
  • the training samples of the deep reinforcement learning neural network model may use data obtained by training various robots in a virtual environment, and/or data obtained by training multiple robots to perform various common tasks in a virtual environment.
  • the training samples of the deep reinforcement learning neural network model may use the mixed data of virtual data obtained in a virtual environment combined with actual data of on-site testing.
  • the training samples of the deep reinforcement learning neural network model may use actual data from field tests of multiple robots.
  • the embodiment of the present application further improves the training effect of the deep reinforcement learning neural network model through various training samples.
  • the deep reinforcement learning neural network model is a continuous or discrete function.
  • the deep reinforcement learning neural network model adopts an asynchronous structure in training, thereby reducing the complexity of the application of the deep reinforcement learning neural network model, and making the embodiment of the present application easier to implement.
  • the deep reinforcement learning neural network model can first be trained on simple common tasks and then on complex common tasks, gradually increasing the complexity of the training samples, which helps the deep reinforcement learning neural network model of the embodiments of the present application achieve high accuracy on complex common tasks.
  • the embodiment of the present application continuously improves the accuracy of the output action commands of the deep reinforcement learning neural network model through the training of the deep reinforcement learning neural network model, improves the efficiency of the collaborative work of multiple robots, and simplifies the complexity of the collaborative work control of multiple robots.
  • some embodiments of the present application also provide a control system for multiple robots working together, including: multiple robots, each of which captures scene information respectively, and the scene information includes: local robot scene information and other robot scene information;
  • the controller of each robot uses the control algorithm to calculate on the scene information to obtain the action command corresponding to each robot, and sends the corresponding action command to each robot so that each robot executes the corresponding action command and the robots work together to complete at least one common task.
  • control algorithm is learned and obtained according to the deep reinforcement learning neural network model.
  • the control algorithm is integrated in the memory of each robot. As needed, each controller calls the control algorithm to calculate on the scene information and obtain the action command corresponding to each robot, realizing low-latency control of each robot and ensuring the accuracy of the action command.
  • control system in which a plurality of robots work together in this embodiment is used to implement the corresponding control methods in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • each robot in the control system of this embodiment reference may be made to the descriptions of corresponding parts in the foregoing method embodiments, and details are not repeated here.
  • each robot captures scene information separately, and each robot uses a control algorithm to perform calculations based on the scene information and obtain its corresponding action command, so that each robot executes the corresponding action command and the robots work together to accomplish at least one common task.
  • the robots realize cooperative work through the captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them.
  • the embodiment of the present application reduces the control cost of multiple robots working together, is easy to maintain and upgrade, and is suitable for handling various common tasks.
  • some embodiments of the present application also provide a robot.
  • the robot sends the captured scene information to the controller.
  • the scene information includes: local robot scene information and other robot scene information; the controller uses a control algorithm to calculate on the scene information and obtain the action command corresponding to each robot; the robot executes the corresponding action command and cooperates with other robots to complete at least one common task.
  • control algorithm is learned and obtained according to the deep reinforcement learning neural network model.
  • the robot in this embodiment is used to implement the corresponding control methods in the foregoing multiple method embodiments, and has the beneficial effects of the corresponding method embodiments, which will not be repeated here.
  • each robot captures scene information separately, and each robot uses a control algorithm to perform calculations based on the scene information and obtain its corresponding action command, so that each robot executes the corresponding action command and the robots work together to accomplish at least one common task.
  • the robots realize cooperative work through the captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them.
  • the embodiment of the present application reduces the control cost of multiple robots working together, is easy to maintain and upgrade, and is suitable for handling various common tasks.
  • some embodiments of the present application further provide a computer program product, which is tangibly stored on a readable medium of the controller and has computer-executable instructions. When executed, the computer-executable instructions cause at least one processor to implement any of the above methods.
  • some embodiments of the present application further provide a computer-readable medium on which computer-executable instructions are stored, and when executed, the computer-executable instruction causes at least one processor to execute the above-mentioned method.
  • each component/step described in the embodiments of the present application can be divided into more components/steps, and two or more components/steps or partial operations of components/steps can also be combined into new components/steps to achieve the purpose of the embodiments of the present application.
  • the above-mentioned method according to the embodiments of the present application can be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code originally stored on a remote recording medium or a non-transitory machine-readable medium and downloaded over a network to be stored on a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA.
  • a computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the methods described herein.
  • the execution of the code converts the general-purpose computer into a special-purpose computer for executing the methods shown herein.

Abstract

Disclosed are a control method for multiple robots working cooperatively, a system and a robot. An operation control method comprises: receiving scene information captured from each robot, the scene information comprising local robot scene information and other robot scene information (S1); calculating the scene information using a control algorithm, to obtain an action command corresponding to each robot (S2); and sending the corresponding action command to each robot, so that each robot executes the corresponding action command, and at least one common task is completed cooperatively (S3). In this way, multiple robots can work together to complete a common task without real-time communication between the robots.

Description

一种多个机器人协同工作的控制方法、系统及机器人A control method, system and robot for multiple robots working together 技术领域technical field
本申请实施例涉及工业控制技术领域,尤其涉及一种多个机器人协同工作的控制方法、系统及机器人。The embodiments of the present application relate to the technical field of industrial control, and in particular to a control method, system and robot for a plurality of robots working together.
背景技术Background technique
随着工业控制的发展,越来越多的任务采用机器人(包括机器臂,统一简称为机器人)来完成,对机器人进行控制实现多种任务场景成为未来工业控制技术发展的关键。但是现有的机器人控制通常需要按照预先设定的程序完成对应的任务,当需要多个机器人协同工作完成共同任务时,往往需要进行大量的编程工作,为任务的完成带来极大困难。并且由于机器人的价格昂贵,多个机器人协同工作完成共同任务时,即便采用预先设定的程序也仍然需要多个机器人之间进行实时通信,以避免机器人在工作中发生碰撞所造成的机器人损伤。With the development of industrial control, more and more tasks are completed by robots (including robot arms, collectively referred to as robots). Controlling robots to achieve various task scenarios has become the key to the development of future industrial control technology. However, the existing robot control usually needs to complete the corresponding tasks according to the preset program. When multiple robots are required to work together to complete a common task, a large amount of programming work is often required, which brings great difficulties to the completion of the task. And because the robot is expensive, when multiple robots work together to complete a common task, even if a pre-set program is used, real-time communication between multiple robots is still required to avoid robot damage caused by collisions between robots during work.
示例性地,参见图1,当两个机器人协同工作完成共同任务时,第一机器人1对第三物体103、第四物体104、第五物体105、第六物体106进行搬运操作,第二机器人2对第一物体101、第二物体102、第七物体107进行搬运操作。为了避免第一机器人1和第二机器人2在碰撞区域A发生碰撞,需要第一机器人1和第二机器人2之间进行实时通信,增加了多个机器人协同工作的控制成本。For example, referring to FIG. 1, when two robots work together to complete a common task, the first robot 1 performs handling operations on the third object 103, the fourth object 104, the fifth object 105, and the sixth object 106, and the second robot 1 2 Carrying the first object 101, the second object 102, and the seventh object 107. In order to avoid the collision between the first robot 1 and the second robot 2 in the collision area A, real-time communication between the first robot 1 and the second robot 2 is required, which increases the control cost of multiple robots working together.
发明内容Contents of the invention
In view of this, embodiments of the present application provide a control scheme for multiple robots working together, which enables multiple robots to work together to complete a common task without real-time communication between the robots.
According to a first aspect of the embodiments of the present application, an operation control method for multiple robots is provided, including: receiving scene information captured by each robot, the scene information including local robot scene information and other-robot scene information; calculating the scene information using a control algorithm to obtain an action command corresponding to each robot; and sending the corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task.
According to the operation control scheme for multiple robots provided by the embodiments of the present application, each robot captures scene information, and each robot uses a control algorithm to perform calculations based on the scene information to obtain its corresponding action command, so that each robot executes its corresponding action command and the robots work together to complete at least one common task. In the embodiments of the present application, the robots achieve cooperative work through captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them. The embodiments of the present application reduce the control cost of multiple robots working together, are easy to maintain and upgrade, and are suitable for handling a variety of common tasks.
In some embodiments of the present application, the scene information includes at least one of: a robot running image, a robot running force, a robot running distance, and a robot running angle.
In this way, the state of each robot can be accurately known, allowing each robot to be controlled more accurately in its work.
In some embodiments of the present application, the action command includes a robot movement rotation angle and/or a robot movement torque.
In this way, the robot can be precisely controlled to complete the task.
In some embodiments of the present application, the scene information is captured by at least one sensor installed on each robot.
In this way, scene information is captured by sensors, which facilitates analysis of the state of each cooperating robot so that each robot can be controlled in its work.
In some embodiments of the present application, calculating the scene information using the control algorithm to obtain the action command corresponding to each robot includes:
obtaining a common task and parsing the common task into a time-series-based action set and state set;
obtaining the local robot state and the other robots' states according to the scene information; and
calculating the local robot state, the other robots' states, and the time-series-based action set and state set using the control algorithm, to obtain the action command corresponding to each robot.
In this way, the embodiments of the present application obtain not only the local robot state but also the states of the other robots from the scene information, so that collisions between the local robot operating under its action command and other robots operating under their action commands can be avoided.
In some embodiments of the present application, the control algorithm is learned from a deep reinforcement learning neural network model.
In this way, local robot action commands that better satisfy the common task can be generated, and the deep reinforcement learning neural network model can be gradually refined according to the selection of training samples.
In some embodiments of the present application, the deep reinforcement learning neural network model is obtained by training various robots in a virtual environment; and/or the deep reinforcement learning neural network model is obtained by training multiple robots performing various common tasks in a virtual environment.
In this way, the training effect of the deep reinforcement learning neural network model is further improved through various training samples.
In some embodiments of the present application, the deep reinforcement learning neural network model is obtained by training on mixed data combining virtual data obtained in a virtual environment with actual data from field tests of a single robot.
In this way, the embodiments of the present application further improve the training effect of the deep reinforcement learning neural network model through various training samples.
In some embodiments of the present application, the deep reinforcement learning neural network model is obtained by training on actual data from field tests of multiple robots.
In this way, the embodiments of the present application further improve the training effect of the deep reinforcement learning neural network model through various training samples.
In some embodiments of the present application, the deep reinforcement learning neural network model is a continuous or discrete function.
In this way, the scope of application of the deep reinforcement learning neural network model is expanded, so that it can improve the performance of more common tasks.
In some embodiments of the present application, the deep reinforcement learning neural network model adopts an asynchronous structure during training.
In this way, the complexity of applying the deep reinforcement learning neural network model is reduced, making the embodiments of the present application easier to implement.
According to a second aspect of the embodiments of the present application, an operation control system for multiple robots is provided, including multiple robots, where each robot captures scene information, the scene information including local robot scene information and other-robot scene information; a control algorithm calculates the scene information to obtain an action command corresponding to each robot; and each robot executes its corresponding action command, and the robots work together to complete at least one common task.
In some embodiments of the present application, the control algorithm is learned from a deep reinforcement learning neural network model.
In this way, local robot action commands that better satisfy the common task can be generated, and the deep reinforcement learning neural network model can be gradually refined according to the selection of training samples.
According to a third aspect of the embodiments of the present application, a robot is provided. The robot sends captured scene information to a controller, the scene information including local robot scene information and other-robot scene information; the controller calculates the scene information using a control algorithm to obtain an action command corresponding to each robot; and the robot executes its corresponding action command and works together with other robots to complete at least one common task.
In some embodiments of the present application, the control algorithm is learned from a deep reinforcement learning neural network model.
In this way, local robot action commands that better satisfy the common task can be generated, and the deep reinforcement learning neural network model can be gradually refined according to the selection of training samples.
According to a fourth aspect of the embodiments of the present application, a computer program product is provided, which is tangibly stored on a readable medium of a controller and includes computer-executable instructions that, when executed, cause at least one processor to perform any one of the above methods.
According to a fifth aspect of the embodiments of the present application, a computer-readable medium is provided, on which computer-executable instructions are stored, the computer-executable instructions, when executed, causing at least one processor to perform any one of the above methods.
According to the operation control scheme for multiple robots provided by the embodiments of the present application, the controller receives the scene information captured by each robot, performs calculations based on the scene information using a control algorithm, and sends a corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task. In the embodiments of the present application, the robots achieve cooperative work through captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them. The embodiments of the present application reduce the control cost of multiple robots working together, are easy to maintain and upgrade, and are suitable for handling a variety of common tasks.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show merely some of the embodiments described in the embodiments of the present application, and those of ordinary skill in the art can obtain other drawings based on these drawings.
FIG. 1 is a schematic diagram of a system in which multiple robots work together, to which embodiments of the present application apply;
FIG. 2 is a flowchart of the steps of a method for multiple robots to work together according to an embodiment of the present application;
FIG. 3 is a flowchart of step S2 of a method for multiple robots to work together according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the deep reinforcement learning neural network model of an embodiment of the present application;
FIG. 5 is a schematic diagram of the actual training situation of the deep reinforcement learning neural network model of an embodiment of the present application.
Reference Signs
1: first robot
2: second robot
101: first object
102: second object
103: third object
104: fourth object
105: fifth object
106: sixth object
107: seventh object
S1: each robot captures scene information, the scene information including local robot scene information and other-robot scene information
S2: calculate the scene information using a control algorithm to obtain the action command corresponding to each robot
S21: obtain a common task and parse the common task into a time-series-based action set and state set
S22: obtain the local robot state and the other robots' states according to the scene information
S23: calculate the local robot state, the other robots' states, and the time-series-based action set and state set using the control algorithm, to obtain the action command corresponding to each robot
S3: each robot executes its corresponding action command, and the robots work together to complete at least one common task
M: deep reinforcement learning neural network model
s: input robot state (of the local robot and the other robots)
a: generated local robot action command
r: feedback information
s': new local robot state
a': new local robot action command
R1, …, RN: N robots
M1, …, MN: N deep reinforcement learning neural network models
D1, …, DN: N sets of local robot scene information and other-robot scene information
501: common task
502: task parsing module
503: time-series-based action set and state set
A1, …, AN: N action commands
Detailed Description of Embodiments
In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings of the embodiments of the present application. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application shall fall within the protection scope of the embodiments of the present application.
In the field of industrial control, in order to ensure low latency of data transmission, a controller is usually installed for each robot, and the robot's control operations are performed by each robot's local controller. Some complex industrial control scenarios require multiple robots to work together to complete a common task, and the robots also need real-time communication with each other to avoid robot damage caused by collisions. In the present application, each robot captures scene information, and a control algorithm generates an action command corresponding to each robot based on the scene information to control multiple robots to work together, without real-time communication between the robots.
The specific implementation of the embodiments of the present application is further described below in conjunction with the accompanying drawings.
Referring to FIG. 2, an embodiment of the present application provides a control method for multiple robots working together, including:
Step S1: receive scene information captured by each robot, the scene information including local robot scene information and other-robot scene information.
In some specific embodiments of the present application, each robot captures scene information through at least one sensor.
Specifically, the sensor may be a camera or a laser sensor, and the at least one sensor may be installed at any position on the robot that is convenient for capturing scene information.
In the embodiments of the present application, scene information is captured by sensors, which facilitates analysis of the state of each cooperating robot so that each robot can be controlled in its work.
In the embodiments of the present application, scene information may be captured at preset intervals or continuously; the specific capture mode can be selected and set according to the needs of the common task to be completed cooperatively.
In some specific embodiments of the present application, the scene information includes at least one of: a robot running image, a robot running force, a robot running distance, and a robot running angle.
In the embodiments of the present application, by capturing at least one of the robot running image, robot running force, robot running distance and robot running angle, the state of the robot can be accurately known, allowing each robot to be controlled more accurately in its work.
Step S2: calculate the scene information using a control algorithm to obtain the action command corresponding to each robot.
In some specific embodiments of the present application, the action command includes a robot movement rotation angle and/or a robot movement torque, and the like.
In the embodiments of the present application, the robot can be accurately controlled to complete its task by controlling the robot movement rotation angle and/or the robot movement torque.
For example, for a robot to complete a carrying task, the rotation angle and torque of the robot's movement need to be adjusted to carry the object to the target position.
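The scene information and action commands described above can be represented by simple data structures. The following is a minimal sketch; all field names and units are assumptions for illustration, not part of the original disclosure:

```python
from dataclasses import dataclass

@dataclass
class SceneInfo:
    """Scene information captured by a robot's sensors."""
    image: bytes      # robot running image (e.g. a camera frame)
    force: float      # robot running force
    distance: float   # robot running distance
    angle: float      # robot running angle

@dataclass
class ActionCommand:
    """Action command sent to a robot."""
    rotation_angle: float  # robot movement rotation angle, in degrees
    torque: float          # robot movement torque, in N*m

# e.g. one carrying step: rotate 30 degrees with 1.5 N*m of torque
cmd = ActionCommand(rotation_angle=30.0, torque=1.5)
```

In practice a command like `cmd` would be computed per robot from the captured scene information and sent over whatever control interface the robot exposes.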
In some specific embodiments of the present application, referring to FIG. 3, step S2 includes:
S21: obtain a common task and parse the common task into a time-series-based action set and state set.
Specifically, the common task is usually parsed by the task parsing module of the robot controller to obtain the time-series-based action set and state set, and the time interval of the specific time series is set according to the needs of the task.
In order to ensure low latency of data transmission, a controller is usually installed for each robot, and the robot's control operations are performed by each robot's local controller.
For example, the action set includes: during time period T1, the robot performs action one; during time period T2, the robot performs action two.
For example, the state set includes: at time T3, the robot is in state one; at time T4, the robot is in state two.
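Following the examples above (time slots T1 and T2 for actions, T3 and T4 for states), the parsing step S21 can be sketched as follows; the helper function and its input format are assumptions for illustration:

```python
def parse_common_task(task_steps):
    """Parse a common task, given as (time_key, kind, value) tuples,
    into a time-series-based action set and state set."""
    action_set, state_set = {}, {}
    for time_key, kind, value in task_steps:
        if kind == "action":
            action_set[time_key] = value
        elif kind == "state":
            state_set[time_key] = value
    return action_set, state_set

action_set, state_set = parse_common_task([
    ("T1", "action", "action one"),  # during T1, perform action one
    ("T2", "action", "action two"),  # during T2, perform action two
    ("T3", "state", "state one"),    # at T3, the robot is in state one
    ("T4", "state", "state two"),    # at T4, the robot is in state two
])
```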
S22: obtain the local robot state and the other robots' states according to the scene information.
The embodiments of the present application obtain not only the local robot state but also the states of the other robots from the scene information, so that collisions between the local robot operating under its action command and other robots operating under their action commands can be avoided.
S23: calculate the local robot state, the other robots' states, and the time-series-based action set and state set using the control algorithm, to obtain the action command corresponding to each robot.
In the embodiments of the present application, the common task is parsed into a time-series-based action set and state set, and then the local robot state and the other robots' states are obtained according to the scene information. For the local robot, the local robot state and the other robots' states are compared with the time-series-based action set and state set to obtain the action command that the local robot needs to execute. Therefore, the embodiments of the present application obtain the action command corresponding to each robot by combining the local robot state and the other robots' states with the time-series-based action set and state set, avoiding collisions when the robots operate under the control of their action commands.
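The comparison just described, matching the current states against the parsed sets to pick the local robot's next command, might be sketched as follows; the collision check and all names are illustrative assumptions, not the disclosed algorithm itself:

```python
def next_action(local_state, other_states, action_set, state_set, shared_zone):
    """Pick the local robot's next action command by comparing states
    against the time-series-based action set and state set.

    If any other robot currently occupies the shared collision zone,
    the local robot waits instead; no robot-to-robot messaging is used,
    since the other robots' states come from captured scene information.
    """
    if shared_zone in other_states:
        return "wait"
    # find the time slot whose expected state matches the local state
    for time_key in sorted(state_set):
        if state_set[time_key] == local_state and time_key in action_set:
            return action_set[time_key]
    return "idle"

action_set = {"T1": "pick up object", "T2": "place object"}
state_set = {"T1": "near object", "T2": "holding object"}
```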
In some specific embodiments of the present application, the control algorithm is learned from a deep reinforcement learning neural network model.
Deep Reinforcement Learning (DRL) is a branch of deep learning that has developed rapidly over the past two years; its purpose is to take a computer from perception to decision-making and control, so as to realize general artificial intelligence.
Referring to FIG. 4, the deep reinforcement learning neural network model M of the embodiment of the present application generates a local robot action command a according to the input robot state s (of the local robot and the other robots). The result of the local robot operating under the action command, i.e. the feedback information r, is sent to the deep reinforcement learning neural network model, and the model then generates a new local robot action command a' according to the feedback information r and the new local robot state s'. Through such repeated training, the deep reinforcement learning neural network model M can generate local robot action commands that better satisfy the common task, and the model M can be gradually refined according to the selection of training samples.
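The s, a, r, s', a' loop just described is the standard reinforcement learning cycle. The following tabular Q-learning sketch illustrates that cycle on a toy three-state task; the environment, rewards and parameters are assumptions for illustration, and the actual disclosure replaces the Q-table with a deep neural network:

```python
import random

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Toy reinforcement learning loop: state 2 is the goal,
    action 1 advances one state, action 0 stays in place."""
    rng = random.Random(seed)
    states, actions = (0, 1, 2), (0, 1)
    q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = 0                                  # initial robot state s
        while s != 2:
            # generate an action command a for the current state s
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q[(s, x)])
            s_next = s + 1 if a == 1 else s    # new state s'
            r = 1.0 if s_next == 2 else 0.0    # feedback information r
            # update the model from (s, a, r, s'), then continue from s'
            best_next = max(q[(s_next, x)] for x in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
    return q

q = train()
```

After training, the learned values favor the advancing action in every non-goal state, which is the refinement effect the repeated s, a, r, s' feedback is meant to produce.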
In some specific embodiments of the present application, the deep reinforcement learning neural network model is obtained by training various robots in a virtual environment; and/or by training multiple robots performing various common tasks in a virtual environment.
The embodiments of the present application use data obtained by training various robots in a virtual environment as training samples for the deep reinforcement learning neural network model; the embodiments may also use data obtained by multiple robots performing various common tasks in a virtual environment as training samples. Through various training samples, the embodiments of the present application further improve the training effect of the deep reinforcement learning neural network model.
In some specific embodiments of the present application, the deep reinforcement learning neural network model is obtained by training on mixed data combining virtual data obtained in a virtual environment with actual data from field tests of a single robot.
The embodiments of the present application use mixed data combining virtual data obtained in a virtual environment with actual data from field tests as training samples for the deep reinforcement learning neural network model, further improving its training effect.
In some specific embodiments of the present application, the deep reinforcement learning neural network model is obtained by training on actual data from field tests of multiple robots.
The embodiments of the present application use actual data from field tests of multiple robots as training samples for the deep reinforcement learning neural network model. Through various training samples, the embodiments of the present application further improve the training effect of the model.
In some specific embodiments of the present application, in order to expand the scope of application of the deep reinforcement learning neural network model so that it can improve the performance of more common tasks, the deep reinforcement learning neural network model is a continuous or discrete function.
In some specific embodiments of the present application, the deep reinforcement learning neural network model adopts an asynchronous structure during training, thereby reducing the complexity of applying the model and making the embodiments of the present application easier to implement.
In the embodiments of the present application, the deep reinforcement learning neural network model can first be trained on simple common tasks and then on complex common tasks, gradually increasing the complexity of the training samples, which helps the model achieve accuracy on complex common tasks.
Step S3: send the corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task.
In the embodiments of the present application, the robots achieve cooperative work by executing their corresponding action commands, without real-time communication between the robots.
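Putting steps S1 to S3 together, one control cycle can be sketched as below. The stand-in control algorithm (a simple distance rule) and all names are assumptions for illustration; the disclosure instead uses an algorithm learned by a deep reinforcement learning neural network model:

```python
def control_step(scene_infos, control_algorithm):
    """One control cycle: receive each robot's scene information (S1),
    compute each robot's action command (S2), and return the commands
    to be sent (S3). No robot-to-robot messaging is involved.

    scene_infos maps robot id -> (local scene info, other robots' scene info).
    """
    return {
        robot_id: control_algorithm(local_info, others_info)
        for robot_id, (local_info, others_info) in scene_infos.items()
    }

def keep_distance(local_pos, other_positions, min_gap=0.5):
    """Stand-in control algorithm: retreat if any other robot is too close."""
    gap = min((abs(local_pos - p) for p in other_positions), default=min_gap)
    return "retreat" if gap < min_gap else "advance"

commands = control_step({"R1": (0.0, [2.0]), "R2": (2.0, [0.0])}, keep_distance)
```

Because each robot's command is computed from its own captured view of the scene, swapping `keep_distance` for a learned model changes nothing in the surrounding cycle.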
According to the operation control scheme for multiple robots provided by the embodiments of the present application, each robot captures scene information, and each robot uses a control algorithm to perform calculations based on the scene information to obtain its corresponding action command, so that each robot executes its corresponding action command and the robots work together to complete at least one common task. In the embodiments of the present application, the robots achieve cooperative work through captured scene information and the control algorithm, and collisions between the robots are avoided without real-time communication between them. The embodiments of the present application reduce the control cost of multiple robots working together, are easy to maintain and upgrade, and are suitable for handling a variety of common tasks.
Referring to FIG. 5, actual data from field tests of N robots R1, …, RN are used as training samples for the deep reinforcement learning neural network models M1, …, MN. Each robot inputs its captured local robot scene information and other-robot scene information D1, …, DN into the corresponding deep reinforcement learning neural network model M1, …, MN. The scene information includes the robot running image, robot running force, robot running distance and robot running angle. The N deep reinforcement learning neural network models M1, …, MN corresponding to the N robots R1, …, RN respectively receive the local robot scene information and other-robot scene information D1, …, DN sent by the N robots R1, …, RN, and communicate with each other. The common task 501 is parsed by the task parsing module 502 into a time-series-based action set and state set 503 (which includes an action set and a state set), and, after rule checking, is sent to the N deep reinforcement learning neural network models M1, …, MN corresponding to the N robots R1, …, RN. The N deep reinforcement learning neural network models M1, …, MN corresponding to the N robots R1, …, RN are trained end-to-end according to the local robot scene information and other-robot scene information D1, …, DN and the time-series-based action set and state set 503, to obtain the action commands A1, …, AN executed by the robots R1, …, RN.
In the embodiments of the present application, the training samples for the deep reinforcement learning neural network models may be data obtained by training various robots in a virtual environment, and/or data obtained by multiple robots performing various common tasks in a virtual environment.
The training samples for the deep reinforcement learning neural network models may also be mixed data combining virtual data obtained in a virtual environment with actual data from field tests.
The training samples for the deep reinforcement learning neural network models may also be actual data from field tests of multiple robots.
Through various training samples, the embodiments of the present application further improve the training effect of the deep reinforcement learning neural network models.
To broaden the range of applications of the deep reinforcement learning neural network model, so that it can improve the execution of more common tasks, the model may be a continuous or a discrete function.

The deep reinforcement learning neural network model adopts an asynchronous structure during training, which reduces the complexity of applying the model and makes the embodiments of the present application easier to implement.
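An asynchronous training structure of the kind mentioned above (A3C-style training is one well-known instance) can be illustrated minimally: several workers compute updates from their own experience and apply them to shared parameters without waiting for one another. The application does not fix a particular algorithm, so the gradient value and parameter layout below are purely illustrative.

```python
import threading

shared_params = {"w": 0.0}
lock = threading.Lock()

def worker(steps):
    for _ in range(steps):
        grad = 0.01  # stand-in for a locally computed policy gradient
        with lock:   # only the parameter update itself is serialized
            shared_params["w"] += grad

# Four workers train concurrently against the shared parameters.
threads = [threading.Thread(target=worker, args=(100,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The point of the asynchronous structure is that no global synchronization barrier is needed between workers; each contributes updates at its own pace, which keeps the training machinery simple.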
In the embodiments of the present application, the deep reinforcement learning neural network model can first be trained on simple common tasks and then on complex common tasks, gradually increasing the complexity of the training samples; this helps the model execute complex common tasks accurately.

Through such training, the embodiments of the present application continuously improve the accuracy of the action commands output by the deep reinforcement learning neural network model, raise the efficiency with which multiple robots work together, and simplify the control of their cooperative work.
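The simple-to-complex training schedule described above amounts to a curriculum over common tasks. A minimal sketch of such a schedule follows; the `complexity` ranking and the task names are illustrative assumptions, not part of the application.

```python
def curriculum(tasks):
    """Order training tasks from simple to complex before feeding the model."""
    return sorted(tasks, key=lambda t: t["complexity"])

tasks = [
    {"name": "dual_arm_assembly", "complexity": 3},
    {"name": "single_pick_place", "complexity": 1},
    {"name": "synchronized_transport", "complexity": 2},
]
schedule = [t["name"] for t in curriculum(tasks)]
```

Training would then iterate over `schedule` in order, so the model only sees a harder cooperative task after it has been trained on the easier ones.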
Corresponding to the above method, some embodiments of the present application further provide a control system in which multiple robots work cooperatively, comprising: multiple robots, each of which captures scene information, the scene information including local robot scene information and other robot scene information;

The controller of each robot uses the control algorithm to compute on the scene information and obtain the action command corresponding to that robot, and sends the corresponding action command to the robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task.

Specifically, the control algorithm is learned from a deep reinforcement learning neural network model.

The control algorithm is integrated in the memory of each robot; whenever needed, each controller invokes the control algorithm to compute on the scene information and obtain the robot's corresponding action command. This enables low-latency control of each robot and ensures the accuracy of the action commands.
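The deployment just described — the learned control algorithm stored locally with each robot and invoked directly by its controller, with no round trip to a central server — can be sketched as below. The class, the policy, and the command fields are hypothetical; the rotation-angle/torque fields merely echo the kinds of action commands named elsewhere in the application.

```python
class RobotController:
    """Sketch of an on-robot controller with an embedded control algorithm."""
    def __init__(self, robot_id, control_algorithm):
        self.robot_id = robot_id
        self.control_algorithm = control_algorithm  # loaded from local memory

    def step(self, scene_info):
        # scene_info carries both local and other-robot observations;
        # the call is local, so control latency stays low.
        command = self.control_algorithm(scene_info)
        return {"robot": self.robot_id, "command": command}

# Stand-in for a trained policy: emit a rotation angle and a torque.
policy = lambda scene: {"rotation_deg": scene["gap_angle"], "torque_nm": 1.5}

controller = RobotController(robot_id=0, control_algorithm=policy)
cmd = controller.step({"gap_angle": 30.0, "others": []})
```

Because inference happens inside `step` on the robot itself, updating the system reduces to replacing the stored `control_algorithm`, which is consistent with the maintenance and upgrade advantages claimed for the scheme.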
The control system of this embodiment, in which multiple robots work cooperatively, is used to implement the corresponding control methods in the foregoing method embodiments and has the beneficial effects of those embodiments, which are not repeated here. Likewise, for the implementation of each robot in the control system of this embodiment, reference may be made to the descriptions of the corresponding parts of the foregoing method embodiments, which are not repeated here either.

According to the operation control scheme for multiple robots provided in the embodiments of the present application, each robot captures scene information, and a control algorithm computes on that scene information to obtain the action command corresponding to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task. Because cooperative work is achieved through captured scene information and the control algorithm, collisions between robots can be avoided without real-time communication between them. The embodiments of the present application reduce the control cost of multiple robots working cooperatively, are easy to maintain and upgrade, and are suitable for handling a variety of common tasks.
Corresponding to the above method, some embodiments of the present application further provide a robot. The robot sends captured scene information to a controller, the scene information including local robot scene information and other robot scene information; the controller uses a control algorithm to compute on the scene information and obtain the action command corresponding to each robot; and the robot executes the corresponding action command, working together with other robots to complete at least one common task.

Specifically, the control algorithm is learned from a deep reinforcement learning neural network model.

The robot of this embodiment is used to implement the corresponding control methods in the foregoing method embodiments and has the beneficial effects of those embodiments, which are not repeated here. Likewise, for the implementation of the robot of this embodiment, reference may be made to the descriptions of the corresponding parts of the foregoing method embodiments, which are not repeated here either.

According to the operation control scheme for multiple robots provided in the embodiments of the present application, each robot captures scene information, and a control algorithm computes on that scene information to obtain the action command corresponding to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task. Because cooperative work is achieved through captured scene information and the control algorithm, collisions between robots can be avoided without real-time communication between them. The embodiments of the present application reduce the control cost of multiple robots working cooperatively, are easy to maintain and upgrade, and are suitable for handling a variety of common tasks.
Corresponding to the above method, some embodiments of the present application further provide a computer program product tangibly stored on a readable medium of a controller and comprising computer-executable instructions which, when executed, cause at least one processor to perform any of the above methods.

Corresponding to the above method, some embodiments of the present application further provide a computer-readable medium on which computer-executable instructions are stored; when executed, the computer-executable instructions cause at least one processor to perform the above methods.
It should be noted that, according to the needs of implementation, each component/step described in the embodiments of the present application can be split into more components/steps, and two or more components/steps, or partial operations of components/steps, can be combined into new components/steps to achieve the purpose of the embodiments of the present application.

The above methods according to the embodiments of the present application can be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is downloaded over a network, originally stored on a remote recording medium or a non-transitory machine-readable medium, and then stored on a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, flash memory, and the like) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the methods described herein are implemented. Furthermore, when a general-purpose computer accesses code for implementing the methods shown herein, the execution of that code converts the general-purpose computer into a special-purpose computer for executing the methods shown herein.

Those of ordinary skill in the art can appreciate that the units and method steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as exceeding the scope of the embodiments of the present application.

The above implementations are only used to illustrate the embodiments of the present application and do not limit them. Those of ordinary skill in the relevant technical fields can make various changes and variations without departing from the spirit and scope of the embodiments of the present application, so all equivalent technical solutions also fall within the scope of the embodiments of the present application, whose scope of patent protection shall be defined by the claims.

Claims (17)

  1. A control method for multiple robots working cooperatively, comprising:
    receiving scene information captured by each robot, the scene information including local robot scene information and other robot scene information (S1);
    using a control algorithm to compute on the scene information to obtain an action command corresponding to each robot (S2);
    sending the corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task (S3).
  2. The method according to claim 1, wherein the scene information includes at least one of: a robot operation image, a robot operation force, a robot operation distance, and a robot operation angle.
  3. The method according to claim 1, wherein the action command includes a robot motion rotation angle and/or a robot motion torque.
  4. The method according to claim 1, wherein the scene information is captured by at least one sensor installed on each robot.
  5. The method according to claim 1, wherein using a control algorithm to compute on the scene information to obtain an action command corresponding to each robot (S2) comprises:
    obtaining a common task, and parsing the common task into a time-series-based action set and state set (S21);
    obtaining the local robot state and the other robot states according to the scene information (S22);
    using the control algorithm to compute on the local robot state, the other robot states, and the time-series-based action set and state set, to obtain the action command corresponding to each robot (S23).
  6. The method according to claim 1, wherein the control algorithm is learned from a deep reinforcement learning neural network model.
  7. The method according to claim 6, wherein the deep reinforcement learning neural network model is trained with various individual robots in a virtual environment; and/or the deep reinforcement learning neural network model is trained with multiple robots performing various common tasks in a virtual environment.
  8. The method according to claim 6, wherein the deep reinforcement learning neural network model is trained with mixed data combining virtual data obtained in a virtual environment with actual data from field tests of individual robots.
  9. The method according to claim 6, wherein the deep reinforcement learning neural network model is trained with actual data from field tests of multiple robots.
  10. The method according to claim 6, wherein the deep reinforcement learning neural network model is a continuous or a discrete function.
  11. The method according to claim 6, wherein the deep reinforcement learning neural network model adopts an asynchronous structure in training.
  12. A control system for multiple robots working cooperatively, comprising:
    multiple robots, each of which captures scene information, the scene information including local robot scene information and other robot scene information;
    a controller, which uses a control algorithm to compute on the scene information to obtain an action command corresponding to each robot, and sends the corresponding action command to each robot, so that each robot executes its corresponding action command and the robots work together to complete at least one common task.
  13. The system according to claim 12, wherein the control algorithm is learned from a deep reinforcement learning neural network model.
  14. A robot, wherein the robot sends captured scene information to a controller, the scene information including local robot scene information and other robot scene information; the controller uses a control algorithm to compute on the scene information to obtain an action command corresponding to each robot; and the robot executes the corresponding action command, working together with other robots to complete at least one common task.
  15. The robot according to claim 14, wherein the control algorithm is learned from a deep reinforcement learning neural network model.
  16. A computer program product tangibly stored on a readable medium of a controller and comprising computer-executable instructions which, when executed, cause at least one processor to perform the method according to any one of claims 1 to 11.
  17. A computer-readable medium having stored thereon computer-executable instructions which, when executed, cause at least one processor to perform the method according to any one of claims 1 to 11.
PCT/CN2021/119981 2021-09-23 2021-09-23 Control method for multiple robots working cooperatively, system and robot WO2023044676A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/119981 WO2023044676A1 (en) 2021-09-23 2021-09-23 Control method for multiple robots working cooperatively, system and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/119981 WO2023044676A1 (en) 2021-09-23 2021-09-23 Control method for multiple robots working cooperatively, system and robot

Publications (1)

Publication Number Publication Date
WO2023044676A1 true WO2023044676A1 (en) 2023-03-30

Family

ID=85719817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119981 WO2023044676A1 (en) 2021-09-23 2021-09-23 Control method for multiple robots working cooperatively, system and robot

Country Status (1)

Country Link
WO (1) WO2023044676A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017121457A1 (en) * 2016-01-11 2017-07-20 Abb Schweiz Ag A collaboration system and a method for operating the collaboration system
CN109382825A (en) * 2017-08-08 2019-02-26 发那科株式会社 Control device and learning device
CN110587606A (en) * 2019-09-18 2019-12-20 中国人民解放军国防科技大学 Open scene-oriented multi-robot autonomous collaborative search and rescue method
US20200166952A1 (en) * 2018-11-27 2020-05-28 Institute For Information Industry Coach apparatus and cooperative operation controlling method for coach-driven multi-robot cooperative operation system
CN112465151A (en) * 2020-12-17 2021-03-09 电子科技大学长三角研究院(衢州) Multi-agent federal cooperation method based on deep reinforcement learning
CN113189983A (en) * 2021-04-13 2021-07-30 中国人民解放军国防科技大学 Open scene-oriented multi-robot cooperative multi-target sampling method

Similar Documents

Publication Publication Date Title
CN109397285B (en) Assembly method, assembly device and assembly equipment
Tang et al. A framework for manipulating deformable linear objects by coherent point drift
JP2023164459A (en) Efficient robot control based on inputs from remote client devices
CN109940619A (en) Trajectory planning method, electronic device and storage medium
CN109910018B (en) Robot virtual-real interaction operation execution system and method with visual semantic perception
KR20170102991A (en) Control systems and control methods
KR20210012672A (en) System and method for automatic control of robot manipulator based on artificial intelligence
JP2009066692A (en) Trajectory searching device
Bihlmaier et al. Robot unit testing
WO2019061690A1 (en) Mechanical arm inverse kinematics solution error determination and correction method and device
Brecher et al. Towards anthropomorphic movements for industrial robots
CN115026835A (en) Method for optimizing overall performance of robot mechanical arm servo system
WO2023044676A1 (en) Control method for multiple robots working cooperatively, system and robot
Huang et al. Control of a piecewise constant curvature continuum manipulator via policy search method
CN210115917U (en) Robot virtual-real interactive operation execution system with visual semantic perception
JP6383716B2 (en) Control device and control method for drone
JP2020052032A (en) Imaging device and imaging system
JP2020042787A (en) Automatic driving support method, driving device, support device, and computer-readable storage medium
CN117957500A (en) Control method and system for cooperative work of multiple robots and robot
KR20230147710A (en) Imitation learning in manufacturing environments
Queißer et al. Skill memories for parameterized dynamic action primitives on the pneumatically driven humanoid robot child affetto
CN109934155B (en) Depth vision-based collaborative robot gesture recognition method and device
US20200202178A1 (en) Automatic visual data generation for object training and evaluation
Solis et al. An underwater simulation server oriented to cooperative robotic interventions: The educational approach
Santos Learning from Demonstration using Hierarchical Inverse Reinforcement Learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21957812

Country of ref document: EP

Kind code of ref document: A1