CN113524200A

CN113524200A - Mechanical arm scheduling system, scheduling method, replacement method, device, equipment and medium

Info

Publication number: CN113524200A
Application number: CN202111044877.7A
Authority: CN
Inventors: 焦家辉; 张晟东; 王济宇; 张立华
Original assignee: Ji Hua Laboratory
Current assignee: Ji Hua Laboratory
Priority date: 2021-09-07
Filing date: 2021-09-07
Publication date: 2021-10-22
Anticipated expiration: 2041-09-07
Also published as: CN113524200B

Abstract

The invention relates to the technical field of mechanical arms and discloses a mechanical arm scheduling system, a scheduling method, a replacement method, a device, equipment and a medium. The system comprises an electroencephalogram acquisition module, a processing calibration module, a scheduling center module and a mechanical arm model. The scheduling center module gives reward information to each mechanical arm model according to how well the corresponding mechanical arm completes its subtasks, and gives the mechanical arm model a replacement probability according to the reward information, so that the mechanical arm model is replaced according to the replacement probability. As the mechanical arms progressively complete subtasks, each mechanical arm model obtains a replacement probability from its reward information and is replaced accordingly, so that the mechanical arms can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm models to gradually improve and evolve, ensuring that the mechanical arm models in a mechanical arm group learn and evolve together.

Description

Mechanical arm scheduling system, scheduling method, replacement method, device, equipment and medium

Technical Field

The application relates to the technical field of mechanical arms, and in particular to a mechanical arm scheduling system, a scheduling method, a replacement method, a device, equipment and a medium.

Background

To make factories more intelligent, some smart factories control the operation of movable mechanical arms through brain-computer interaction. Controlling the mechanical arms in this way avoids direct contact between the user and operation buttons and reduces the risk of cross-infection by viruses, bacteria and the like.

When mechanical arms are controlled through brain-computer interaction, a mechanical arm model must be issued to each mechanical arm of the mechanical arm group so that the mechanical arm can complete production operations under the guidance of its model. Because tasks issued through brain-computer interaction carry a certain deviation and the workshop environment can change at any time, the mechanical arm model needs some capacity to learn and evolve. However, when each mechanical arm model operates and learns independently, the models evolve unevenly, and there is no guarantee that the mechanical arm group as a whole gradually improves and evolves.

In view of the above problems, no effective technical solution exists at present.

Disclosure of Invention

The application aims to provide a mechanical arm scheduling system, a scheduling method, a replacement method, a device, equipment and a medium, so that the mechanical arms can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm models to gradually improve and evolve, ensuring that the mechanical arm models in a mechanical arm group learn and evolve together.

In a first aspect, the present application provides a robot scheduling system for scheduling a robot group to complete a production job, the system comprising:

the electroencephalogram acquisition module is used for acquiring electroencephalogram signals of the user according to a multi-modal brain-computer interaction paradigm;

the processing calibration module is used for obtaining user planning information by analyzing the electroencephalogram signals;

the dispatching center module is used for planning a plurality of subtasks according to the user planning information and distributing the subtasks to the mechanical arm model;

the mechanical arm model is stored in the mechanical arm and used for controlling the corresponding mechanical arm to perform operation movement according to the subtasks;

the dispatching center module is also used for endowing corresponding reward information to the mechanical arm model according to the condition that each mechanical arm completes the subtasks, and endowing the mechanical arm model with a replacement probability according to the reward information so that the mechanical arm model can be replaced according to the replacement probability.

The mechanical arm scheduling system of the present application combines the electroencephalogram acquisition module and the processing calibration module to acquire user planning information quickly and accurately. The scheduling center module plans a plurality of subtasks according to the user planning information and distributes the corresponding subtasks to the mechanical arm models, so that each mechanical arm model controls its mechanical arm to perform the operation movement. As the mechanical arms progressively complete the subtasks, the scheduling center module gives reward information to the mechanical arm models and gives each mechanical arm model a replacement probability according to that reward information, so that the mechanical arm model is replaced according to the replacement probability. The mechanical arm models can thus be updated iteratively and continuously, so that the mechanical arms can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm models to gradually improve and evolve.

In the mechanical arm scheduling system, the scheduling center module is used for giving a replacement probability to each mechanical arm model according to the sum of the reward information obtained by each mechanical arm model, so that the mechanical arm model is replaced, according to the replacement probability, with the mechanical arm model having the highest sum of reward information.

In the mechanical arm scheduling system, the replacement probability is negatively correlated with the sum of the reward information obtained by the mechanical arm model.

In the mechanical arm scheduling system, the scheduling center module is used for giving a replacement probability to the mechanical arm model each time the mechanical arm completes a preset number of subtasks, so that the mechanical arm model is replaced according to the replacement probability.

In the mechanical arm scheduling system, the scheduling center module is used for giving a replacement probability to the mechanical arm model when every mechanical arm has completed the same number of subtasks, so that the mechanical arm model is replaced according to the replacement probability.

In a second aspect, the present application further provides a robot scheduling method for scheduling a robot group to complete a production job, where the scheduling method includes:

acquiring user planning information, and planning a plurality of subtasks according to the user planning information, wherein the user planning information is acquired by a processing calibration module according to electroencephalogram signal analysis, and the electroencephalogram signal is acquired by an electroencephalogram acquisition module according to a multi-mode brain-computer interactive paradigm;

distributing the subtasks to a mechanical arm model so that the mechanical arm model controls corresponding mechanical arms to perform operation movement according to the subtasks;

giving corresponding reward information to the corresponding mechanical arm model according to how each mechanical arm completes the subtasks;

and giving a replacement probability to the mechanical arm model according to the reward information so that the mechanical arm model is replaced according to the replacement probability.

In the mechanical arm scheduling method of the present application, a plurality of subtasks are planned according to the user planning information and the corresponding subtasks are distributed to the mechanical arm models, so that each mechanical arm model controls its mechanical arm to perform the operation movement. As the mechanical arms progressively complete the subtasks, reward information is given to the mechanical arm models, and each mechanical arm model obtains a replacement probability according to the reward information and is replaced according to that probability. The mechanical arm models can thus be updated iteratively and continuously, so that the mechanical arms can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm models to gradually improve and evolve.

In a third aspect, the present application further provides a robot model replacement method for driving a robot model to perform evolutionary replacement in a production operation, where the replacement method includes the following steps:

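giving corresponding reward information to the corresponding mechanical arm model according to how each mechanical arm completes the subtasks;

and giving a replacement probability to the mechanical arm model according to the reward information so that the mechanical arm model is replaced according to the replacement probability.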

In the mechanical arm model replacement method of the present application, reward information is given to the mechanical arm model during the production operation of the mechanical arm, the mechanical arm model obtains a replacement probability according to the reward information, and the mechanical arm model is replaced according to the replacement probability. The mechanical arm model can thus be updated iteratively and continuously, so that the mechanical arm can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm model to gradually improve and evolve.

In a fourth aspect, the present application further provides a robot model replacing apparatus for driving a robot model to perform evolution replacement in a production operation, the replacing apparatus including:

the reward module is used for endowing corresponding reward information to the corresponding mechanical arm model according to the condition that each mechanical arm completes the subtasks;

and the replacement module is used for endowing the mechanical arm model with a replacement probability according to the reward information so as to enable the mechanical arm model to be replaced according to the replacement probability.

In the mechanical arm model replacement device of the present application, during the production operation of the mechanical arm, the reward module gives reward information to the mechanical arm model, and the replacement module gives the mechanical arm model a replacement probability according to the reward information, so that the mechanical arm model is replaced according to the replacement probability. The mechanical arm model can thus be updated iteratively and continuously, so that the mechanical arm can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm model to gradually improve and evolve.

In a fifth aspect, the present application further provides an electronic device, comprising a processor and a memory, where the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, perform the steps of the method as provided in the third aspect.

In a sixth aspect, the present application also provides a storage medium having a computer program stored thereon, which, when executed by a processor, performs the steps of the method as provided in the third aspect above.

From the above, the present application provides a mechanical arm scheduling system, a scheduling method, a replacement method, a device, equipment and a medium, wherein the mechanical arm scheduling system uses the scheduling center module to plan a plurality of subtasks according to the user planning information and distributes the corresponding subtasks to the mechanical arm models, so that each mechanical arm model controls its mechanical arm to perform the operation movement.

Drawings

Fig. 1 is a schematic structural diagram of a robot scheduling system according to an embodiment of the present disclosure.

Fig. 2 is a flowchart of a method for scheduling a robot according to an embodiment of the present disclosure.

Fig. 3 is a flowchart of a robot model replacement method according to an embodiment of the present disclosure.

Fig. 4 is a schematic structural diagram of a robot model replacement device according to an embodiment of the present application.

Fig. 5 is a schematic structural diagram of a robot scheduling system in embodiment 1 according to an embodiment of the present disclosure.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Reference numerals: 100. an electroencephalogram acquisition module; 200. a processing calibration module; 201. an electroencephalogram processing module; 202. a task calibration module; 300. a scheduling center module; 301. a task planning unit; 302. a holographic sensing unit; 303. a high-precision map unit; 304. a SLAM unit; 305. an intelligent formation unit; 400. a mechanical arm model; 500. a mechanical arm; 600. an electronic device; 601. a processor; 602. a memory; 603. a communication bus; 700. a comprehensive display module; 800. a safety monitoring module; 900. a self-checking maintenance module; 3011. a reward module; 3012. a replacement module.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.

In industrial production, mechanical arm groups consisting of a plurality of mechanical arms are often used for large-scale production operation, and some mechanical arm groups relate to moving and working links in the production operation, so that a corresponding dispatching system is required to complete the large-scale dispatching control of the mechanical arm groups, and a proper mechanical arm model is configured for the mechanical arms, so that each mechanical arm in the mechanical arm groups can smoothly complete the production operation.

In a first aspect, referring to Fig. 1, which shows a mechanical arm scheduling system in some embodiments of the present application for scheduling a mechanical arm group to complete a production job, the system includes:

the electroencephalogram acquisition module 100 is used for acquiring electroencephalogram signals of a user according to the multi-mode brain-computer interaction paradigm;

the processing calibration module 200 is used for acquiring user planning information according to electroencephalogram signal analysis;

the scheduling center module 300 is configured to plan a plurality of subtasks according to the user planning information, and to assign the subtasks to the mechanical arm model 400;

the robot arm model 400 is stored in the robot arm 500, and is used for controlling the corresponding robot arm 500 to perform operation movement according to the subtasks;

the dispatch center module 300 is further configured to assign reward information corresponding to the robot arm model 400 according to the sub-task completion status of each robot arm 500, and assign a replacement probability to the robot arm model 400 according to the reward information, so that the robot arm model 400 is replaced according to the replacement probability.

Specifically, the multi-modal brain-computer interaction paradigm is a fused paradigm formed by combining a conventional electroencephalogram signal with one or more other bioelectric signals, for example a combination of motor imagery and steady-state visual evoked potential (SSVEP), or a combination of P300 and SSVEP. The electroencephalogram acquisition module 100 acquires the electroencephalogram signals of the user according to this multi-modal paradigm, which helps ensure the accuracy of the acquired signals.

Specifically, the electroencephalogram signal carries information that reflects overall brain activity and is acquired directly from the head of the user by the electroencephalogram acquisition module 100.

Specifically, the user planning information is a processing node and/or a displacement node set by the user in the operation scene according to the operation requirement, and is used for guiding the operation of the mechanical arm 500.

Specifically, the scheduling center module 300 plans a global task according to the user planning information and then splits the global task into a plurality of subtasks. The global task may be an operation task to be completed by the whole mechanical arm group, or an operation task to be completed by each mechanical arm 500. When the global task is to be completed by the whole mechanical arm group, the number and types of subtasks allocated to each mechanical arm 500 may be the same or different; preferably, each mechanical arm 500 is allocated subtasks of the same number and types, so that the tasks executed by the mechanical arms 500 are consistent, which makes it easier to compare the quality of the mechanical arm models 400. When the global task is one that each mechanical arm 500 must complete, the number and types of subtasks executed by each mechanical arm 500 are necessarily consistent, which likewise makes it easier to compare the quality of the mechanical arm models 400.

Specifically, the reward information is a reward value given by the scheduling center module 300 according to how the mechanical arm 500 completes the subtasks; the better the mechanical arm 500 completes a subtask, the larger the reward information given by the scheduling center module 300 to the corresponding mechanical arm model 400.

Specifically, the scheduling center module 300 assigns a replacement probability to the mechanical arm model 400 according to the reward information, so that the mechanical arm model 400 is replaced according to the replacement probability. The reward information is used to judge the capability of the mechanical arm 500 in completing subtasks and thereby to evaluate the quality of the corresponding mechanical arm model 400; a replacement probability is then assigned according to that quality, and the mechanical arm model 400 undergoes model replacement probabilistically. The aim is to gradually replace the mechanical arm models 400 of mechanical arms 500 with poor capability, so that the mechanical arms 500 can smoothly complete the production operation, while retaining more of the better mechanical arm models 400 so that the models remain diverse. The mechanical arm group therefore has enough excellent sample models to drive gradual improvement and evolution, and the mechanical arm models 400 in the group can be updated iteratively and continuously. During production operation, each mechanical arm model 400 evolves by comparison, taking the other mechanical arm models 400 in the mechanical arm group as reference objects, so that it gradually adapts to the production operation flow. In other words, the mechanical arm model 400 is corrected and refined using the actual application results of the different mechanical arm models 400, which removes the need for repeated simulation and debugging when the mechanical arm 500 is put into actual use and improves the optimization efficiency of the mechanical arm model 400.

More specifically, replacing the mechanical arm model 400 according to the replacement probability means that the mechanical arm model 400 is replaced with another mechanical arm model 400 with a certain probability and otherwise remains unchanged. The replacement probability is related to the reward information acquired by the mechanical arm model 400, and the reward information comprehensively reflects the capability of the mechanical arm 500 in completing subtasks, such as operation speed, movement speed and position accuracy. When several mechanical arms 500 complete the same subtask, the mechanical arm model 400 of the mechanical arm 500 that obtains the higher reward information is the higher-quality model for that subtask. The mechanical arm scheduling system of the embodiment of the application assigns a replacement probability based on the quality of the mechanical arm model 400, and the mechanical arm model 400 determines whether to evolve based on that probability. For example, if the replacement probability is 50%, the mechanical arm model 400 has a 50% probability of being replaced with another mechanical arm model 400 and a 50% probability of remaining the original mechanical arm model 400.
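The probabilistic replacement decision described above can be illustrated with a minimal Python sketch. The function name `maybe_replace` and its parameters are hypothetical and are not part of the application; the sketch only assumes that the decision is a single random draw compared against the replacement probability.

```python
import random


def maybe_replace(current_model, candidate_model, replacement_probability, rng=random):
    """Return candidate_model with the given probability, otherwise keep current_model.

    All names here are illustrative assumptions, not identifiers from the application.
    """
    if not 0.0 <= replacement_probability <= 1.0:
        raise ValueError("replacement_probability must be in [0, 1]")
    # rng.random() is uniform on [0.0, 1.0), so this branch is taken with
    # probability equal to replacement_probability.
    if rng.random() < replacement_probability:
        return candidate_model
    return current_model
```

With a replacement probability of 0.5, roughly half of the calls return the candidate model and the rest keep the original model, which matches the 50% example in the text.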

Specifically, the mechanical arm model 400 may be replaced with a new mechanical arm model 400, with the mechanical arm model 400 that obtained the highest reward information for the current subtask, or with the mechanical arm model 400 that has obtained the highest accumulated reward information. This gives the mechanical arm models 400 in the mechanical arm group both commonality and diversity, so that the mechanical arms 500 can smoothly complete the same or similar operation movements while retaining the diversity that drives the mechanical arm models 400 to keep evolving.

The mechanical arm scheduling system of the embodiment of the application combines the electroencephalogram acquisition module 100 and the processing calibration module 200 to acquire user planning information quickly and accurately. The scheduling center module 300 plans a plurality of subtasks according to the user planning information and assigns the corresponding subtasks to the mechanical arm models 400, so that each mechanical arm model 400 controls its mechanical arm 500 to perform the operation movement. As the mechanical arms 500 progressively complete the subtasks, the scheduling center module 300 gives reward information to the mechanical arm models 400 and gives each mechanical arm model 400 a replacement probability according to the reward information, so that the mechanical arm model 400 is replaced according to the replacement probability. The mechanical arm models 400 can thus be updated iteratively and continuously, so that the mechanical arms 500 can smoothly complete the same or similar operation movements while enough excellent samples remain to drive the mechanical arm models 400 to gradually improve and evolve, ensuring that the mechanical arm models 400 in the mechanical arm group learn and evolve together.
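The three replacement targets listed above can be sketched as a small selection routine. This is only an illustration: the enum `Source`, the function `choose_replacement`, the reward dictionaries and the `make_new_model` callback are all hypothetical names introduced here, and the application does not say how a new mechanical arm model would be produced.

```python
from enum import Enum


class Source(Enum):
    NEW_MODEL = "new"                      # replace with a newly supplied model
    BEST_CURRENT_SUBTASK = "best_current"  # highest reward on the current subtask
    BEST_ACCUMULATED = "best_accumulated"  # highest accumulated reward


def choose_replacement(source, latest_rewards, total_rewards, make_new_model=None):
    """Pick the model that a replaced mechanical arm model would be swapped for.

    latest_rewards and total_rewards map a model id to a reward value.
    """
    if source is Source.NEW_MODEL:
        if make_new_model is None:
            raise ValueError("make_new_model is required for Source.NEW_MODEL")
        return make_new_model()
    if source is Source.BEST_CURRENT_SUBTASK:
        return max(latest_rewards, key=latest_rewards.get)
    if source is Source.BEST_ACCUMULATED:
        return max(total_rewards, key=total_rewards.get)
    raise ValueError(f"unknown source: {source!r}")
```

The two reward-based options can disagree: a model may win the current subtask while another model still leads on accumulated reward.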

In addition, the robot arm model 400 is replaced according to the replacement probability, so that the robot arm models 400 in the robot arm group can gradually screen out the robot arm models 400 with poor operation capability in the production operation, and diversity is maintained to drive the robot arm models 400 to continuously evolve.

In some preferred embodiments, the dispatch center module 300 is configured to assign a replacement probability to the robotic arm model 400 based on the sum of the reward information obtained for each robotic arm model 400, such that the robotic arm model 400 is replaced with the robotic arm model 400 having the highest sum of the reward information obtained based on the replacement probability.

Specifically, each time the mechanical arm 500 completes a subtask, the mechanical arm model 400 obtains reward information that matches how the mechanical arm 500 performed in completing that subtask; the reward information therefore reflects the capability of the mechanical arm 500 only for the current subtask. A mechanical arm model 400 performs differently when controlling the mechanical arm 500 through different subtasks, so a model that performs poorly in one subtask may perform well in another. Assessing the quality of the mechanical arm model 400 from the reward information of a single subtask would therefore be too one-sided, so the scheduling center module 300 uses the sum of the reward information obtained by each mechanical arm model 400 as the basis for the replacement probability, in order to judge the quality of the mechanical arm model 400 more accurately.

More specifically, after each of the robots 500 in the robot group completes the same number of subtasks, the robot model 400 with the highest total sum of the acquired reward information represents the current best quality robot model 400 in the robot group, and the scheduling center module 300 sets a corresponding replacement probability for the corresponding robot model 400 according to the respective total sums of the reward information of the remaining robot models 400, so that the remaining robot models 400 are probabilistically replaced with the current best quality robot model 400 according to the corresponding replacement probability.

More specifically, after the mechanical arm model 400 that currently has the highest sum of reward information goes on to execute one or more further subtasks, its sum may no longer be the highest. In that case the designation of current best-quality model passes from the previous best model to whichever mechanical arm model 400 now has the highest sum of reward information. The replacement target is thus adjusted dynamically and is always the current best-quality mechanical arm model 400, so that the mechanical arm models 400 in the mechanical arm group continuously evolve toward the best-quality mechanical arm model 400.
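As a rough illustration of how the sum of reward information could be tracked and how the best-quality model can change over time, consider the following Python sketch. The class `RewardLedger` and the model ids are hypothetical; the application does not prescribe any particular data structure.

```python
from collections import defaultdict


class RewardLedger:
    """Accumulates reward information per mechanical arm model (illustrative only)."""

    def __init__(self):
        self.totals = defaultdict(float)  # model id -> sum of rewards so far
        self.counts = defaultdict(int)    # model id -> number of rewarded subtasks

    def record(self, model_id, reward):
        """Add the reward obtained for one completed subtask."""
        self.totals[model_id] += reward
        self.counts[model_id] += 1

    def best_model(self):
        """Return the model id with the highest reward sum.

        Raises ValueError if no reward has been recorded yet.
        """
        return max(self.totals, key=self.totals.get)


ledger = RewardLedger()
ledger.record("m1", 0.9)
ledger.record("m2", 0.4)
first_best = ledger.best_model()   # "m1" leads after one subtask each
ledger.record("m1", 0.1)
ledger.record("m2", 0.8)
second_best = ledger.best_model()  # "m2" now has the larger sum
```

Here the replacement target moves from `m1` to `m2` once `m2` overtakes it on accumulated reward, which is the dynamic adjustment described above.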

In some preferred embodiments, the replacement probability is inversely related to the sum of the reward information obtained by the robotic arm model 400.

Specifically, on the one hand, one purpose of replacing the mechanical arm model 400 according to the replacement probability is to retain more of the relatively good model samples. The replacement probability of a mechanical arm model 400 is therefore negatively correlated with the sum of the reward information it has obtained: a poorly performing mechanical arm model 400 has a greater probability of being replaced by the current best-quality mechanical arm model 400, and a better performing mechanical arm model 400 has a smaller probability of being so replaced. Inferior mechanical arm models 400 in the mechanical arm group are thus gradually eliminated and replaced by good ones, and the replacement and evolution process resembles evolution by competitive selection. On the other hand, another purpose of probabilistic replacement is to preserve the diversity of the evolution of the mechanical arm models 400. Each mechanical arm model 400 has operations it handles well and operations it handles poorly, and a mechanical arm model 400 that currently performs poorly may turn out to be the best-quality mechanical arm model 400 in a subsequent operation. Deciding replacement by probability therefore leaves such mechanical arm models 400 a certain chance of remaining unchanged, which preserves the diversity of the evolution of the mechanical arm models 400.
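The text only requires that the replacement probability be negatively correlated with the sum of reward information; it does not give a formula. One possible mapping, shown purely as an assumption, is a linear scale between the best and the worst reward sum. The function name and the `p_max` cap are hypothetical.

```python
def replacement_probabilities(reward_totals, p_max=0.9):
    """Map reward sums to replacement probabilities (assumed linear mapping).

    The model with the highest sum gets probability 0.0, the model with the
    lowest sum gets p_max, and the others fall linearly in between.
    """
    if not 0.0 <= p_max <= 1.0:
        raise ValueError("p_max must be in [0, 1]")
    best = max(reward_totals.values())
    worst = min(reward_totals.values())
    spread = best - worst
    if spread == 0:
        # All models are equally good, so there is nothing to replace.
        return {model_id: 0.0 for model_id in reward_totals}
    return {
        model_id: p_max * (best - total) / spread
        for model_id, total in reward_totals.items()
    }
```

Keeping `p_max` below 1.0 leaves even the worst model some chance of surviving a round, which is one way to retain the diversity the text asks for. Other decreasing mappings would satisfy the same requirement.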

In some preferred embodiments, the dispatch center module 300 is configured to assign a replacement probability to the robot arm model 400 every time the robot arm 500 completes a preset number of subtasks, so that the robot arm model 400 is replaced according to the replacement probability.

Specifically, the replacement action of the robot model 400 may be performed when the robot 500 completes one subtask, or may be performed when the robot 500 completes a plurality of subtasks, that is, the time interval of the evolution of the robot model 400 is set based on the number of times the robot 500 completes the subtasks, and corresponds to the time when the robot model 400 obtains the corresponding reward information.

More specifically, the evolution time interval corresponds to a preset number of subtasks set by the scheduling center module 300; the smaller the preset number, the higher the evolution frequency of the mechanical arm model 400. In this embodiment the preset number is preferably one, that is, after the mechanical arm 500 completes each subtask, a probabilistic replacement decision is made to determine whether the mechanical arm model 400 is to be replaced by the current best-quality mechanical arm model 400.

In some preferred embodiments, the dispatch center module 300 is configured to assign a replacement probability to the robot arm model 400 when each robot arm 500 completes the same number of subtasks, such that the robot arm model 400 is replaced according to the replacement probability.

Specifically, the scheduling center module 300 gives the replacement probability to the mechanical arm model 400 based on the sum of the reward information obtained by that model. To reflect the quality of the mechanical arm models 400 more objectively, the scheduling center module 300 therefore gives the replacement probabilities only when every mechanical arm 500 has completed the same number of subtasks. This ensures that each mechanical arm model 400 has received the same number of pieces of reward information, so that the sums of reward information more accurately reflect the relative quality of the mechanical arm models 400 in the mechanical arm group.
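The two timing conditions in these embodiments (a preset number of subtasks per evolution interval, and all mechanical arms having completed the same number of subtasks) can be combined in a small gate. The class `EvolutionGate` and its methods are hypothetical names used only for this sketch.

```python
class EvolutionGate:
    """Signals when a replacement round is due (illustrative sketch).

    A round is due when every registered arm has completed the same number
    of subtasks and that number is a new multiple of preset_number.
    """

    def __init__(self, arm_ids, preset_number=1):
        if preset_number < 1:
            raise ValueError("preset_number must be at least 1")
        self.preset_number = preset_number
        self.completed = {arm_id: 0 for arm_id in arm_ids}
        self.last_round_at = 0  # subtask count at which the last round ran

    def subtask_done(self, arm_id):
        """Record one completed subtask and report whether a round is due."""
        self.completed[arm_id] += 1  # KeyError for an unregistered arm
        return self.round_due()

    def round_due(self):
        counts = set(self.completed.values())
        if len(counts) != 1:
            return False  # arms have not yet completed the same number
        (count,) = counts
        if count == self.last_round_at or count % self.preset_number != 0:
            return False
        self.last_round_at = count
        return True
```

With `preset_number=1`, the preferred value in the text, a round becomes due each time all arms have finished one more subtask.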

In some preferred embodiments, the subtasks include a sub-movement task and a sub-action task.

Specifically, the robot arm group is a group consisting of movable robot arms 500, the sub-movement task is a task in which the robot arms 500 move, and the sub-action task is a task in which the robot arms 500 change in action.

More specifically, in the process of executing the sub-movement task by the robot 500, the dispatching center module 300 gives corresponding reward information according to the movement speed, obstacle avoidance behavior, and the like of the robot 500; in the process of executing the sub-action task by the robot 500, the dispatching center module 300 gives corresponding reward information according to the obstacle avoidance behavior, the action accuracy, the operation completion degree, and the like of the robot 500.

In some preferred embodiments, the user planning information includes destination information and must-pass waypoint information.

The destination information identifies a work site, that is, a place where the robot arm 500 performs an operation movement; a complete production operation task has more than one work site. The user sets a plurality of pieces of destination information, that is, a plurality of work sites, by means of electroencephalogram signals, so that the robot arm 500 performs an operation, namely a sub-action task, after reaching each work site. The must-pass waypoint information identifies a position point through which the robot arm 500 must pass while moving. In a complete production operation task the robot arm 500 has to travel to a work site before operating there, and the user sets a plurality of pieces of must-pass waypoint information, that is, a plurality of path points for the robot arm 500, by means of electroencephalogram signals, to guide the planning of its displacement path.

Specifically, the global task is planned and set based on destination information and must-pass waypoint information, the global task comprises the moving path of the mechanical arm group and the content of a working place, so that the mechanical arm group can complete the whole production operation task assigned by a user, wherein the destination information defines the working place, and the must-pass waypoint information defines the path of moving to the working place.

More specifically, the global task is divided into a plurality of subtasks based on the destination information and the must-pass waypoint information. Because the must-pass waypoint information identifies position points through which the robot arm 500 must pass, the movement path is divided at these points into a plurality of sub-movement tasks, so that the robot arm 500 completes one subtask each time it travels a certain distance. The robot arm model 400 can therefore undergo several probabilistic replacement determinations along a single movement path, and the robot arm models 400 in the robot arm group evolve gradually while moving. In addition, because the destination information defines a work site at which the robot arm 500 may repeat a production operation many times, each completion of one repeated procedure at the work site is defined as a sub-action task. The robot arm model 400 can therefore obtain replacement probabilities and undergo replacement determinations several times during the repeated production operations at the work site, and the robot arm models 400 in the robot arm group evolve gradually during those operations.

More specifically, the moving path in the global task is divided into multiple path segments based on the must-pass waypoint information, but some path segments may still be too long. A path threshold is therefore set in the dispatch center module 300: if the length of a path segment exceeds the path threshold, the segment is divided again into equal parts until the length of each part is below the path threshold. The sub-movement tasks corresponding to the path segments are thereby kept within a suitable length, so that the probabilistic replacement intervals of the robot arm models 400 in the robot arm group are more uniform.
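The threshold-based subdivision can be illustrated with the short sketch below. The number of equal parts per division is not specified in this embodiment; the sketch assumes each over-long piece is halved, and the names `split_segment` and `plan_sub_movement_tasks` are hypothetical.

```python
def split_segment(length, threshold):
    """Halve a path segment repeatedly until no piece exceeds the threshold.

    Returns the list of piece lengths. Halving is an assumption; the text
    only requires division into equal parts.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    pieces = [length]
    # All pieces are equal, so checking the first one is sufficient.
    while pieces[0] > threshold:
        pieces = [p / 2 for p in pieces for _ in range(2)]
    return pieces


def plan_sub_movement_tasks(segment_lengths, threshold):
    """Turn waypoint-delimited segment lengths into sub-movement task lengths."""
    tasks = []
    for length in segment_lengths:
        tasks.extend(split_segment(length, threshold))
    return tasks
```

For instance, with a threshold of 3, a segment of length 10 is halved twice into four pieces of length 2.5, while a segment of length 2 is left as it is.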

In a second aspect, referring to fig. 2, some embodiments of the present application provide a robot arm scheduling method for scheduling a robot arm group to complete a production job, the scheduling method including:

S101, obtaining user planning information and planning a plurality of subtasks according to the user planning information;

the user planning information is obtained by the processing calibration module 200 analyzing electroencephalogram signals, and the electroencephalogram signals are acquired by the electroencephalogram acquisition module 100 according to a multi-modal brain-computer interaction paradigm;

S102, assigning the subtasks to the robot arm model 400, so that the robot arm model 400 controls the robot arm 500 to perform operation movements according to the subtasks;

S103, assigning corresponding reward information to the corresponding robot arm model 400 according to how each robot arm 500 completes its subtasks;

S104, assigning a replacement probability to the robot arm model 400 according to the reward information, so that the robot arm model 400 is replaced according to the replacement probability.

In the robot arm scheduling method of the embodiment of the present application, a plurality of subtasks are planned according to the user planning information and assigned to the robot arm models 400, and each robot arm model 400 controls its robot arm 500 to perform operation movements. As the robot arms 500 progressively complete the subtasks, reward information is assigned to the robot arm models 400, and each robot arm model 400 obtains a replacement probability according to its reward information and is replaced according to that probability. The robot arm models 400 are thus continuously updated and iterated, and while the robot arms 500 successfully complete the same or similar operation movements, enough high-quality samples are obtained to drive the robot arm models 400 to improve and evolve step by step.

As an embodiment, the robot scheduling method is preferably implemented by using the robot scheduling system according to the first aspect.

In a third aspect, referring to fig. 3, some embodiments of the present application provide a robot arm model replacement method for driving the robot arm model 400 to perform evolutionary replacement in a production operation, the replacement method including the following steps:

S201, assigning corresponding reward information to the corresponding robot arm model 400 according to how each robot arm 500 completes its subtasks;

S202, assigning a replacement probability to the robot arm model 400 according to the reward information, so that the robot arm model 400 is replaced according to the replacement probability.

In the robot arm model replacement method of the embodiment of the present application, reward information is assigned to the robot arm model 400 during the production operation of the robot arm 500, and the robot arm model 400 obtains a replacement probability according to the reward information and is replaced according to that probability. The robot arm model 400 is thus continuously updated and iterated, and while the robot arm 500 successfully completes the same or similar operation movements, enough high-quality samples are obtained to drive the robot arm model 400 to improve and evolve step by step.

In existing evolution schemes for multiple robot arm models, the robot arm 500 is generally run to obtain its actual operating parameters, which it feeds back to an upper computer; the upper computer corrects the robot arm model 400 in a simulation platform according to these parameters, re-runs the simulation, and issues the corrected parameters to the robot arm 500 to realize its corrective evolution. Such schemes collect actual operating parameters from multiple robot arms 500 to comprehensively correct the design parameters of the robot arm model 400, but they suffer from a large amount of computation, a single evolution direction, monotonous models, frequent data transmission, and similar drawbacks. In the robot arm model replacement method of the embodiment of the present application, a replacement probability is assigned according to the reward information obtained by the robot arm model 400, and the robot arm model 400 is replaced according to that probability. The process does not need to track actual operating parameters, perform simulation correction based on them, or send correction data to the corresponding robot arm 500. This effectively reduces the amount of computation, preserves the morphological and evolutionary diversity of the robot arm models 400, and effectively reduces the frequency of data transmission.

In some preferred embodiments, step S202 includes the following sub-steps:

S2021, calculating the sum of the reward information obtained by each robot arm model 400;

S2022, defining the robot arm model 400 with the highest sum of obtained reward information as the best-quality robot arm model 400;

S2023, assigning a corresponding replacement probability to each robot arm model 400 according to the sum of the reward information it has obtained;

S2024, driving the robot arm model 400 of each robot arm 500 to be replaced by the best-quality robot arm model 400 according to its own replacement probability.

Specifically, according to the robot arm model replacement method of the embodiment of the present application, the sum of reward information obtained by each robot arm model 400 is used as a reference basis for the replacement probability, so as to more accurately judge the advantages and disadvantages of the robot arm model 400.

In some preferred embodiments, the replacement probability in step S2023 is negatively correlated with the sum of the reward information obtained by the corresponding robot arm model 400.

Specifically, because the replacement probability is negatively correlated with the sum of the reward information obtained by the corresponding robot arm model 400, inferior robot arm models 400 in the robot arm group are gradually eliminated and replaced by superior ones. The replacement and evolution of the robot arm models 400 thus resembles evolution by competitive selection, while each robot arm model 400 retains some probability of remaining unchanged, which preserves the diversity of the robot arm models 400 during evolution.
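Steps S2021 to S2024 can be sketched as one replacement round, as shown below. This is an illustrative sketch rather than the implementation of the embodiment: the function `replace_models`, its data layout, and the caller-supplied `probability_of` function (which should return a value in [0, 1] that decreases as the reward sum increases) are all hypothetical.

```python
import random


def replace_models(models, rewards, probability_of, rng=None):
    """Run one replacement round over a group of robot arm models.

    models:         dict arm id -> model object currently held by that arm.
    rewards:        dict arm id -> list of reward values obtained so far.
    probability_of: callable (own_total, all_totals) -> replacement probability.
    rng:            optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    # S2021: sum of reward information per model.
    totals = {arm: sum(values) for arm, values in rewards.items()}
    # S2022: the model with the highest sum is the best-quality model.
    best_arm = max(totals, key=totals.get)
    # S2023: one replacement probability per model.
    probs = {arm: probability_of(totals[arm], totals) for arm in totals}
    # S2024: each model is replaced by the best one with its own probability.
    new_models = dict(models)
    for arm, p in probs.items():
        if arm != best_arm and rng.random() < p:
            new_models[arm] = models[best_arm]
    return new_models, best_arm, probs
```

Because the draw is random, a model with a low reward sum is likely, but not certain, to be replaced, which is what preserves diversity in the group.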

As an embodiment, the robot arm model replacement method is preferably implemented by using the dispatch center module 300 in the robot arm scheduling system according to the first aspect.

In a fourth aspect, referring to fig. 4, some embodiments of the present application provide a robot arm model replacement apparatus for driving the robot arm model 400 to perform evolutionary replacement in a production operation, the replacement apparatus including:

a reward module 3011, configured to assign corresponding reward information to the corresponding robot arm model 400 according to how each robot arm 500 completes its subtasks;

a replacement module 3012, configured to assign a replacement probability to the robot arm model 400 according to the reward information, so that the robot arm model 400 is replaced according to the replacement probability.

In the robot model replacement apparatus according to the embodiment of the present application, in the production process of the robot 500, the reward module 3011 is used to give reward information to the robot model 400, and the replacement module 3012 is used to make the robot model 400 obtain a replacement probability according to the reward information, so that the robot model 400 is replaced according to the replacement probability, so that the robot model 400 can continuously perform update iteration, and when the robot 500 can smoothly complete the same or similar operation, there are enough excellent samples to drive the robot model 400 to gradually improve and evolve.

In a fifth aspect, referring to fig. 6, which is a schematic structural diagram of an electronic device according to an embodiment of the present application, the present application provides an electronic device 600 including a processor 601 and a memory 602. The processor 601 and the memory 602 are interconnected and communicate with each other via a communication bus 603 and/or another form of connection mechanism (not shown). The memory 602 stores a computer program executable by the processor 601; when the electronic device 600 is running, the processor 601 executes the computer program to perform the method of any optional implementation of the embodiment of the third aspect.

In a sixth aspect, the present application provides a storage medium storing a computer program which, when executed by a processor, performs the method in any optional implementation of the embodiment of the third aspect. The storage medium may be implemented by any type of volatile or nonvolatile storage device or combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic Memory, a flash Memory, a magnetic disk, or an optical disk.

Example 1

Referring to fig. 5, fig. 5 is a schematic structural diagram of a robot scheduling system in embodiment 1 according to an embodiment of the present disclosure.

The scheduling system includes: the system comprises an electroencephalogram acquisition module 100, a processing calibration module 200, a dispatching center module 300 and a mechanical arm group.

Specifically, the scheduling system is further explained below taking a cargo-grabbing production operation scene as an example. In this embodiment, the processing calibration module 200 includes an electroencephalogram processing module 201 and a task calibration module 202. The scheduling system is further provided with a comprehensive display module 700 for the user to observe and operate, which displays an operation interface used together with the electroencephalogram acquisition module 100 to acquire electroencephalogram signals. The user wears the electroencephalogram acquisition module 100 and, through preset left- and right-hand motor imagery, controls by brain-computer interaction the sliding of the cargo selection interface in the operation interface of the comprehensive display module 700, thereby selecting the type of cargo to be grabbed by the robot arm 500. That is, after the electroencephalogram acquisition module 100 acquires the electroencephalogram signals, the electroencephalogram processing module 201 decodes them to obtain the user's operation intention, and the cargo selection interface of the comprehensive display module 700 slides left or right accordingly to determine the type of cargo to be grabbed.

After determining the type of cargo to be grabbed by the robot arm 500, the user sets the destination information and must-pass waypoint information for transporting the cargo through a preset SSVEP (steady-state visual evoked potential) brain-computer interaction paradigm.

Specifically, the user views flashes of different frequencies emitted by the navigation interface of the comprehensive display module 700. Flicker stimulation and left- and right-hand motor imagery produce different types of electroencephalogram signals, which the electroencephalogram processing module 201 decodes to determine the user's intention. Flicker stimuli of different frequencies represent different destination information and must-pass waypoint information. For example, when the user attends to a 6.67 Hz flicker stimulus, the destination selected by the user is point C; when the user attends to a 7 Hz flicker stimulus, the must-pass waypoint selected by the user is point B₁. The remaining destination information and must-pass waypoint information are determined by analogy.
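The correspondence between flicker frequency and selected target can be illustrated with a simple lookup. The sketch below assumes that an upstream decoder has already estimated the dominant stimulus frequency from the electroencephalogram signal; the table `STIMULUS_TARGETS`, the function `decode_target`, and the tolerance value are hypothetical, and only the two frequencies named in the example above are included.

```python
# Flicker frequency in Hz -> (kind of target, label), from the example in the text.
STIMULUS_TARGETS = {
    6.67: ("destination", "C"),
    7.0: ("waypoint", "B1"),
}


def decode_target(detected_hz, tolerance_hz=0.1):
    """Map a detected stimulus frequency to the target it represents.

    Returns None when no configured frequency lies within the tolerance,
    so that an unclear detection does not select a target by accident.
    """
    nearest = min(STIMULUS_TARGETS, key=lambda f: abs(f - detected_hz))
    if abs(nearest - detected_hz) > tolerance_hz:
        return None
    return STIMULUS_TARGETS[nearest]
```

The tolerance guards against selecting a target when the detected frequency does not clearly match any stimulus; a practical value would depend on the frequency resolution of the decoder.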

After the electroencephalogram processing module 201 decodes destination information and must-pass waypoint information corresponding to the electroencephalogram signals, the task calibration module 202 marks the destination and the waypoint in the high-precision map according to the destination information and the must-pass waypoint information; wherein the high-precision map is obtained by measurement in advance.

The task planning unit 301 of the dispatch center module 300 plans the global task according to the high-precision map labeled with the destination and the waypoint, divides the global task into a plurality of subtasks, wherein the subtasks include a subtask about goods lifting, a subtask about goods dropping, and a subtask of a plurality of path segments divided by the waypoint, and performs sorting and packaging processing on the subtasks.

After dividing the subtasks, the task planning unit 301 of the dispatching center module 300 sends the subtasks after the sorting and packaging processing to the mechanical arm model 400 corresponding to the mechanical arm 500, so that the mechanical arm 500 gradually executes the subtasks.

While the robot arm 500 executes subtasks, the dispatch center module 300 assigns corresponding reward information to the robot arm 500 each time it completes a subtask. The reward information obtained by the robot arm model 400 for completing one subtask is $r_i$, where $i$ is the serial number of the robot arm; for example, the reward information obtained when robot arm No. 1 completes one subtask is $r_1$.

When the robot arm 500 has completed $k$ subtasks, the sum of the reward information it has obtained is $R_i$. The sum of the reward information obtained when the robot arm 500 completes $k+1$ subtasks is then:

$$R_i \leftarrow R_i + r_i$$

that is, the original $R_i$ is increased by $r_i$ and the result is assigned back to $R_i$.

In addition, taking the movement of the robot arm 500 as an example, the reward information obtained by the robot arm 500 in performing one subtask is

$$r_i = \sum_{t=0}^{T} \gamma^{t} r_t$$

where $i$ is the serial number of the robot arm; $T$ is the step length of the robot arm 500, a positive integer, which may be the moving distance of the robot arm 500 (for example $T = 10$ m) or the number of segments into which a moving distance is divided (for example $T = 10$ segments); $t = 0, 1, 2, 3, \ldots, T$; $\gamma$ is the discount coefficient; and $r_t$ is the instantaneous reward value, namely the reward given to the robot arm 500 for its action pose relative to the state space during the displacement.

That is, the reward information $r_i$ is obtained by dividing the step length $T$ into a plurality of segments, giving a corresponding instantaneous reward value $r_t$ for the action pose in each segment, and then summing the values $\gamma^{t} r_t$. Because the robot arm 500 is a learning-type robot arm, its later movement within the range $T$ reflects the quality of the robot arm 500; $r_t$ is therefore used as the instantaneous reward value, and the discount coefficient $\gamma$ raised to the power $t$ is used for the compensation calculation.

In addition, in other embodiments a setback penalty may also be added, such that

$$r_i = \sum_{t=0}^{T} \gamma^{t} r_t + p_i$$

where $\gamma$ is the discount coefficient, $r_t$ is the instantaneous reward value, and $p_i$ is the penalty value. If a robot arm 500 stops moving or is significantly delayed during the displacement, a corresponding penalty value $p_i$ is assigned to that robot arm 500; $p_i$ is set according to the time taken by the robot arm 500 to complete the subtask.
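The reward calculation for one sub-movement task can be sketched as follows: the instantaneous reward of segment t is weighted by the discount coefficient raised to the power t, the weighted values are summed, and an optional penalty value is added. The function name `subtask_reward` is hypothetical, and representing the penalty as a non-positive number added to the sum is an assumption made for the sketch.

```python
def subtask_reward(instant_rewards, gamma, penalty=0.0):
    """Discounted reward for one subtask.

    instant_rewards: instantaneous reward values for segments t = 0, 1, 2, ...
    gamma:           discount coefficient.
    penalty:         penalty value added to the sum (assumed <= 0 when the
                     arm stalled or was significantly delayed, else 0).
    """
    discounted = sum((gamma ** t) * r for t, r in enumerate(instant_rewards))
    return discounted + penalty
```

For example, three segments each rewarded 1.0 with a discount coefficient of 0.5 give 1.0 + 0.5 + 0.25 = 1.75, and a penalty value of -0.5 reduces this to 1.25.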

Each robot arm model 400 has its own reward information sum $R_i$. To facilitate comparison, the reward information sum of each robot arm model 400 is normalized as follows:

$$\bar{R}_i = \frac{R_i - R_{\min}}{R_{\max} - R_{\min}}$$

where $\bar{R}_i$ is the normalized reward information sum of the robot arm model 400, $R_{\max}$ is the largest reward information sum held by any robot arm model 400 in the robot arm group when the robot arms 500 have completed the current number of subtasks, and $R_{\min}$ is the smallest reward information sum held by any robot arm model 400 in the robot arm group at that time.
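The normalization can be sketched as a min-max rescaling using the largest and smallest reward information sums in the group. The function name `normalize_totals` is hypothetical, and the handling of the degenerate case in which all sums are equal (every model is mapped to 0.0) is an assumption not addressed in this embodiment.

```python
def normalize_totals(totals):
    """Min-max normalize reward sums to the range [0, 1].

    totals: dict arm id -> reward information sum of that arm's model.
    """
    lo = min(totals.values())
    hi = max(totals.values())
    if hi == lo:
        # Assumption: with identical sums no model is better than another.
        return {arm: 0.0 for arm in totals}
    return {arm: (value - lo) / (hi - lo) for arm, value in totals.items()}
```

After this step the best model in the group has the value 1.0 and the worst has 0.0, whatever the absolute scale of the rewards.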

After the reward information sums of the robot arm models 400 are normalized, the dispatch center module 300 sets a corresponding replacement probability $P_i$ for each robot arm model 400 according to its normalized value. $P_i$ is determined using $\mu$, the evolution parameter preset by the user, and $\bar{R}_{\max}$, the normalized reward information sum of the robot arm model 400 that currently has the largest reward information sum.

The probability that the robot arm model 400 remains unchanged is $1 - P_i$. That is, when a robot arm 500 completes a subtask, its robot arm model 400 is replaced with probability $P_i$ by the robot arm model 400 that currently has the largest reward information sum, and remains unchanged with probability $1 - P_i$. In this way the robot arm models 400 of the mobile robot arm group continuously undergo survival-of-the-fittest evolution during actual operation, and are continuously optimized, updated and iterated.
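As an illustration of how such a replacement probability might be computed, the sketch below assumes one simple form that is consistent with the description: the evolution parameter multiplied by the gap between the best normalized reward sum in the group and the model's own normalized reward sum, clamped to [0, 1]. This form, and the function name `replacement_probabilities`, are assumptions made for the example and should not be read as the expression used in this embodiment.

```python
def replacement_probabilities(normalized, mu):
    """Assumed form: P = mu * (best normalized sum - own normalized sum).

    normalized: dict arm id -> normalized reward information sum.
    mu:         evolution parameter preset by the user.
    The result is clamped to [0, 1] so it is always a valid probability.
    """
    best = max(normalized.values())
    return {
        arm: min(1.0, max(0.0, mu * (best - value)))
        for arm, value in normalized.items()
    }
```

Under this assumed form the best model is never replaced, and the probability of replacement grows as a model falls further behind the best one, which matches the negative correlation described earlier.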

Specifically, the comprehensive display module 700 in this embodiment is configured to display an image to cooperate with the electroencephalogram acquisition module 100 to acquire an electroencephalogram signal, where the display image includes a prompt stimulation display image for acquiring an electroencephalogram, a waypoint information display image intended to be selected by a user, a mechanical arm 500 motion information display image, a high-precision map display image, a safety monitoring information display image, a self-checking maintenance interface display image, and the like.

Specifically, the dispatch center module 300 considers all of the robot arms 500 as a system to simultaneously plan the paths of all of the robot arms 500 when performing global mission planning, based on considering various collision possibilities between each of the moving robot arms 500 and minimizing the total system runtime.

More specifically, the dispatch center module 300 of the dispatch system includes:

holographic perception unit 302, high-precision map unit 303, intelligent formation unit 305, SLAM unit 304 and task planning unit 301.

The task planning unit 301 is configured to plan sub-tasks and assign the sub-tasks to the robot arm model 400, and is configured to assign reward information corresponding to the robot arm model 400 according to a situation that each robot arm 500 completes the sub-tasks, and to assign a replacement probability to the robot arm model 400 according to the reward information, so that the robot arm model 400 is replaced according to the replacement probability.

The high-precision map unit 303 is used for establishing a high-precision map, so that the mission planning unit 301 can perform global mission planning by combining the destination information, the necessary waypoint information and the high-precision map.

The intelligent formation unit 305 is used for grouping and forming the mechanical arm groups. The intelligent formation unit 305 divides the mechanical arms 500 formed into one team into two task roles of a pilot and a follower, wherein at least one pilot is arranged in one team, and the mechanical arms 500 serving as the followers keep a fixed relative distance and a fixed relative angle with the pilot according to program setting, so that the overall planning control of the mechanical arms 500 is facilitated.
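The pilot-follower relationship can be illustrated by computing the position a follower should hold from the pilot's pose. The sketch assumes a planar map, a pilot heading measured in radians, and a relative angle measured from the pilot's heading; the function name `follower_target` and these conventions are hypothetical.

```python
import math


def follower_target(pilot_x, pilot_y, pilot_heading, distance, relative_angle):
    """Target position of a follower that keeps a fixed offset from the pilot.

    distance:       fixed relative distance to the pilot.
    relative_angle: fixed angle of the follower relative to the pilot's
                    heading, in radians (assumed convention).
    """
    bearing = pilot_heading + relative_angle
    return (
        pilot_x + distance * math.cos(bearing),
        pilot_y + distance * math.sin(bearing),
    )
```

A follower assigned a distance of 2 and a relative angle of pi, for example, holds a position 2 units directly behind the pilot, and this target moves and rotates with the pilot.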

The holographic sensing unit 302 and the SLAM unit 304 are both used to acquire map data in real time to correct a high precision map.

In addition, the scheduling system further includes:

a safety monitoring module 800, configured to monitor the scheduling scene and the scheduling process of the robot arms 500 in real time. The safety monitoring module 800 creates a digital space communicatively interconnected with the real space to facilitate monitoring of the robot arms 500 and the scene environment. By establishing physical models of the actual robot arms 500 and the environment in the digital space and updating the data in real time from sensors, the scheduling system stays fully synchronized with the actual scene and the actual state of the mobile robot arms 500. The user can monitor the whole scheduling process in real time, the information data of the whole system is recorded and stored in real time, the motion information and abnormal-state information of the mobile robot arms 500 are fed back, and an alarm is raised when an abnormal state is detected. The safety monitoring module 800 displays the monitoring status through the comprehensive display module 700.

a self-checking maintenance module 900, configured to analyze and evaluate, after the robot arm group completes a task, the current states of the robot arms 500 and the scene environment according to sensor data and historical information, and to determine whether maintenance is needed in preparation for the next scheduling task. The self-checking maintenance module 900 displays the information on required maintenance through the comprehensive display module 700.

To sum up, the embodiment of the present application provides a robot scheduling system, a method, a replacement method, an apparatus, a device and a medium, wherein the robot scheduling system utilizes the scheduling center module 300 to plan a plurality of subtasks according to the user planning information, and allocates the corresponding subtasks to the robot model 400, so that the robot model 400 controls the robot 500 to perform the operation motion, and in the process that the robot 500 gradually completes the subtasks, the scheduling center module 300 gives reward information to the robot model 400, and the robot model 400 obtains a replacement probability according to the reward information, so that the robot model 400 performs replacement according to the replacement probability, so that the robot model 400 has enough excellent samples to drive the robot model 400 to gradually improve and evolve when the robot 500 smoothly completes the same or similar operation motions.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.

The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A robotic arm scheduling system for scheduling a population of robotic arms to complete a production job, the system comprising:

the mechanical arm model is stored in a mechanical arm and used for controlling the corresponding mechanical arm to perform operation movement according to the subtasks;

2. The system according to claim 1, wherein the dispatch center module is configured to assign a replacement probability to the robot model according to the total sum of the reward information obtained by each robot model, so that the robot model is replaced with the robot model with the highest total sum of the reward information according to the replacement probability.

3. The system of claim 2, wherein the replacement probability is inversely related to a sum of reward information obtained by the robot model.

4. The system according to claim 1, wherein the dispatching center module is configured to assign a replacement probability to the robot model every time the robot completes a preset number of the subtasks, so that the robot model is replaced according to the replacement probability.

5. The system according to claim 4, wherein the dispatching center module is configured to assign a replacement probability to the robot model when each robot completes the same number of subtasks, so that the robot model is replaced according to the replacement probability.

6. A mechanical arm scheduling method is used for scheduling mechanical arm groups to complete production operation, and is characterized by comprising the following steps:

7. A robot model replacement method for driving a robot model to perform evolutionary replacement in a production operation, the replacement method comprising the steps of:

8. A robot model replacing apparatus for driving robot models to evolve and replace in production operation, the replacing apparatus comprising:

the reward module is used for endowing corresponding reward information to the mechanical arm model according to the condition that each mechanical arm completes the subtasks;

9. An electronic device comprising a processor and a memory, the memory storing computer readable instructions which, when executed by the processor, perform the steps of the method of claim 7.

10. A storage medium having a computer program stored thereon, wherein the computer program when executed by a processor performs the steps of the method of claim 7.