CN112395124A - Robot abnormity control method and device in cluster environment

Publication number
CN112395124A
Authority
CN
China
Prior art keywords
robot
task
cluster environment
managing
abnormal
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011284484.9A
Other languages
Chinese (zh)
Other versions
CN112395124B (en)
Inventor
陈艺辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Application filed by China Construction Bank Corp
Priority to CN202011284484.9A
Publication of CN112395124A
Application granted
Publication of CN112395124B
Status: Active

Classifications

    • G06F11/0793 Remedial or corrective actions
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G06F18/24 Classification techniques
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5061 Partitioning or combining of resources

Abstract

The invention provides a method and a device for managing and controlling robot abnormalities in a cluster environment. The method comprises: acquiring state information of a first robot in real time by a heartbeat detection method; judging from the state information whether the first robot is abnormal; and, when the first robot is abnormal, executing the task through a second robot. With a limited number of robots, the invention can improve robot utilization by handling abnormalities and monitoring machine state. It offers users an automatic scheme for handling robot abnormalities, which reduces manual intervention, lowers cost, and improves processing efficiency. After a robot becomes abnormal, the successor robot can continue execution from the point of interruption.

Description

Robot abnormity control method and device in cluster environment
Technical Field
The invention relates to the field of artificial intelligence, in particular to robotic process automation technology, and specifically to a method and a device for managing and controlling robot abnormalities in a cluster environment.
Background
RPA (Robotic Process Automation) is an automated software tool that uses and understands an enterprise's existing applications through the user interface, automates routine rule-based operations, and replaces people in executing highly regular, repetitive office processes in front of a computer. RPA has become very popular in recent years and demand for it is strong. More and more enterprises recognize its advantages: it can replace manual work in processing large numbers of tedious and complex transactions, which reduces labor costs, improves efficiency, and automates processes. RPA has a wide range of applications, such as clearing robots, financial robots, IT operation and maintenance robots, approval robots, customer service robots, and human resources robots.
In the prior art, RPA-related technologies focus on transaction processing. The related prior art covers only aspects such as monitoring robot state and data statistics, and does not address the automatic handling of robot abnormalities. The invention mainly provides users with an automatic solution when a robot becomes abnormal in a cluster environment.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a controllable method for managing robot abnormalities in a cluster environment. With a limited number of robots, it can improve robot utilization by handling abnormalities and monitoring machine state. It offers users an automatic scheme for handling robot abnormalities, which reduces manual intervention, lowers cost, and improves processing efficiency. After a robot becomes abnormal, the successor robot can continue execution from the point of interruption.
In order to solve the technical problems, the invention provides the following technical scheme:
In a first aspect, the present invention provides a method for managing and controlling robot abnormalities in a cluster environment, including:
acquiring state information of the first robot in real time according to a heartbeat detection method;
judging whether the first robot is abnormal or not according to the state information;
and when the first robot is abnormal, executing a task through a second robot.
In an embodiment, the determining whether the first robot is abnormal according to the state information includes:
judging the abnormal type of the first robot according to the network state of the first robot, the CPU processing speed and the number of tasks;
the exception types include: robot failure, network delay instability, CPU full load, and downtime.
In an embodiment, the method for managing and controlling robot anomalies in a cluster environment further includes:
splitting the task into a plurality of flows;
call chains are generated from the plurality of flows.
In one embodiment, the performing, by the second robot, a task when the first robot is abnormal includes:
when the first robot is abnormal, setting an identifier in the calling chain to record the task position where the abnormality occurs;
the second robot performs task synchronization with the first robot according to the identifier.
In one embodiment, task synchronization by the second robot with the first robot based on the identifier comprises:
sending the IP address and the secret key of the second robot to the first robot;
sending the IP address, the port, the interface name, the interface parameters corresponding to the synchronous task and the secret key of the first robot to the second robot;
the second robot generates a uniform resource locator according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronous task and the secret key;
and the second robot generates a synchronization request according to the uniform resource locator by using an RPC method and sends the synchronization request to the first robot.
In a second aspect, the present invention provides a device for managing and controlling robot anomalies in a cluster environment, where the device includes:
the state information acquisition unit is used for acquiring the state information of the first robot in real time according to the heartbeat detection method;
an abnormality occurrence judging unit for judging whether the first robot is abnormal or not according to the state information;
and the task execution unit is used for executing the task through the second robot.
In one embodiment, the abnormality occurrence determination unit is specifically configured to determine the type of abnormality occurrence of the first robot according to a network state, a CPU processing speed, and a number of tasks of the first robot;
the exception types include: robot failure, network delay instability, CPU full load, and downtime.
In one embodiment, the apparatus for managing and controlling robot anomalies in a cluster environment further includes:
the task splitting unit is used for splitting the task into a plurality of flows;
and the call chain generating unit is used for generating a call chain according to the plurality of flows.
In one embodiment, the task execution unit includes:
the identifier setting module is used for setting an identifier in the calling chain when the first robot is abnormal so as to record the task position where the abnormality occurs;
and the synchronization module is used for the second robot to perform task synchronization with the first robot according to the identifier.
In one embodiment, the synchronization module comprises:
the key sending module is used for sending the IP address and the key of the second robot to the first robot;
the interface name sending module is used for sending the IP address, the port, the interface name, the interface parameters corresponding to the synchronous task and the secret key of the first robot to the second robot;
the locator generating module is used for generating a uniform resource locator by the second robot according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronous task and the key;
and the synchronous request sending module is used for generating a synchronous request according to the uniform resource locator by the second robot by using an RPC method and sending the synchronous request to the first robot.
In a third aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements a method for managing and controlling robot exceptions in a cluster environment when executing the computer program.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of a method for managing robot anomalies in a cluster environment.
As can be seen from the above description, the present invention provides a method and an apparatus for managing and controlling robot abnormalities in a cluster environment. First, state information of a first robot is obtained in real time by a heartbeat detection method; then, whether the first robot is abnormal is judged from the state information; finally, when the first robot is abnormal, the task is executed through a second robot. The invention has the following beneficial effects: with a limited number of robots, robot utilization can be improved by handling abnormalities and monitoring machine state; users are offered an automatic scheme for handling robot abnormalities, which reduces manual intervention, lowers cost, and improves processing efficiency; and after an abnormality occurs, the successor robot can continue execution from the point of interruption.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a diagram of a prior art RPA architecture in an embodiment of the present invention;
FIG. 2 is a first flowchart of a method for managing and controlling robot abnormalities in a cluster environment according to an embodiment of the present invention;
FIG. 3 is a flowchart of handling by robot abnormality type according to an embodiment of the present invention;
FIG. 4 is a flowchart of step 200 in an embodiment of the present invention;
FIG. 5 is a second flowchart of a method for managing and controlling robot abnormalities in a cluster environment according to an embodiment of the present invention;
FIG. 6 is a flowchart of step 300 in an embodiment of the present invention;
FIG. 7 is a flowchart of step 302 in an embodiment of the present invention;
FIG. 8 is a schematic diagram of a robot task synchronization process according to an embodiment of the present invention;
FIG. 9 is a flowchart of a method for managing and controlling robot abnormalities in a cluster environment in a specific application example of the present invention;
FIG. 10 is a first schematic structural diagram of a robot abnormality management and control device in a cluster environment according to an embodiment of the present invention;
FIG. 11 is a second schematic structural diagram of a robot abnormality management and control device in a cluster environment according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a task execution unit according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a synchronization module according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, for security reasons, many internal systems of large enterprises, such as internal financial systems, are not open to the outside. RPA deployments therefore mostly adopt a client-server (CS) architecture of robot clients and robot servers, in which a plurality of robots correspond to one or more servers, and a relay server or other intermediary may be added in between to enable interaction (see fig. 1). In such a cluster, when one or more robots become abnormal (robot automation often runs in the middle of the night), any manual intervention that is required is very costly. Based on this, an embodiment of the present invention provides a specific implementation of a method for managing and controlling robot abnormalities in a cluster environment. Referring to fig. 2, the method specifically includes the following steps:
Step 100: acquiring the state information of the first robot in real time according to the heartbeat detection method.
Specifically, heartbeat packets are sent to detect whether the connection is normal. The server sends a short data packet to the robot at regular intervals and starts a thread that continuously checks for the robot's response; if no response is received from the robot within a certain time, the robot is considered disconnected. Similarly, if the robot does not receive the server's heartbeat packet within a certain time, it considers the connection unavailable.
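As an illustration only, the server-side timeout check of step 100 can be sketched in Python as follows. The class name `HeartbeatMonitor`, the `timeout_s` parameter, and the explicit `now` arguments are assumptions introduced for this sketch; the embodiment does not specify them, and a real deployment would send the packets over the network from a background thread.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HeartbeatMonitor:
    """Server-side record of the most recent heartbeat reply per robot."""

    timeout_s: float  # assumed: silence longer than this means "disconnected"
    last_reply: Dict[str, float] = field(default_factory=dict)

    def record_reply(self, robot_id: str, now: Optional[float] = None) -> None:
        # Called whenever a robot answers a heartbeat packet.
        self.last_reply[robot_id] = time.monotonic() if now is None else now

    def is_connected(self, robot_id: str, now: Optional[float] = None) -> bool:
        # A robot that never replied, or replied too long ago, is disconnected.
        now = time.monotonic() if now is None else now
        last = self.last_reply.get(robot_id)
        return last is not None and (now - last) <= self.timeout_s
```

The robot side can run the same check against the server's heartbeat packets, which corresponds to the symmetric case described above.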
Step 200: judging whether the first robot is abnormal according to the state information.
Specifically, the type of abnormality of the first robot is judged according to the first robot's network state, CPU processing speed, and number of tasks. The abnormality types include robot failure, network delay instability, CPU full load, and downtime. Different solutions are applied according to the abnormality type. Referring to fig. 3, when the abnormality type is robot failure, recovery is attempted first: the server sends an instruction to restart the robot, and if no reply is received within the preset time interval, the restart is considered to have failed, the robot is treated as unavailable, and a new robot is assigned to execute the flow. When the abnormality type is network delay instability or CPU full load, it is first determined whether a flag bit is needed. If so, a connection is established with the previous robot, and if the connection succeeds, execution continues from the flag bit; if the flag bit cannot be obtained this way, it is obtained through the server acting as a relay.
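The mapping from state information to an abnormality type is not spelled out in the embodiment. The following Python sketch shows one possible rule set under stated assumptions: the fields of `RobotState`, the thresholds `max_jitter_ms` and `max_cpu`, and the order of the checks are all hypothetical, and the task count is left out for brevity. Only the four abnormality types and the handling described for robot failure, network delay instability, and CPU full load come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Abnormality(Enum):
    ROBOT_FAILURE = "robot failure"
    NETWORK_UNSTABLE = "network delay instability"
    CPU_FULL_LOAD = "CPU full load"
    DOWNTIME = "downtime"


@dataclass
class RobotState:
    heartbeat_ok: bool         # did the robot answer the latest heartbeat?
    process_alive: bool        # assumed signal: is the RPA process responsive?
    latency_jitter_ms: float   # assumed measure of network state
    cpu_load: float            # assumed measure, 0.0 to 1.0


def classify(state: RobotState,
             max_jitter_ms: float = 200.0,
             max_cpu: float = 0.95) -> Optional[Abnormality]:
    """Return the abnormality type, or None if the robot looks healthy."""
    if not state.heartbeat_ok:
        return Abnormality.DOWNTIME
    if not state.process_alive:
        return Abnormality.ROBOT_FAILURE
    if state.latency_jitter_ms > max_jitter_ms:
        return Abnormality.NETWORK_UNSTABLE
    if state.cpu_load >= max_cpu:
        return Abnormality.CPU_FULL_LOAD
    return None


def plan_action(kind: Abnormality) -> str:
    """Summarize the handling described for each abnormality type."""
    if kind is Abnormality.ROBOT_FAILURE:
        return "restart the robot; on no reply, assign a new robot"
    if kind in (Abnormality.NETWORK_UNSTABLE, Abnormality.CPU_FULL_LOAD):
        return "hand over to a successor that resumes from the flag bit"
    # Assumption: the text gives no separate flow for downtime.
    return "assign a new robot"
```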
Step 300: when the first robot is abnormal, executing the task through a second robot.
The strategy for bringing in a new robot, that is, for selecting the second robot, is as follows. Within the cluster, if there are idle robots, one idle robot is selected at random. If there is no idle robot, the robot with the shortest task list is selected and the task is added to that robot's task list. If there is no idle robot and the robots' task lists are already full, the task first joins the server's task list and waits to be added later to a robot's task list or handed to an idle robot.
In addition, an idle robot can be selected as follows. Selection follows robot priority (priorities are initialized according to machine performance). The server maintains a list of robot execution states and calculates a state value for each robot from its execution state (network state, CPU processing speed, number of tasks, and so on). A state value below the threshold indicates that the robot is in good condition and can accept tasks; a state value above the threshold indicates that tasks are piling up and the robot's load should be reduced appropriately. The smaller the state value, the better the state, and the earlier the robot is given tasks.
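The selection strategy for the second robot can be sketched as follows. This is a minimal Python illustration, not the implementation of the embodiment: the `Robot` class, its `capacity` field (the assumed maximum length of a task list), and the `assign` function are hypothetical names, and the priority-based and state-value-based variants are omitted.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Robot:
    name: str
    tasks: List[str] = field(default_factory=list)
    capacity: int = 3  # assumed maximum task-list length


def assign(task: str, robots: List[Robot], server_queue: List[str],
           rng=random) -> Optional[Robot]:
    """Place a task on a robot, or park it on the server if all are full."""
    idle = [r for r in robots if not r.tasks]
    if idle:
        # Idle robots exist: pick one of them at random.
        chosen = rng.choice(idle)
    else:
        with_room = [r for r in robots if len(r.tasks) < r.capacity]
        if not with_room:
            # Every task list is full: the task waits in the server's list.
            server_queue.append(task)
            return None
        # No idle robot: pick the robot with the shortest task list.
        chosen = min(with_room, key=lambda r: len(r.tasks))
    chosen.tasks.append(task)
    return chosen
```

A task parked in `server_queue` would be handed out again once a robot becomes idle or gains room in its list; that retry loop is left out of the sketch.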
As can be seen from the above description, the present invention provides a method for managing and controlling robot abnormalities in a cluster environment. First, state information of a first robot is obtained in real time by a heartbeat detection method; then, whether the first robot is abnormal is judged from the state information; finally, when the first robot is abnormal, the task is executed through a second robot. The invention has the following beneficial effects: with a limited number of robots, robot utilization can be improved by handling abnormalities and monitoring machine state; users are offered an automatic scheme for handling robot abnormalities, which reduces manual intervention, lowers cost, and improves processing efficiency; and after an abnormality occurs, the successor robot can continue execution from the point of interruption.
In one embodiment, referring to fig. 4, step 200 specifically includes:
step 201: judging the abnormal type of the first robot according to the network state of the first robot, the CPU processing speed and the number of tasks;
wherein the exception types include: robot failure, network delay instability, CPU full load, and downtime.
In an embodiment, referring to fig. 5, the method for managing and controlling robot anomalies in a cluster environment further includes:
Step 400: splitting the task into a plurality of flows;
It is understood that a task is composed of a plurality of small flows or steps, so one task can be split into several small tasks (small flows).
Step 500: generating a call chain from the plurality of flows.
In one embodiment, referring to fig. 6, step 300 further comprises:
step 301: when the first robot is abnormal, setting an identifier in the calling chain to record the task position where the abnormality occurs;
step 302: the second robot performs task synchronization with the first robot according to the identifier.
In steps 301 and 302, when an abnormality occurs while a robot is executing a task, the robot marks the position in the call chain. If the successor robot needs to inherit the flag bit and continue the task from that position, the following strategy is adopted: when the successor robot starts to execute the inherited task, it determines whether a flag bit is needed; if so, it establishes a connection with the previous robot, and if the connection succeeds, execution continues from the flag bit; if the flag bit cannot be obtained this way, it is obtained through the server acting as a relay.
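To make the call chain and the flag bit concrete, the following Python sketch models a task as an ordered list of flows and records the index of the flow at which an abnormality occurred. The `CallChain` class and its methods are hypothetical; in the embodiment the flag is transferred to the second robot, whereas this sketch keeps it on one object for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CallChain:
    flows: List[Callable[[], None]]   # the task split into small flows
    flag: Optional[int] = None        # index of the flow where execution broke

    def run(self, start: int = 0) -> bool:
        """Execute flows from `start`; on failure, set the flag and stop."""
        for i in range(start, len(self.flows)):
            try:
                self.flows[i]()
            except Exception:
                self.flag = i  # record the task position of the abnormality
                return False
        self.flag = None
        return True

    def resume(self) -> bool:
        """Continue from the flagged flow (or from the start if no flag)."""
        return self.run(self.flag or 0)
```

Resuming reruns the flagged flow itself, so this sketch assumes each flow is safe to repeat; the text does not state whether the interrupted flow is retried or skipped.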
In one embodiment, referring to fig. 7, step 302 further comprises:
step 3021: sending the IP address and the secret key of the second robot to the first robot;
step 3022: sending the IP address, the port, the interface name, the interface parameters corresponding to the synchronous task and the secret key of the first robot to the second robot;
step 3023: the second robot generates a uniform resource locator according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronous task and the secret key;
step 3024: and the second robot generates a synchronization request according to the uniform resource locator by using an RPC method and sends the synchronization request to the first robot.
In steps 3021 to 3024, referring to fig. 8, when two robots need to be synchronized (A >> B), a unified message format is agreed upon. The server issues the IP, port, interface name, interface parameters corresponding to the synchronization task, and a random key to machine B, and sends the caller's IP and the random key to machine A for verification. After receiving these items, machine B assembles the URL and calls machine A via RPC. When machine A receives machine B's request, it verifies whether the IP is legitimate and uses the key to decrypt the parameters; if decryption fails, the request is considered illegitimate.
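The exchange in steps 3021 to 3024 can be sketched as follows. Note the substitution: the text describes decrypting the parameters with the key, while this sketch signs them with HMAC-SHA256 instead, because the Python standard library offers no symmetric cipher. The `http` scheme, the `sig` query parameter, and the function names `build_sync_url` and `verify_request` are assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_sync_url(ip: str, port: int, interface: str,
                   params: Dict[str, str], key: bytes) -> str:
    """Machine B: assemble the URL from the items issued by the server."""
    query = urlencode(sorted(params.items()))
    sig = hmac.new(key, query.encode(), hashlib.sha256).hexdigest()
    return f"http://{ip}:{port}/{interface}?{query}&sig={sig}"


def verify_request(url: str, caller_ip: str, allowed_ip: str,
                   key: bytes) -> bool:
    """Machine A: check the caller's IP, then check the key-based signature."""
    if caller_ip != allowed_ip:
        return False
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    sigs = [v for k, v in pairs if k == "sig"]
    if len(sigs) != 1:
        return False
    # Recompute the signature over the parameters without the signature.
    rest = urlencode(sorted((k, v) for k, v in pairs if k != "sig"))
    expected = hmac.new(key, rest.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sigs[0], expected)
```

Because the server hands the same random key to both machines, a request with a wrong key or altered parameters fails verification, which plays the role of the failed decryption described above.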
To further explain the scheme, the invention provides a specific application example of the method for managing and controlling the robot abnormality in the cluster environment, and the specific application example specifically includes the following contents, and refer to fig. 9.
S0: acquiring the state information of the first robot in real time.
S1: judging whether the first robot is abnormal according to the state information.
S2: when the first robot is abnormal, selecting a second robot.
It will be appreciated that each robot has its own task queue and preferentially takes tasks from it for execution. The server maintains a list of robot execution states and calculates a state value for each robot from its execution state (network state, CPU processing speed, number of tasks, and so on). A state value below the threshold indicates that the robot is in good condition and can accept tasks; a state value above the threshold indicates that tasks are piling up and the robot's load should be reduced appropriately. The smaller the state value, the better the state, and the earlier the robot is given tasks.
In addition, in the case of a robot failure, an attempt is made to restore the robot, and if the restoration succeeds, execution continues. When a new robot executes a task, it determines whether a flag bit is needed; if so, it obtains the flag bit information through the cluster, and if not, it executes the task directly. Within the cluster, robots can communicate directly; if two robots cannot reach each other, they communicate indirectly with the server acting as a relay.
As can be seen from the above description, the method for managing and controlling robot abnormalities in a cluster environment according to the present invention first needs to identify robot abnormalities, which include robot failure, robot downtime, network delay instability, CPU full load, and service abnormality. The robot reports its state to the server at regular intervals, and the server applies the processing scheme that matches the abnormality type indicated by the state. If the robot's state cannot be detected before the timeout expires, the robot is considered down; if its state is later detected again and remains continuously detectable for longer than a threshold period, the robot is considered recovered and tasks can again be allocated to it.
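The down and recovered decisions described above can be sketched as a small state tracker. The class `RecoveryTracker`, the parameters `timeout_s` and `stable_s`, and the intermediate status name "recovering" are assumptions of this Python sketch; the text only states that a robot is down after a timeout and recovered after being detected continuously for longer than a threshold period.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryTracker:
    timeout_s: float   # assumed: silence longer than this means "down"
    stable_s: float    # assumed: continuous reporting needed before reuse
    last_seen: Optional[float] = None
    up_since: Optional[float] = None

    def report(self, now: float) -> None:
        """Record one periodic state report from the robot."""
        if self.last_seen is None or now - self.last_seen > self.timeout_s:
            # First report, or first report after a gap: restart the streak.
            self.up_since = now
        self.last_seen = now

    def status(self, now: float) -> str:
        if self.last_seen is None or now - self.last_seen > self.timeout_s:
            return "down"
        if now - self.up_since >= self.stable_s:
            return "available"   # tasks can be allocated again
        return "recovering"      # detected again, but not yet long enough
```

In this sketch a newly registered robot also passes through "recovering" before it becomes "available"; whether that is desirable is a design choice the text leaves open.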
Based on the same inventive concept, the embodiment of the present application further provides a management and control device for robot exception in a cluster environment, which can be used to implement the method described in the foregoing embodiment, as described in the following embodiment. Because the principle of solving the problems of the management and control device for the robot anomaly in the cluster environment is similar to the management and control method for the robot anomaly in the cluster environment, the implementation of the management and control device for the robot anomaly in the cluster environment can be implemented by referring to the implementation of the management and control method for the robot anomaly in the cluster environment, and repeated parts are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
An embodiment of the present invention provides a specific implementation manner of a management and control device for robot anomalies in a cluster environment, which can implement a management and control method for robot anomalies in a cluster environment, and referring to fig. 10, the management and control device for robot anomalies in a cluster environment specifically includes the following contents:
a state information obtaining unit 10, configured to obtain state information of the first robot in real time according to a heartbeat detection method;
an abnormality occurrence determination unit 20, configured to determine whether the first robot has an abnormality according to the state information;
a task execution unit 30 for executing the task by the second robot.
In an embodiment, the abnormality occurrence determining unit 20 is specifically configured to determine the type of the abnormality occurrence of the first robot according to a network state, a CPU processing speed, and a number of tasks of the first robot;
the exception types include: robot failure, network delay instability, CPU full load, and downtime.
In an embodiment, referring to fig. 11, the apparatus for managing and controlling robot anomalies in a cluster environment further includes:
a task splitting unit 40, configured to split the task into a plurality of flows;
a call chain generating unit 50 for generating a call chain according to a plurality of flows.
In one embodiment, referring to fig. 12, the task execution unit 30 includes:
an identifier setting module 301, configured to set an identifier in the call chain when the first robot is abnormal, so as to record a task position where the abnormality occurs;
a synchronization module 302, configured to perform task synchronization with the first robot according to the identifier by the second robot.
In one embodiment, referring to fig. 13, the synchronization module 302 includes:
a key sending module 3021, configured to send the IP address of the second robot and a key to the first robot;
an interface name sending module 3022, configured to send the IP address, the port, the interface name, the interface parameter corresponding to the synchronization task, and the secret key of the first robot to the second robot;
a locator generating module 3023, configured to generate a uniform resource locator by the second robot according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronization task, and the secret key;
a synchronization request sending module 3024, configured to generate a synchronization request according to the uniform resource locator by using an RPC method, and send the synchronization request to the first robot.
As can be seen from the above description, the present invention provides a device for managing and controlling robot abnormalities in a cluster environment. First, state information of a first robot is obtained in real time by a heartbeat detection method; then, whether the first robot is abnormal is judged from the state information; finally, when the first robot is abnormal, the task is executed through a second robot. The invention has the following beneficial effects: with a limited number of robots, robot utilization can be improved by handling abnormalities and monitoring machine state; users are offered an automatic scheme for handling robot abnormalities, which reduces manual intervention, lowers cost, and improves processing efficiency; and after an abnormality occurs, the successor robot can continue execution from the point of interruption.
An embodiment of the present application further provides a specific implementation of an electronic device capable of implementing all steps of the method for managing and controlling robot anomalies in a cluster environment in the foregoing embodiments. Referring to fig. 14, the electronic device specifically includes:
a processor 1201, a memory 1202, a communication interface 1203, and a bus 1204;
the processor 1201, the memory 1202 and the communication interface 1203 communicate with each other through the bus 1204; the communication interface 1203 is configured to implement information transmission among related devices, such as a server-side device, an acquisition device, a client device, and the like.
The processor 1201 is configured to call the computer program in the memory 1202. When the processor executes the computer program, all the steps of the method for managing and controlling robot anomalies in a cluster environment in the foregoing embodiments are implemented; for example, the following steps are implemented:
step 100: acquiring state information of the first robot in real time according to a heartbeat detection method;
step 200: judging whether the first robot is abnormal or not according to the state information;
step 300: when the first robot is abnormal, executing a task through a second robot.
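Steps 100 to 300 can be illustrated with a minimal Python sketch. It is an illustration under simplifying assumptions rather than the claimed implementation: a robot is treated as abnormal only when its heartbeat is missing or stale, whereas the method also considers network state, CPU processing speed, and number of tasks. The names `HeartbeatMonitor`, `timeout_s`, and `choose_executor` are hypothetical.

```python
import time
from typing import Callable, Dict


class HeartbeatMonitor:
    """Tracks the last heartbeat time of each robot."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the logic can be exercised without waiting
        self.last_seen: Dict[str, float] = {}

    def beat(self, robot_id: str) -> None:
        """Step 100: record the arrival time of a heartbeat."""
        self.last_seen[robot_id] = self.clock()

    def is_abnormal(self, robot_id: str) -> bool:
        """Step 200: treat a missing or stale heartbeat as an anomaly."""
        seen = self.last_seen.get(robot_id)
        return seen is None or self.clock() - seen > self.timeout_s


def choose_executor(monitor: HeartbeatMonitor, first: str, second: str) -> str:
    """Step 300: hand the task to the second robot when the first is abnormal."""
    return second if monitor.is_abnormal(first) else first
```

Passing the clock in as a parameter keeps the timeout logic deterministic, so the failover decision can be checked with a fake clock instead of real delays.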
An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements all the steps of the method for managing and controlling robot anomalies in a cluster environment in the foregoing embodiments. For example, the processor implements the following steps when executing the computer program:
step 100: acquiring state information of the first robot in real time according to a heartbeat detection method;
step 200: judging whether the first robot is abnormal or not according to the state information;
step 300: when the first robot is abnormal, executing a task through a second robot.
The embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the hardware-plus-program embodiment is described relatively briefly because it is substantially similar to the method embodiment; for the relevant points, refer to the corresponding description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Although the present application provides method steps as described in an embodiment or flowchart, additional or fewer steps may be included based on conventional or non-inventive efforts. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or client product executes, it may execute sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing) according to the embodiments or methods shown in the figures.
Although embodiments of the present description provide method steps as described in embodiments or flowcharts, more or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an actual apparatus or end product executes, it may execute sequentially or in parallel (e.g., parallel processors or multi-threaded environments, or even distributed data processing environments) according to the method shown in the embodiment or the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded.
For convenience of description, the above devices are described as being divided into various modules by function, and the modules are described separately. Of course, in implementing the embodiments of the present description, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing a given function may be implemented by a combination of multiple sub-modules or sub-units. The above-described apparatus embodiments are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The embodiments of this specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The described embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner; for parts that are the same or similar among the embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for the relevant points, refer to the corresponding description of the method embodiment. In the description herein, a reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, such uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.
The above description is only an example of the embodiments of the present disclosure, and is not intended to limit the embodiments of the present disclosure. Various modifications and variations to the embodiments described herein will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present specification should be included in the scope of the claims of the embodiments of the present specification.

Claims (12)

1. A method for managing and controlling robot anomalies in a cluster environment, characterized by comprising the following steps:
acquiring state information of the first robot in real time according to a heartbeat detection method;
judging whether the first robot is abnormal or not according to the state information;
and when the first robot is abnormal, executing a task through a second robot.
2. The method for managing and controlling robot abnormality in cluster environment according to claim 1, wherein said determining whether the first robot has abnormality according to the status information includes:
judging the abnormal type of the first robot according to the network state of the first robot, the CPU processing speed and the number of tasks;
the exception types include: robot failure, network delay instability, CPU full load, and downtime.
3. The method for managing and controlling robot anomalies in a cluster environment according to claim 1, further comprising:
splitting the task into a plurality of flows;
generating a call chain according to the plurality of flows.
4. The method for managing and controlling robot anomalies in a cluster environment according to claim 3, wherein when the first robot is anomalous, performing a task by a second robot includes:
when the first robot is abnormal, setting an identifier in the calling chain to record the task position where the abnormality occurs;
the second robot performs task synchronization with the first robot according to the identifier.
5. The method for managing robot exceptions in a cluster environment of claim 4, wherein task synchronization of the second robot with the first robot according to the identifier comprises:
sending the IP address and the secret key of the second robot to the first robot;
sending the IP address, the port, the interface name, the interface parameters corresponding to the synchronous task and the secret key of the first robot to the second robot;
the second robot generates a uniform resource locator according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronous task and the secret key;
and the second robot generates a synchronization request according to the uniform resource locator by using an RPC method and sends the synchronization request to the first robot.
6. A device for managing and controlling robot anomalies in a cluster environment, characterized by comprising:
the state information acquisition unit is used for acquiring the state information of the first robot in real time according to the heartbeat detection method;
an abnormality occurrence judging unit for judging whether the first robot is abnormal or not according to the state information;
and the task execution unit is used for executing the task through the second robot.
7. The apparatus for managing and controlling robot anomalies in a cluster environment of claim 6, wherein the anomaly occurrence determination unit is specifically configured to determine the type of anomaly occurrence for the first robot according to a network state, a CPU processing speed, and a number of tasks of the first robot;
the exception types include: robot failure, network delay instability, CPU full load, and downtime.
8. The apparatus for managing and controlling robot anomalies in a cluster environment of claim 6, further comprising:
the task splitting unit is used for splitting the task into a plurality of flows;
and the call chain generating unit is used for generating a call chain according to the plurality of flows.
9. The apparatus for managing robot exceptions in a cluster environment according to claim 8, wherein the task execution unit includes:
the identifier setting module is used for setting an identifier in the calling chain when the first robot is abnormal so as to record the task position where the abnormality occurs;
and the synchronization module is used for the second robot to perform task synchronization with the first robot according to the identifier.
10. The apparatus for managing robot exceptions in a cluster environment according to claim 9, wherein the synchronization module includes:
the key sending module is used for sending the IP address and the key of the second robot to the first robot;
the interface name sending module is used for sending the IP address, the port, the interface name, the interface parameters corresponding to the synchronous task and the secret key of the first robot to the second robot;
the locator generating module is used for generating a uniform resource locator by the second robot according to the IP address, the port, the interface name, the interface parameter corresponding to the synchronous task and the key;
and the synchronous request sending module is used for generating a synchronous request according to the uniform resource locator by the second robot by using an RPC method and sending the synchronous request to the first robot.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for managing robot exceptions in a cluster environment according to any one of claims 1 to 5 when executing the program.
12. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the steps of the method for managing robot anomalies in a cluster environment according to any one of claims 1 to 5.
CN202011284484.9A 2020-11-17 2020-11-17 Method and device for managing and controlling abnormality of robots in cluster environment Active CN112395124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011284484.9A CN112395124B (en) 2020-11-17 2020-11-17 Method and device for managing and controlling abnormality of robots in cluster environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011284484.9A CN112395124B (en) 2020-11-17 2020-11-17 Method and device for managing and controlling abnormality of robots in cluster environment

Publications (2)

Publication Number Publication Date
CN112395124A (en) 2021-02-23
CN112395124B (en) 2024-09-13

Family

ID=74600839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011284484.9A Active CN112395124B (en) 2020-11-17 2020-11-17 Method and device for managing and controlling abnormality of robots in cluster environment

Country Status (1)

Country Link
CN (1) CN112395124B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113568812A (en) * 2021-07-29 2021-10-29 北京奇艺世纪科技有限公司 State detection method and device for intelligent robot
CN114244890A (en) * 2021-12-22 2022-03-25 珠海金智维信息科技有限公司 RPA server cluster control method and system
CN114237196A (en) * 2021-11-15 2022-03-25 北京云迹科技股份有限公司 Split robot fault processing method and device, terminal equipment and medium
WO2023035755A1 (en) * 2021-09-10 2023-03-16 北京京东乾石科技有限公司 Task processing method, apparatus and system for multiple robots, and robot

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180077230A1 (en) * 2016-09-14 2018-03-15 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for switching between servers in server cluster
CN108388187A (en) * 2018-04-12 2018-08-10 广东水利电力职业技术学院(广东省水利电力技工学校) A kind of robot control system
CN108994840A (en) * 2018-08-23 2018-12-14 北京云迹科技有限公司 Failed machines people rescue skills and device
CN109048996A (en) * 2018-08-07 2018-12-21 北京云迹科技有限公司 robot abnormal state processing method and device
CN110308730A (en) * 2019-07-18 2019-10-08 滁州学院 A kind of multi-robot coordination control system
CN111385107A (en) * 2018-12-27 2020-07-07 大唐移动通信设备有限公司 Main/standby switching processing method and device for server
CN111796960A (en) * 2020-07-01 2020-10-20 中国建设银行股份有限公司 Method and system for automatically recovering robot equipment abnormity

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180077230A1 (en) * 2016-09-14 2018-03-15 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for switching between servers in server cluster
CN108388187A (en) * 2018-04-12 2018-08-10 广东水利电力职业技术学院(广东省水利电力技工学校) A kind of robot control system
CN109048996A (en) * 2018-08-07 2018-12-21 北京云迹科技有限公司 robot abnormal state processing method and device
CN108994840A (en) * 2018-08-23 2018-12-14 北京云迹科技有限公司 Failed machines people rescue skills and device
CN111385107A (en) * 2018-12-27 2020-07-07 大唐移动通信设备有限公司 Main/standby switching processing method and device for server
CN110308730A (en) * 2019-07-18 2019-10-08 滁州学院 A kind of multi-robot coordination control system
CN111796960A (en) * 2020-07-01 2020-10-20 中国建设银行股份有限公司 Method and system for automatically recovering robot equipment abnormity

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113568812A (en) * 2021-07-29 2021-10-29 北京奇艺世纪科技有限公司 State detection method and device for intelligent robot
CN113568812B (en) * 2021-07-29 2024-06-07 北京奇艺世纪科技有限公司 State detection method and device for intelligent robot
WO2023035755A1 (en) * 2021-09-10 2023-03-16 北京京东乾石科技有限公司 Task processing method, apparatus and system for multiple robots, and robot
CN114237196A (en) * 2021-11-15 2022-03-25 北京云迹科技股份有限公司 Split robot fault processing method and device, terminal equipment and medium
CN114244890A (en) * 2021-12-22 2022-03-25 珠海金智维信息科技有限公司 RPA server cluster control method and system

Also Published As

Publication number Publication date
CN112395124B (en) 2024-09-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant