WO2017041671A1 - Method and apparatus for recovering fault - Google Patents

Method and apparatus for recovering fault

Info

Publication number
WO2017041671A1
Authority
WO
WIPO (PCT)
Prior art keywords
recovery
node
failed
file
failure
Prior art date
Application number
PCT/CN2016/097957
Other languages
French (fr)
Chinese (zh)
Inventor
李龙
龚学文
胡琳
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2017041671A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation

Definitions

  • The present invention relates to the field of computers and, more particularly, to methods and apparatus for fault recovery.
  • In a computer system, the computer mainly relies on executing processes to realize its functions. When a process fails, it may affect the normal operation of the program or even of the entire computer system. Therefore, how to recover a failed process has become an urgent problem to be solved.
  • A known method for fault recovery periodically saves a recovery file that records the calculation state of a process while the process is running normally. When the node running the process fails and the process fails with it, the recovery node corresponding to that node recovers the process based on the saved recovery file.
  • Embodiments of the present invention provide a method and apparatus for fault recovery, which can improve the reliability of fault recovery.
  • A method for fault recovery is provided, comprising: determining the size of the recovery file corresponding to each of N failed processes, and determining the running state of each of M recovery nodes, where N ≥ 1 and M ≥ 2; determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node, where the running state includes a resource usage state; and controlling the recovery node corresponding to each failed process to perform fault recovery for that failed process.
  • The recovery file corresponding to a first failed process of the N failed processes is stored in at least two storage nodes.
  • The recovery files corresponding to the first failed process that are stored in each storage node are the same.
  • The recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in each storage node are different.
  • When N ≥ 2, determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node includes: determining, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
  • The recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  • Controlling the recovery node corresponding to each failed process includes: estimating the recovery time of each failed process according to the running state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process; and performing the control according to the recovery time of each failed process.
  • A device for fault recovery is provided, comprising: a determining unit, configured to determine the size of the recovery file corresponding to each of N failed processes and the running state of each of M recovery nodes, and to determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running states of the M recovery nodes, where the running state includes a resource usage state, N ≥ 1, and M ≥ 2; and a processing unit, configured to control the recovery node corresponding to each failed process to perform fault recovery for that failed process.
  • The recovery file corresponding to a first failed process of the N failed processes is stored in at least two storage nodes.
  • The recovery files corresponding to the first failed process that are stored in each storage node are the same.
  • The recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in each of the storage nodes are different.
  • When N ≥ 2, the determining unit is specifically configured to determine, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
  • The recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  • The processing unit is specifically configured to estimate the recovery time of each failed process according to the running state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process, and to perform the control according to the recovery time of each failed process.
  • According to the method and apparatus for fault recovery of the embodiments of the present invention, the recovery node that performs fault recovery for a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the running states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it can also ensure, to a certain extent, that the determined recovery node is able to perform fault recovery for the failed process, further improving the reliability of fault recovery.
  • FIG. 1 is a schematic flow chart of a method of fault recovery according to an embodiment of the present invention.
  • FIG. 2 is a schematic architectural diagram of a method of applying fault recovery in accordance with an embodiment of the present invention.
  • FIG. 3 is a schematic block diagram of an apparatus for fault recovery in accordance with an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for fault recovery according to an embodiment of the present invention.
  • the method and apparatus for fault recovery provided by the embodiments of the present invention can be applied to a computer, which includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer.
  • The hardware layer includes hardware such as a CPU, a memory management unit (MMU), and a memory (also referred to as main memory).
  • the operating system may be any one or more computer operating systems that implement business processing through processes, such as a Linux system, a Unix system, an Android system, an iOS system, or a Windows system.
  • the application layer includes applications such as browsers, contacts, word processing software, and instant messaging software. It should be understood that the above-listed computer devices are merely illustrative and the invention is not particularly limited.
  • FIG. 1 is a schematic flowchart of a method 100 for fault recovery according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes:
  • A failed process is a process whose operation (or processing) does not achieve the expected result. One example is a process whose operation is aborted because of a fault, so that it produces no result (which can also be regarded as a result that is not as expected). Another example is a process that appears to run successfully but whose result is not as expected.
  • S120: Determine, according to the size of the recovery file corresponding to each failed process and the running state of each recovery node, the recovery node corresponding to each failed process, where the running state includes a resource usage state.
  • The resource usage state includes hardware utilization on the recovery node, such as CPU utilization and/or memory utilization. For example, if the CPU usage of a recovery node is already high (the recovery node is already busy) and the recovery file corresponding to the failed process is large (the recovery needs to consume more resources), then that recovery node is not suitable as the recovery node for the failed process.
  • The running state may further include a communication state of the recovery node.
  • The communication state indicates whether the recovery node can communicate with other nodes. For example, if a recovery node cannot communicate with the storage node that stores the recovery file corresponding to failed process A, that recovery node cannot serve as the recovery node of failed process A.
  • FIG. 2 shows a schematic architectural diagram of a computer system 200 to which the method 100 is applied.
  • the computer system 200 includes a management node 210, a plurality of storage nodes 220, and a plurality of recovery nodes 230 and a plurality of computing nodes 240.
  • Each computing node 240 runs one or more processes.
  • Each computing node 240 is in communication with one or more storage nodes 220, such that the computing node 240 can transfer the recovery files of its running processes to the connected storage node 220 for backup.
  • The recovery file may be data recorded while its process is running in a normal state.
  • management node 210 is in communication with each of the computing nodes 240 such that the management node 210 can monitor the operational status of each of the processes running in each computing node 240.
  • management node 210 is communicatively coupled to each of the recovery nodes 230 such that the management node 210 can monitor the operational status of each of the recovery nodes 230 and send control instructions to the recovery node 230.
  • each recovery node 230 and each storage node 220 can be communicably connected, so that the recovery node 230 can obtain the recovery file from the storage node 220 when recovering the process.
  • The communication connection between each recovery node 230 and each storage node 220 described above is only an example. Alternatively, the management node 210 may be communicably connected to each storage node 220, and a recovery node 230 may obtain the recovery file from the storage node 220 through the management node 210.
  • a bus system 250 communicably connected to the nodes may be provided, so that the communication connection between the above-described nodes can be realized by the bus system 250.
  • One node may be an independently configured computer entity, multiple nodes may be configured in the same computer entity, or multiple computer entities may constitute one node; the present invention is not particularly limited in this respect.
  • The method 100 may be performed by the management node 210, which may be independent of the other nodes in the computer system (including the failed computing node 240, the storage node 220 that stores the recovery file of the failed process, and the recovery node 230 that performs fault recovery) and which is communicatively connected to each node to transmit information such as control instructions or data. Alternatively, the management node may be integrated with one or more other nodes in the computer system 200, and the method 100 may be executed by the node that provides those combined functions.
  • the method 100 implemented by the present invention is applied to a computer system including at least two recovery nodes, each of which is capable of providing computing resources (e.g., a central processing unit and memory, etc.) to enable recovery of the failed process.
  • management node can instruct the recovery node to perform recovery for the failed process.
  • The recovery file of a process, which records for example its execution state or calculation state, may be stored periodically; when the process fails, it rolls back to the most recently saved state and restarts from there.
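  • The save-and-roll-back idea can be illustrated with a minimal sketch. The class `CheckpointedCounter` and its in-memory "recovery file" are hypothetical stand-ins chosen for illustration; a real system would write the saved state to a storage node.

```python
import copy


class CheckpointedCounter:
    """Toy process whose calculation state is a step count and a running total."""

    def __init__(self):
        self.state = {"step": 0, "total": 0}
        # Hypothetical in-memory stand-in for the recovery file.
        self._saved = copy.deepcopy(self.state)

    def work(self, value):
        self.state["step"] += 1
        self.state["total"] += value

    def checkpoint(self):
        # Save the current state while the process is running normally.
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        # On failure, return to the most recently saved state.
        self.state = copy.deepcopy(self._saved)


proc = CheckpointedCounter()
proc.work(5)
proc.checkpoint()
proc.work(7)      # progress made after the last checkpoint
proc.rollback()   # simulated failure: the unsaved progress is discarded
```

  Work done after the last checkpoint is lost on rollback, which is why the saving interval trades recovery cost against checkpoint overhead.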
  • the recovery file in the embodiment of the present invention is used to record data when the process is running in a normal state, so as to perform fault recovery for each invalid process according to the content recorded by the recovery file.
  • The recovery file may be a checkpoint file generated based on a checkpoint technique; in other embodiments, the recovery file may be a log file generated based on a logging technique. It should be noted that the checkpoint file or the log file may be generated using the prior art; this falls outside the scope of the present invention and is therefore not described in detail herein.
  • The number of failed processes may be one or more, and the present invention is not particularly limited. When there are multiple failed processes, the processing for each failed process is similar. For ease of understanding and explanation, the following describes the method 100 in detail by taking the processing of failed process #A as an example.
  • The recovery file corresponding to a first failed process of the N failed processes is stored in at least two storage nodes.
  • That is, the recovery file corresponding to failed process #A may be stored in two or more storage nodes.
  • the storage nodes in which the recovery files of any two failed processes are stored may be the same or different, and the present invention is not particularly limited.
  • the above “identical” may include: identical, for example, the recovery text of the invalidation process #A
  • the pieces may be stored in the storage node # ⁇ and the storage node # ⁇
  • the recovery file of the invalidation process #B may be stored in the storage node # ⁇ and the storage node # ⁇ ; or, partially, for example, the recovery file of the invalidation process #C may be stored.
  • the recovery file of the invalidation process #D can be stored in the storage node # ⁇ and the storage node # ⁇ ).
  • the above “different” may include: completely different, for example, the recovery file of the invalidation process #A may be stored in the storage node # ⁇ and the storage node # ⁇ , and the recovery file of the invalidation process #B may be stored in the storage node # ⁇ and Storage node # ⁇ ; or, partially different, for example, the recovery file of the invalidation process #C may be stored in the storage node #n and the storage node # ⁇ , and the recovery file of the invalidation process #D may be stored in the storage node # ⁇ and the storage node # ⁇ ).
  • The number of storage nodes in which the recovery file of one failed process is stored, as enumerated above, is merely exemplary, and the present invention is not limited thereto. For example, the recovery file of a failed process may be stored in only one storage node.
  • the number of storage nodes in which the recovery files of the failed processes are stored may be the same or different, and the present invention is not particularly limited.
  • the recovery file corresponding to the invalidation process #A may be stored in a plurality of (at least two) storage nodes in the following manner.
  • the recovery files corresponding to the first invalidation process stored in each storage node are the same.
  • a complete recovery file corresponding to the invalidation process #A may be stored in a plurality of storage nodes (hereinafter, for ease of understanding and distinction, it is recorded as: recovery file #A).
  • Here, a "complete recovery file" means that fault recovery for failed process #A can be performed using only the recovery file #A stored in a single storage node.
  • According to the method for fault recovery of the present invention, because the recovery files corresponding to the first failed process that are stored in each storage node are the same (that is, each storage node stores a consistent, complete recovery file), the recovery file can still be obtained from the remaining storage nodes when one or more storage nodes fail, which further improves the reliability of fault recovery.
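  • A minimal sketch of reading a replicated recovery file, assuming each storage node is represented by a hypothetical fetch function that raises `ConnectionError` when the node is unavailable. The node names and functions are illustrative only.

```python
def fetch_recovery_file(process_id, replicas):
    """Try each replica in turn and return (node_name, data) from the first that works.

    replicas: list of (node_name, fetch_fn) pairs; fetch_fn(process_id) returns
    bytes or raises ConnectionError (an assumed failure signal).
    """
    errors = []
    for name, fetch in replicas:
        try:
            return name, fetch(process_id)
        except ConnectionError as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"no replica available for {process_id}: {errors}")


def failed_node(process_id):
    raise ConnectionError("storage node unreachable")


def healthy_node(process_id):
    return b"checkpoint-bytes-for-" + process_id.encode()


source, data = fetch_recovery_file(
    "A", [("storage-1", failed_node), ("storage-2", healthy_node)]
)
```

  Because every replica holds the complete file, any single surviving storage node is enough to complete the recovery.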
  • The recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in each storage node are different.
  • the recovery file #A can be divided into a plurality of sub-recovery files (for ease of understanding and distinction, it is recorded as: sub-recovery file #A 1 - sub-recovery file #A X ).
  • Sub-recovery file #A 1 to sub-recovery file #A X are stored across a plurality of storage nodes. One sub-recovery file may be stored in one storage node, or may be stored (replicated or further divided) in multiple storage nodes; the present invention is not particularly limited in this respect. Any two sub-recovery files are stored in different storage nodes; in other words, the sub-recovery files stored in each storage node are different.
  • Here, "different" may mean completely different. For example, for sub-recovery file #A 1 and sub-recovery file #A 2, sub-recovery file #A 1 may be stored in storage node #1 and storage node #2, and sub-recovery file #A 2 may be stored in storage node #3 and storage node #4. Alternatively, "different" may mean partially different. For example, for sub-recovery file #A 3 and sub-recovery file #A 4, sub-recovery file #A 3 may be stored in storage node #5 and storage node #6, and sub-recovery file #A 4 may be stored in storage node #6 and storage node #7.
  • According to the method for fault recovery provided by the embodiment of the present invention, by dividing the recovery file corresponding to the failed process into multiple sub-recovery files and storing each sub-recovery file in a different storage node, the sub-recovery files can be acquired from the plurality of storage nodes simultaneously during fault recovery, which reduces the time required to transmit the recovery file and improves the efficiency of fault recovery.
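  • The following sketch illustrates splitting a recovery file into sub-recovery files and fetching them concurrently. The helper names are hypothetical, and each storage node is modelled as a simple callable that returns its sub-file; real transfers would go over the network.

```python
from concurrent.futures import ThreadPoolExecutor


def split_recovery_file(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def fetch_all(stores):
    """Fetch every sub-file concurrently and reassemble them in order.

    stores: non-empty list of zero-argument callables, one per storage node.
    """
    with ThreadPoolExecutor(max_workers=len(stores)) as pool:
        # pool.map returns results in the same order as `stores`.
        chunks = list(pool.map(lambda fetch: fetch(), stores))
    return b"".join(chunks)


original = bytes(range(10))
sub_files = split_recovery_file(original, 3)
# One hypothetical storage node per sub-recovery file.
stores = [lambda chunk=chunk: chunk for chunk in sub_files]
restored = fetch_all(stores)
```

  The order of the sub-files must be preserved on reassembly, which is why the sketch relies on the ordered results of `pool.map`.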
  • The recovery file of a "failed process" is stored periodically while the process is still valid (that is, operating normally). In other words, in the embodiments of the present invention, each process running in the computer system periodically stores its recovery file while the process is active (operating normally).
  • The recovery file may also be stored in the computing node that runs the process and uploaded to the storage node before the failure; that is, the backup of the recovery file may also be performed periodically.
  • the storage methods of the recovery files listed above are merely exemplary, and the present invention is not limited thereto.
  • the recovery files of the processes may be uniformly stored in one storage node.
  • the method for fault processing implemented by the present invention may be executed when the management node determines that the process is invalid.
  • As a method for determining that a process has failed, for example, each process running on each node in the computer system may periodically send a heartbeat message to the management node; if the management node does not receive the heartbeat message of a process within a specified time, the process may be considered to have failed. It should be understood that the method listed above is merely exemplary, and the present invention is not limited thereto; the methods for determining process failure in the prior art all fall within the protection scope of the present invention.
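  • A minimal sketch of heartbeat-based failure detection on the management node. The class `HeartbeatMonitor`, the 3-second timeout, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are assumptions for illustration.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat of each process and flags those that went silent."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # process id -> time of the last heartbeat

    def heartbeat(self, process_id, now):
        self.last_seen[process_id] = now

    def failed_processes(self, now):
        # A process is treated as failed if no heartbeat arrived within `timeout`.
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.heartbeat("A", now=0.0)
monitor.heartbeat("B", now=0.0)
monitor.heartbeat("B", now=4.0)   # only process B keeps reporting
failed = monitor.failed_processes(now=5.0)
```

  Here process A is reported as failed because its last heartbeat is older than the timeout, while process B is not.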
  • The management node may determine the size of the recovery file corresponding to each failed process. For example, for failed process #A (an example of a failed process), the management node may acquire, from the storage device(s) corresponding to failed process #A, information indicating the size of recovery file #A (or of sub-recovery file #A 1 to sub-recovery file #A X), and determine the size of the recovery file corresponding to failed process #A based on that information.
  • Alternatively, each process may determine the size of its recovery file when generating it and send information indicating that size to the management node. The management node may store the information indexed by the identifier of the process, that is, it may maintain a mapping between the identifier of each process and the information indicating the size of that process's recovery file, so that when the management node finds that a process has failed, it can look up the size of the corresponding recovery file by the identifier of the process.
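  • The mapping between process identifiers and recovery file sizes can be sketched as a small index kept by the management node. The class `RecoveryFileIndex` and the byte values are hypothetical.

```python
class RecoveryFileIndex:
    """Maps a process identifier to the most recently reported recovery file size."""

    def __init__(self):
        self._sizes = {}

    def report(self, process_id, size_bytes):
        # Called whenever a process generates a new recovery file.
        self._sizes[process_id] = size_bytes

    def size_of(self, process_id):
        # Returns None when no size has been reported for this process.
        return self._sizes.get(process_id)


index = RecoveryFileIndex()
index.report("A", 4096)
index.report("B", 1024)
index.report("A", 8192)   # a newer checkpoint of process A replaces the old entry
size_a = index.size_of("A")
```

  Keeping only the latest size per process is a design choice of this sketch; it matches the case where recovery uses the most recent recovery file.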
  • the management node can determine the operational status of each recovery node.
  • The running state may include the resource usage state of the recovery node.
  • The recovery node may report load information, such as central processing unit (CPU) usage or memory usage, to the management node, either periodically or as instructed by the management node, so that the management node can determine the resource usage state based on the load information from the recovery node.
  • The running state may include the communication state of the recovery node.
  • The recovery node may report to the management node communication status information indicating the communication state between the recovery node and other nodes in the computer system (for example, normal communication, inability to communicate, or communication delay), whereby the management node can determine the communication state of the recovery node based on that information.
  • The storage nodes may constitute a storage network (or storage grid), and the storage network provides data (for example, recovery files) to each recovery node in the computer system through a unified external interface.
  • Each recovery node and the storage network can communicate through a message queue or the like, so that a recovery node does not need to know the specific address, for example the Internet Protocol (IP) address or media access control (MAC) address, of the storage node that stores the required recovery file.
  • The recovery node only needs to send the identifier of failed process #A, which is the object of fault recovery, to the storage network. The interface device of the storage network can then, according to a pre-stored mapping between the identifier of each process and the storage node where its recovery file is stored, find the storage node corresponding to the received identifier of failed process #A and obtain the recovery file #A corresponding to failed process #A.
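  • A minimal sketch of such a unified interface, assuming an in-memory model in which the interface device keeps the process-to-storage-node mapping. The class `StorageNetwork` and its method names are hypothetical, and the message-queue transport is omitted.

```python
class StorageNetwork:
    """Single entry point that hides which storage node holds a recovery file."""

    def __init__(self):
        self._location = {}  # process id -> storage node name
        self._nodes = {}     # storage node name -> {process id: recovery file bytes}

    def store(self, node, process_id, data):
        self._nodes.setdefault(node, {})[process_id] = data
        self._location[process_id] = node

    def get_recovery_file(self, process_id):
        # The caller supplies only the process identifier, never a node address.
        node = self._location[process_id]
        return self._nodes[node][process_id]


network = StorageNetwork()
network.store("storage-1", "A", b"state-of-A")
network.store("storage-2", "B", b"state-of-B")
recovered = network.get_recovery_file("A")
```

  A recovery node using this interface needs no IP or MAC address of any storage node, so storage nodes can be moved or replaced without changing the recovery nodes.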
  • the communication state can be a communication state between the recovery node and the interface device of the storage network.
  • For example, when a plurality of processes includes the above-mentioned failed process #A and a process #B, the recovery node that performs recovery for failed process #A needs to communicate with the node running process #B (i.e., the associated node of the recovery node).
  • the communication state may also be a communication state of the recovery node and the associated node.
  • In S120, the recovery node corresponding to each failed process is determined according to the size of the recovery file corresponding to each failed process and the running state of each recovery node.
  • A node whose current running state satisfies the operating conditions required by failed process #A can be selected as the recovery node corresponding to failed process #A.
  • For example, the computing or storage resources of the node can satisfy the processing requirements of failed process #A; that is, the idle resources of the node are sufficient to run failed process #A, for example the memory, CPU, and storage of the recovery node satisfy the operational requirements of process #A.
  • As another example, the communication state can satisfy the processing requirements of failed process #A; that is, the recovery node can communicate with the associated node of failed process #A, or the recovery node can communicate with the storage node of failed process #A to obtain the recovery file of failed process #A.
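  • The two kinds of condition (resources and communication) can be combined into a single suitability check. This is a sketch under assumptions: the field names, the units, and the idea of listing the nodes a failed process must reach are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryNode:
    name: str
    free_cpus: int
    free_memory_mb: int
    reachable: set = field(default_factory=set)   # nodes this node can talk to


@dataclass
class FailedProcess:
    pid: str
    cpus_needed: int
    memory_needed_mb: int
    must_reach: set = field(default_factory=set)  # storage and associated nodes


def can_recover(node, proc):
    """True when the node has enough idle resources and the required connectivity."""
    return (
        node.free_cpus >= proc.cpus_needed
        and node.free_memory_mb >= proc.memory_needed_mb
        and proc.must_reach <= node.reachable      # subset test
    )


node_busy = RecoveryNode("r1", free_cpus=1, free_memory_mb=512,
                         reachable={"storage-1"})
node_idle = RecoveryNode("r2", free_cpus=8, free_memory_mb=8192,
                         reachable={"storage-1", "node-B"})
proc_a = FailedProcess("A", cpus_needed=2, memory_needed_mb=2048,
                       must_reach={"storage-1", "node-B"})
suitable = [n.name for n in (node_busy, node_idle) if can_recover(n, proc_a)]
```

  In this example only the idle node qualifies, since the busy node lacks both the resources and the link to the associated node.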
  • The management node may determine the order in which the failed processes are handled according to the size of each recovery file. That is, determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node includes: determining, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
  • In other words, the recovery node corresponding to each failed process may be determined in order of recovery file size, with the recovery node for the failed process that has the largest recovery file determined first.
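  • The descending-order assignment can be sketched as a greedy loop. Two simplifications are assumptions of this sketch and not stated in the text: the running state of a recovery node is reduced to a single free-capacity number in the same unit as the recovery file size, and ties are resolved by picking the node with the most free capacity.

```python
def assign_recovery_nodes(file_sizes, free_capacity):
    """Assign each failed process to a recovery node, largest recovery file first.

    file_sizes:    process id -> recovery file size
    free_capacity: recovery node name -> free capacity (same unit as the sizes)
    Returns process id -> node name, or None when no node can take the process.
    """
    remaining = dict(free_capacity)
    assignment = {}
    for pid in sorted(file_sizes, key=file_sizes.get, reverse=True):
        size = file_sizes[pid]
        candidates = [n for n, cap in remaining.items() if cap >= size]
        if not candidates:
            assignment[pid] = None
            continue
        node = max(candidates, key=remaining.get)  # most free capacity
        remaining[node] -= size
        assignment[pid] = node
    return assignment


plan = assign_recovery_nodes(
    {"A": 40, "B": 90, "C": 60},
    {"r1": 100, "r2": 100},
)
```

  Handling the largest recovery file first means the most demanding process is placed while the nodes still have the most free capacity, which reduces the chance that it cannot be placed at all.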
  • Step 3 if i>K, then perform step 6, otherwise, perform step 4;
  • The management node may perform recovery processing on each failed process according to the recovery node corresponding to each failed process.
  • the management node may directly instruct the recovery node to obtain a recovery file of the invalidation process from the storage node, and perform recovery based on the recovery file.
  • The method and process by which the recovery node performs recovery based on the recovery file may be similar to the prior art, and a detailed description is omitted here to avoid redundancy.
  • The management node may estimate the recovery time according to the recovery node corresponding to each failed process and determine the recovery strategy according to the recovery time. That is, performing fault recovery for each failed process according to the recovery node corresponding to each failed process includes: estimating the recovery time of each failed process according to the running state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process, and performing fault recovery for each failed process according to its recovery time.
  • The management node can estimate the recovery time of the failed process in the following manner.
  • The management node can match failed processes to recovery nodes using multiple schemes (for example, preset schemes); that is, under each scheme the management node simulates placing each failed process on a recovery node and calculates the recovery time under that scheme.
  • the recovery node matched by the invalidation process #K needs to meet the requirement of the failure recovery process of the failure process #K for the CPU processing capability, that is, the above condition 1 can be expressed as the following formula (1)
  • the recovery node matched by the invalidation process #K needs to meet the memory recovery requirement of the failure process #K, that is, the above condition 2 can be expressed as the following formula (2)
  • is preset coefficients, determined by experiments, and v represents the number of failures.
  • the failure recovery of a failed process is only performed in one recovery node, for example, the CPU processing capability of the recovery node for performing failure recovery for the failed process #1 (for example, The number of virtual CPUs needs to meet the CPU processing capability requirements for failure recovery of failure process #1, and the recovery node that can be used for failure recovery for failure process #1 can provide memory recovery that needs to satisfy failure process #1. Memory requirements.
  • a and ⁇ are coefficients, which are determined by experiments.
  • the failure node recovery time of the jth placement scheme takes the maximum value of the recovery time of the failure process set on all nodes.
  • R j can be expressed by the following formula (5):
  • The management node may determine whether the estimated recovery time of each failed process satisfies the recovery time requirement of the process, and perform recovery processing according to the determination result. For example, if the estimated recovery time is less than or equal to the maximum recovery time allowed for the process, the management node may instruct the recovery node to recover the failed process. As another example, if the estimated recovery time is greater than the maximum recovery time allowed for the process, the management node may perform troubleshooting or similar processing on the original node that ran the failed process.
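  • A sketch of evaluating one placement scheme and acting on the estimate. Taking the scheme's recovery time as the maximum over all recovery nodes follows the description; the per-node time model (total recovery file size divided by a throughput figure) is purely an assumption standing in for the experimentally determined coefficients, and all names and numbers are hypothetical.

```python
def node_recovery_time(sizes_mb, throughput_mb_per_s):
    """Assumed model: a node restores its failed processes one after another."""
    return sum(sizes_mb) / throughput_mb_per_s


def scheme_recovery_time(scheme, sizes_mb, throughput):
    """Recovery time of a placement scheme: the slowest node determines it.

    scheme: recovery node name -> list of failed process ids placed on it
    """
    return max(
        node_recovery_time([sizes_mb[p] for p in pids], throughput[node])
        for node, pids in scheme.items()
    )


def choose_action(estimate_s, deadline_s):
    """Recover on the recovery node only if the estimate meets the requirement."""
    if estimate_s <= deadline_s:
        return "recover_on_recovery_node"
    return "troubleshoot_original_node"


sizes = {"A": 100, "B": 300, "C": 200}
throughput = {"r1": 100, "r2": 100}
scheme = {"r1": ["A", "B"], "r2": ["C"]}
estimate = scheme_recovery_time(scheme, sizes, throughput)
action = choose_action(estimate, deadline_s=5.0)
```

  Several candidate schemes can be compared by calling `scheme_recovery_time` on each and keeping the one with the smallest estimate.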
  • the recovery node corresponding to a failed process is different from the storage node corresponding to the same invalid process.
  • That is, the storage node corresponding to failed process #A is different from the recovery node corresponding to failed process #A. In other words, in the computer system of the embodiment of the present invention, the storage node used to store the recovery file can be independent of the recovery node used to perform the recovery, which facilitates maintenance and reduces the burden on each node.
  • the relationship between the storage node and the recovery node enumerated above is only an exemplary description, and the storage node and the recovery node corresponding to one failure process may also be the same node, and the present invention is not particularly limited.
  • According to the method for fault recovery of the embodiment of the present invention, the recovery node that performs fault recovery for a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the running states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it can ensure, to a certain extent, that the determined recovery node is able to perform fault recovery for the failed process, thereby further improving the reliability of fault recovery.
  • FIG. 3 shows a schematic block diagram of an apparatus 300 for fault recovery in accordance with an embodiment of the present invention.
  • the apparatus 300 includes:
  • the determining unit 310 is configured to determine the size of the recovery file corresponding to each of N failed processes and the operating state of each of M recovery nodes, and to determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the operating states of the M recovery nodes, wherein the operating state includes a resource usage state or a communication state, N ≥ 1, M ≥ 2;
  • the processing unit 320 is configured to perform control according to the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process at the recovery node corresponding to that failed process.
  • the recovery file corresponding to the first failed process of the N failed processes is stored in at least two storage nodes.
  • the recovery files corresponding to the first failed process that are stored in the respective storage nodes are the same.
  • the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
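  • the two storage layouts described above (identical copies on every storage node, or different sub-recovery files on different storage nodes) can be contrasted with a short sketch. The function names, the in-memory stand-in for storage nodes, and the assumption that sub-recovery files are reassembled by ordered concatenation are all illustrative choices, not details given in the text.

```python
from typing import Dict, List, Optional


def read_replicated(copies: Dict[str, Optional[bytes]]) -> bytes:
    """Each storage node holds the same recovery file; any readable copy will do.

    `copies` maps a hypothetical storage-node name to its stored bytes,
    or to None if that node cannot be read.
    """
    for data in copies.values():
        if data is not None:
            return data
    raise RuntimeError("no storage node holds a readable copy")


def read_split(parts: List[Optional[bytes]]) -> bytes:
    """Each storage node holds a different sub-recovery file.

    Assumption: the full recovery file is the ordered concatenation of
    the sub-recovery files, so every part must be readable.
    """
    if any(p is None for p in parts):
        raise RuntimeError("a sub-recovery file is missing")
    return b"".join(p for p in parts if p is not None)
```

  • under these assumptions, the replicated layout tolerates the loss of a storage node, while the split layout spreads the storage load but needs every part.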
  • the determining unit is specifically configured to determine the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process, according to the operating state of each recovery node.
  • the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  • the processing unit is specifically configured to estimate a recovery time of each failed process according to the operating state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process, and to perform fault recovery processing for each failed process according to its estimated recovery time.
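  • the text does not give a formula for the estimate, so the following is only one plausible model, stated under explicit assumptions: recovery time is the time to fetch the recovery file plus the time to replay it, and the replay rate scales with the idle fraction of the CPU. `NodeState`, its fields, and `estimate_recovery_s` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    cpu_utilization: float  # fraction of CPU already in use, 0.0 to 1.0
    read_mb_per_s: float    # assumed rate for fetching the recovery file
    replay_mb_per_s: float  # assumed rate for replaying it on an idle CPU


def estimate_recovery_s(file_mb: float, node: NodeState) -> float:
    """Estimate recovery time from file size and the node's operating state."""
    idle = 1.0 - node.cpu_utilization
    if idle <= 0:
        # A fully loaded node is treated as unable to finish in finite time.
        return float("inf")
    fetch_s = file_mb / node.read_mb_per_s
    replay_s = file_mb / (node.replay_mb_per_s * idle)
    return fetch_s + replay_s
```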
  • the apparatus 300 for fault recovery may correspond to the execution subject of the method of the embodiment of the present invention, for example, a management node, and each unit (that is, module) in the apparatus 300 and the other operations and/or functions described above are respectively intended to implement the corresponding procedures of the method 100 in FIG. 1. For the sake of brevity, they are not described again here.
  • the recovery node that recovers the failed process is determined from the at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the operating state of the at least two recovery nodes. Compared with having only one recovery node, the reliability is higher, and at the same time it can be ensured to a certain extent that the determined recovery node is able to recover the failed process, thereby further improving the reliability of the fault recovery.
  • FIG. 4 shows a schematic block diagram of a device 400 for fault recovery in accordance with an embodiment of the present invention.
  • the device 400 includes:
  • a processor 420 connected to the bus system 410;
  • a memory 430 connected to the bus system 410;
  • the processor, by using the bus, invokes a program stored in the memory to determine the size of the recovery file corresponding to each of N failed processes and to determine the operating state of each of M recovery nodes, wherein N ≥ 1, M ≥ 2; and to perform control according to the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process at the recovery node corresponding to that failed process.
  • the recovery file corresponding to the first failed process of the N failed processes is stored in at least two storage nodes corresponding to the first failed process.
  • the recovery files corresponding to the first failed process that are stored in the respective storage nodes are the same.
  • the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
  • the processor is specifically configured to determine the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process, according to the operating state of each recovery node.
  • the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  • the processor is specifically configured to estimate a recovery time of each failed process according to the operating state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process, and to perform fault recovery processing for each failed process according to the recovery time of each failed process.
  • the processor can also be referred to as a CPU.
  • the memory can include read only memory and random access memory and provides instructions and data to the processor.
  • a portion of the memory may also include a Non-Volatile Random Access Memory (NVRAM).
  • the device 400 may be an embedded device or may be a computer device.
  • the bus includes a power bus, a control bus, and a status signal bus in addition to the data bus. However, for the sake of clarity, various buses are labeled as bus system 410 in the figure.
  • the decoder in a specific different product may be integrated with the processing unit.
  • the processor may implement or perform the steps and logic blocks disclosed in the method embodiments of the present invention.
  • the general purpose processor may be a microprocessor, or the processor may be any conventional processor, decoder, or the like.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented by the hardware processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the processor 420 may be a central processing unit (CPU), and the processor 420 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.
  • the general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 430 can include read only memory and random access memory and provides instructions and data to the processor 420. A portion of the memory 430 may also include a non-volatile random access memory. For example, the memory 430 can also store information of the device type.
  • the bus system 410 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for clarity of description, various buses are labeled as bus system 410 in the figure. It should be noted that, in the embodiment of the present invention, "connected to the bus system 410" may include direct connection or indirect connection.
  • each step of the above method may be completed by an integrated logic circuit of hardware in the processor 420 or an instruction in a form of software.
  • the steps of the method disclosed in the embodiments of the present invention may be directly implemented as a hardware processor, or may be performed by a combination of hardware and software modules in the processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 430, and the processor 420 reads the information in the memory 430 and completes the steps of the above method in combination with its hardware. To avoid repetition, it will not be described in detail here.
  • the device 400 for fault recovery may correspond to the execution body (e.g., a management node) of the method of the embodiment of the present invention, and each unit (that is, module) in the device 400 and the other operations and/or functions described above are respectively intended to implement the corresponding procedures of the method 100 in FIG. 1. For the sake of brevity, they are not described again here.
  • the fault recovery device determines, according to the size of the recovery file corresponding to the failed process and the operating state of the at least two recovery nodes, the recovery node that recovers the failed process from the at least two recovery nodes. Compared with having only one recovery node, the reliability is higher, and at the same time it can be ensured to a certain extent that the determined recovery node is able to recover the failed process, thereby further improving the reliability of the fault recovery.
  • the sequence numbers of the above processes do not imply an order of execution; the order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.

Abstract

Provided are a method and apparatus for recovering a fault, which can improve the reliability of fault recovery. The method comprises: determining the size of a recovery file corresponding to each failure process in N failure processes, and determining a running state of each recovery node in M recovery nodes, where N ≥ 1 and M ≥ 2; according to the size of the recovery file corresponding to each failure process and the running state of each recovery node, determining a recovery node corresponding to each failure process, wherein the running state comprises a resource usage state or a communication state; and performing control according to the recovery node corresponding to each failure process, so as to perform fault recovery on each failure process at the recovery node corresponding to each failure process.

Description

Method and Apparatus for Fault Recovery
This application claims priority to Chinese Patent Application No. 201510573922.6, filed with the Chinese Patent Office on September 10, 2015 and entitled "Method and Apparatus for Fault Recovery", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computers and, more particularly, to a method and apparatus for fault recovery.
Background
With the rapid development of computer technology, more and more industries are using computer technology to improve their working efficiency.
In a computer system, the computer mainly relies on executing processes to realize its functions. When a process develops a fault or even fails, the normal operation of the program and even of the entire computer system may be affected. Therefore, how to implement fault recovery of a process has become an urgent problem to be solved.
At present, a method for fault recovery is known in which a recovery file recording the computation state of a process when the process is normal is periodically backed up. When the node running the process fails and thereby causes the process to fail, the process is recovered, at the recovery node corresponding to that node, based on the saved recovery file.
However, when the recovery node corresponding to that node also fails, the process cannot be recovered, which seriously affects the reliability of fault recovery.
Summary
Embodiments of the present invention provide a method and apparatus for fault recovery, which can improve the reliability of fault recovery.
In a first aspect, a method for fault recovery is provided. The method includes: determining the size of a recovery file corresponding to each of N failed processes, and determining the operating state of each of M recovery nodes, where N ≥ 1 and M ≥ 2; determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the operating state of each recovery node, where the operating state includes a resource usage state; and controlling the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process at the recovery node corresponding to that failed process.
With reference to the first aspect, in a first implementation of the first aspect, the recovery file corresponding to a first failed process of the N failed processes is stored in at least two storage nodes.
With reference to the first aspect and the foregoing implementation, in a second implementation of the first aspect, the recovery files corresponding to the first failed process that are stored in the respective storage nodes are the same.
With reference to the first aspect and the foregoing implementations, in a third implementation of the first aspect, the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
With reference to the first aspect and the foregoing implementations, in a fourth implementation of the first aspect, when N ≥ 2, determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the operating state of each recovery node includes: determining the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process, according to the operating state of each recovery node.
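The descending-order assignment in the fourth implementation can be sketched as a simple greedy loop. This is an illustration under stated assumptions rather than the claimed procedure: each recovery node's operating state is reduced to a single hypothetical "free capacity" number in the same unit as the file size, and the node with the most remaining capacity is chosen at each step.

```python
from typing import Dict


def assign_recovery_nodes(file_sizes: Dict[str, int],
                          free_capacity: Dict[str, int]) -> Dict[str, str]:
    """Assign each failed process to a recovery node, largest recovery file first."""
    remaining = dict(free_capacity)  # work on a copy of the node states
    assignment: Dict[str, str] = {}
    # Visit failed processes in descending order of recovery file size.
    for proc in sorted(file_sizes, key=lambda p: file_sizes[p], reverse=True):
        # Assumed policy: pick the node with the most capacity left right now.
        node = max(remaining, key=lambda n: remaining[n])
        if remaining[node] < file_sizes[proc]:
            raise RuntimeError(f"no recovery node can host {proc}")
        assignment[proc] = node
        # Update the node's state so later choices see the new load.
        remaining[node] -= file_sizes[proc]
    return assignment
```

Placing the largest recovery files first means the hardest-to-place processes are handled while the recovery nodes still have the most headroom.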
With reference to the first aspect and the foregoing implementations, in a fifth implementation of the first aspect, the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
With reference to the first aspect and the foregoing implementations, in a sixth implementation of the first aspect, the controlling according to the recovery node corresponding to each failed process includes: estimating the recovery time of each failed process according to the operating state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process; and performing control according to the recovery time of each failed process.
In a second aspect, an apparatus for fault recovery is provided. The apparatus includes: a determining unit, configured to determine the size of a recovery file corresponding to each of N failed processes and the operating state of each of M recovery nodes, and to determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the operating states of the M recovery nodes, where the operating state includes a resource usage state, N ≥ 1, and M ≥ 2; and a processing unit, configured to control the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process at the recovery node corresponding to that failed process.
With reference to the second aspect, in a first implementation of the second aspect, the recovery file corresponding to a first failed process of the N failed processes is stored in at least two storage nodes.
With reference to the second aspect and the foregoing implementation, in a second implementation of the second aspect, the recovery files corresponding to the first failed process that are stored in the respective storage nodes are the same.
With reference to the second aspect and the foregoing implementations, in a third implementation of the second aspect, the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
With reference to the second aspect and the foregoing implementations, in a fourth implementation of the second aspect, when N ≥ 2, the determining unit is specifically configured to determine the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process, according to the operating state of each recovery node.
With reference to the second aspect and the foregoing implementations, in a fifth implementation of the second aspect, the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
With reference to the second aspect and the foregoing implementations, in a sixth implementation of the second aspect, the processing unit is specifically configured to estimate the recovery time of each failed process according to the operating state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process, and to perform control according to the recovery time of each failed process.
It can be seen that, according to the method for fault recovery of the embodiments of the present invention, the recovery node that performs fault recovery on a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the operating states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it can ensure to a certain extent that the determined recovery node is able to recover the failed process, thereby further improving the reliability of fault recovery.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for fault recovery according to an embodiment of the present invention.
FIG. 2 is a schematic architectural diagram of a system to which the method for fault recovery according to an embodiment of the present invention is applicable.
FIG. 3 is a schematic block diagram of an apparatus for fault recovery according to an embodiment of the present invention.
FIG. 4 is a schematic structural diagram of a device for fault recovery according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The method and apparatus for fault recovery provided by the embodiments of the present invention can be applied to a computer that includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer. The hardware layer includes hardware such as a CPU, a memory management unit (MMU), and main memory (also referred to as storage). The operating system may be any one or more computer operating systems that implement business processing through processes, for example, a Linux system, a Unix system, an Android system, an iOS system, or a Windows system. The application layer includes applications such as a browser, an address book, word processing software, and instant messaging software. It should be understood that the computer devices listed above are merely examples, and the present invention is not particularly limited in this respect.
FIG. 1 is a schematic flowchart of a method 100 for fault recovery according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes:
S110: determine the size of the recovery file corresponding to each of N failed processes, and determine the operating state of each of M recovery nodes, where N ≥ 1 and M ≥ 2;
It should be noted that, by way of example and not limitation, in the embodiments of the present invention a failed process is a process whose running (or processing) result fails to meet expectations. One example is a process that is aborted by a fault while running, so that it has no running result (which can also be regarded as the running result not meeting expectations). Another example is a process that appears to finish successfully but whose running result does not meet expectations.
S120: determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the operating state of each recovery node, where the operating state includes a resource usage state;
It should be noted that, by way of example and not limitation, the resource usage state includes the hardware utilization of the recovery node, for example, CPU utilization and/or memory utilization. For example, if the CPU utilization of a recovery node is already high, that is, the recovery node is already busy, and the recovery file corresponding to the failed process is also large, that is, recovering the failed process would consume considerable resources, then that recovery node is not suitable as the recovery node for the failed process.
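The suitability rule in this example can be written as a small predicate. The thresholds below (80% CPU utilization as "busy", 1024 MB as "large") are arbitrary values assumed for illustration; the text gives no numbers, and `is_suitable` is a hypothetical name.

```python
def is_suitable(cpu_utilization: float, file_mb: float,
                busy_threshold: float = 0.8,
                large_file_mb: float = 1024.0) -> bool:
    """Return False when a busy node is paired with a large recovery file."""
    busy = cpu_utilization >= busy_threshold   # the node is already heavily loaded
    large = file_mb >= large_file_mb           # recovery would consume many resources
    return not (busy and large)
```

A busy node may still be acceptable for a small recovery file under this rule, which is a design choice of the sketch rather than something the text states.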
Optionally, in other embodiments, the operating state may further include the communication state of the recovery node. Specifically, the communication state indicates whether the recovery node can or cannot communicate with other nodes. For example, if a recovery node cannot communicate with the storage node that stores the recovery file corresponding to a failed process A, that recovery node cannot serve as the recovery node for the failed process A.
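A communication-state check of this kind can be sketched as a filter over candidate recovery nodes. The `links` map and the function name are hypothetical, and the sketch assumes that reaching any one storage node holding the recovery file is sufficient, which holds for the replicated layout but not necessarily for split sub-recovery files.

```python
from typing import Dict, List, Set


def reachable_candidates(recovery_nodes: List[str],
                         storage_nodes: Set[str],
                         links: Dict[str, Set[str]]) -> List[str]:
    """Keep recovery nodes that can communicate with a storage node holding the file.

    `links` maps each recovery node to the set of storage nodes it can reach.
    """
    return [r for r in recovery_nodes if links.get(r, set()) & storage_nodes]
```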
S130,对每个失效进程所对应的恢复节点进行控制,以在每个失效进程所对应的恢复节点,对每个失效进程进行故障恢复。S130. Control a recovery node corresponding to each failed process to perform fault recovery for each failed process in the recovery node corresponding to each failed process.
图2示出了适用该方法100的计算机系统200的示意性架构图,如图2所示,计算机系统200包括管理节点210、多个存储节点220和多个恢复节点230和多个计算节点240,为了便于理解和说明,图2中仅示出一个计算节点240。2 shows a schematic architectural diagram of a computer system 200 to which the method 100 is applied. As shown in FIG. 2, the computer system 200 includes a management node 210, a plurality of storage nodes 220, and a plurality of recovery nodes 230 and a plurality of computing nodes 240. For ease of understanding and illustration, only one computing node 240 is shown in FIG.
其中,各计算节点240运行有一个或多个进程。Each computing node 240 runs one or more processes.
并且,各计算节点240与一个或多个存储节点220通信连接,从而计算节点240能够将所运行的各进程的恢复文件传输至所连接的存储节点220进行备份。作为实例而非限定,在本发明实施例中,恢复文件可以是其进程在正常状态下运行时的数据。Moreover, each computing node 240 is in communication with one or more storage nodes 220 such that the computing node 240 can transfer the recovered files of the running processes to the connected storage node 220 for backup. By way of example and not limitation, in an embodiment of the invention, the recovery file may be data when its process is running in a normal state.
另外,管理节点210与各计算节点240通信连接,从而管理节点210能够监控各计算节点240中所运行的各进程的运行状态。In addition, the management node 210 is in communication with each of the computing nodes 240 such that the management node 210 can monitor the operational status of each of the processes running in each computing node 240.
此外,管理节点210与各恢复节点230通信连接,从而管理节点210能够监控各恢复节点230的运行状态,并向恢复节点230发送控制指令。Further, the management node 210 is communicatively coupled to each of the recovery nodes 230 such that the management node 210 can monitor the operational status of each of the recovery nodes 230 and send control instructions to the recovery node 230.
可选地,在本发明实施例中,各恢复节点230与各存储节点220可以通信连接,从而,恢复节点230在对进程进行恢复时,能够从存储节点220中获取恢复文件。Optionally, in the embodiment of the present invention, each recovery node 230 and each storage node 220 can be communicably connected, so that the recovery node 230 can obtain the recovery file from the storage node 220 when recovering the process.
应理解,以上列举的各恢复节点230与各存储节点220的连接关系仅为实例性说明,例如,也可以使管理节点210与各存储节点220通信连接,并使恢复节点230通过管理节点210从存储节点220中获取恢复文件。It should be understood that the connection relationship between each of the recovery nodes 230 and the storage nodes 220 enumerated above is only an example. For example, the management node 210 may be communicably connected to each storage node 220, and the recovery node 230 may be passed through the management node 210. The recovery file is obtained in the storage node 220.
另外,在计算机系统200中,可以设置有与个节点通信连接的总线系统250,从而,能够通过总线系统250实现上述各节点之间的通信连接。Further, in the computer system 200, a bus system 250 communicably connected to the nodes may be provided, so that the communication connection between the above-described nodes can be realized by the bus system 250.
在本发明实施例中,一个节点可以是一台独立配置的计算机实体,或者,也可以多个节点配置在同一计算机实体,或者,也可以由多个计算机实体构成一个节点,本发明并未特别限定。In the embodiment of the present invention, one node may be an independently configured computer entity, or multiple nodes may be configured in the same computer entity, or multiple computer entities may constitute one node, and the present invention is not special. limited.
在本发明实施例中,该方法100可以由管理节点210执行,该管理节点210可以独立于计算机系统中的各节点(包括发生故障的计算节点240、用于存储失效进程的恢复文件的存储节点220和用于对失效进行故障恢复的恢复节点230)。并且,该管理节点与各节点通信连接,以传输控制指令或数据等信息;或者,该方法100也可以由该计算机系统200中集成有该管理节点 的功能的一个或多个其他节点执行。In the embodiment of the present invention, the method 100 may be performed by the management node 210, which may be independent of each node in the computer system (including the failed computing node 240, the storage node for storing the recovery file of the failed process). 220 and recovery node 230 for failover failure. And, the management node is in communication connection with each node to transmit information such as control instructions or data; or the method 100 may be integrated with the management node in the computer system 200. The function of one or more other nodes is executed.
Moreover, the method 100 in this embodiment of the present invention is applied to a computer system that includes at least two recovery nodes. Each recovery node can provide computing resources (for example, a central processing unit and memory), so that a failed process can be recovered.
In addition, the management node may instruct a recovery node to perform recovery for a failed process.
In the following, for ease of understanding and description, the processing of the method 100 is described in detail with the management node as the entity that performs the method 100.
Specifically, in this embodiment of the present invention, to prevent a process failure from affecting service delivery, a recovery file of the process (for example, its execution state or computation state) may be stored periodically. When the process fails, execution rolls back to a previously saved state and restarts from there.
The recovery file in this embodiment of the present invention records data of a process while it is running normally, so that fault recovery can be performed on each failed process according to the content recorded in the recovery file. In some embodiments, the recovery file may be a checkpoint file generated by using a checkpoint technique; in other embodiments, the recovery file may be a log file generated by using a logging technique. It should be noted that checkpoint files or log files may be generated by using existing techniques, which are outside the scope of the present invention and are therefore not described in detail here.
It should be understood that the forms of the recovery file and the recorded content listed above are merely examples, and the present invention is not limited thereto. Any other specific implementation that can provide the function of the recovery file in the embodiments of the present invention falls within the protection scope of the present invention.
In this embodiment of the present invention, there may be one failed process or multiple failed processes; this is not specifically limited in the present invention. When there are multiple failed processes, the processing for each failed process is similar. In the following, for ease of understanding and description, the processing of the method 100 is described in detail by using the processing for failed process #A as an example.
Optionally, the recovery file corresponding to a first failed process among the N failed processes is stored in at least two storage nodes.
Specifically, in this embodiment of the present invention, for failed process #A (that is, an example of the first failed process), the corresponding recovery file may be stored in two or more storage nodes corresponding to failed process #A.
Here, when there are multiple failed processes, the storage nodes that store the recovery files of any two failed processes may be the same or different; this is not specifically limited in the present invention.
In addition, "the same" above may mean completely the same: for example, the recovery file of failed process #A may be stored in storage node #α and storage node #β, and the recovery file of failed process #B may also be stored in storage node #α and storage node #β. It may also mean partially the same: for example, the recovery file of failed process #C may be stored in storage node #η and storage node #θ, and the recovery file of failed process #D may be stored in storage node #θ and storage node #λ.
Similarly, "different" above may mean completely different: for example, the recovery file of failed process #A may be stored in storage node #α and storage node #β, and the recovery file of failed process #B may be stored in storage node #γ and storage node #δ. It may also mean partially different: for example, the recovery file of failed process #C may be stored in storage node #η and storage node #θ, and the recovery file of failed process #D may be stored in storage node #θ and storage node #λ.
Moreover, the number of storage nodes that store the recovery file of one failed process, as listed above, is merely an example, and the present invention is not limited thereto. For example, the recovery file of one failed process may be stored in only one storage node. In addition, when there are multiple failed processes, the numbers of storage nodes that store the recovery files of the respective failed processes may be the same or different; this is not specifically limited in the present invention.
In this embodiment of the present invention, the recovery file corresponding to failed process #A may be stored in multiple (at least two) storage nodes in either of the following modes.
Mode 1
Optionally, the recovery file corresponding to the first failed process is the same in each of the storage nodes.
Specifically, in this embodiment of the present invention, each of multiple storage nodes may store the complete recovery file corresponding to failed process #A (hereinafter, for ease of understanding and distinction, denoted as recovery file #A). A "complete recovery file" means that the copy of recovery file #A stored in a single storage node is sufficient for performing fault handling on failed process #A.
According to the fault handling method in this embodiment of the present invention, the recovery file corresponding to the first failed process is the same in each of the storage nodes (that is, each storage node stores a consistent, complete recovery file). Therefore, when one or more storage nodes fail, the recovery file can still be obtained from the other storage nodes that have not failed, which further improves the reliability of fault handling.
Mode 2
Optionally, the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
Specifically, in this embodiment of the present invention, recovery file #A may be split into multiple sub-recovery files (for ease of understanding and distinction, denoted as sub-recovery file #A_1 to sub-recovery file #A_X), and sub-recovery files #A_1 to #A_X are stored in multiple storage nodes. One sub-recovery file may be stored in one storage node, or may be stored (in duplicate or in split form) in multiple storage nodes; this is not specifically limited in the present invention. In addition, any two sub-recovery files are stored in different storage nodes, or in other words, the sub-recovery files stored in the respective storage nodes differ. Here, "different" may mean completely different: for example, sub-recovery file #A_1 may be stored in storage node #1 and storage node #2, and sub-recovery file #A_2 may be stored in storage node #3 and storage node #4. Alternatively, "different" may mean partially different: for example, sub-recovery file #A_3 may be stored in storage node #5 and storage node #6, and sub-recovery file #A_4 may be stored in storage node #6 and storage node #7.
According to the fault handling method provided in this embodiment of the present invention, the recovery file corresponding to a failed process is split into multiple sub-recovery files, and the sub-recovery files are stored in different storage nodes. During fault handling, the sub-recovery files can therefore be obtained from multiple storage nodes simultaneously, which reduces the time required to transfer the recovery file and improves the efficiency of fault handling.
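The parallel retrieval described for Mode 2 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: storage nodes are modeled as in-memory dictionaries, each node holds exactly one sub-recovery file, and the function names are hypothetical rather than taken from the application.

```python
from concurrent.futures import ThreadPoolExecutor


def split_recovery_file(data: bytes, parts: int) -> list[bytes]:
    """Split a recovery file into `parts` contiguous sub-recovery files."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(parts)]


def store_sub_files(sub_files, storage_nodes):
    """Place sub-recovery file k on storage node k (one distinct node each)."""
    for index, (chunk, node) in enumerate(zip(sub_files, storage_nodes)):
        node[index] = chunk


def fetch_recovery_file(storage_nodes) -> bytes:
    """Fetch all sub-recovery files concurrently and reassemble them in order."""
    def fetch(node):
        # Each hypothetical node holds exactly one (index, chunk) pair.
        return next(iter(node.items()))

    with ThreadPoolExecutor(max_workers=len(storage_nodes)) as pool:
        pieces = dict(pool.map(fetch, storage_nodes))
    return b"".join(pieces[i] for i in sorted(pieces))


recovery_file_a = b"checkpoint-state-of-process-A" * 4
nodes = [{}, {}, {}]  # three in-memory stand-ins for storage nodes
store_sub_files(split_recovery_file(recovery_file_a, len(nodes)), nodes)
restored = fetch_recovery_file(nodes)
```

In a real deployment the `fetch` step would be a network transfer, which is where fetching from several nodes at once would save time.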
Optionally, in some embodiments, the recovery file of the "failed process" is stored periodically while the process is still valid (that is, running normally). In other words, in this embodiment of the present invention, each process running in the computer system periodically stores its recovery file while the process is valid (that is, running normally).
In other embodiments, the recovery file may be stored, before the failed process fails, on the computing node on which the failed process runs, and uploaded to a storage node before the failure. That is, the backup of the recovery file may also be performed periodically.
The storage modes of the recovery file listed above are merely examples, and the present invention is not limited thereto. For example, the recovery files of all processes may also be stored together in a single storage node.
The fault handling method in this embodiment of the present invention may be performed when the management node determines that a process has failed. As one way of determining that a process has failed, the processes running on the nodes in the computer system may periodically send heartbeat messages to the management node; if the management node does not receive a heartbeat message from a process within a specified time, the process may be considered to have failed. It should be understood that the method for determining a process failure listed above is merely an example, and the present invention is not limited thereto. Any method in the prior art that can determine a process failure falls within the protection scope of the present invention.
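The heartbeat check described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `HeartbeatMonitor`, the timeout value, and the use of explicit timestamps (instead of a real clock and real messages) are all hypothetical.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per process and reports overdue processes."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def record_heartbeat(self, process_id: str, now: float) -> None:
        """Called whenever a heartbeat message from a process arrives."""
        self.last_seen[process_id] = now

    def failed_processes(self, now: float) -> list[str]:
        """Processes whose last heartbeat is older than the timeout."""
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record_heartbeat("A", now=0.0)
monitor.record_heartbeat("B", now=0.0)
monitor.record_heartbeat("B", now=4.0)  # B keeps reporting, A goes silent
failed = monitor.failed_processes(now=5.0)
```

Here only process "A" exceeds the assumed 3-second timeout at time 5.0, so it alone would be treated as a failed process.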
Thus, after a process failure is determined as described above, at S110 the management node may determine the size of the recovery file corresponding to each failed process. For example, for failed process #A (that is, an example of a failed process), the management node may obtain, from the storage device(s) corresponding to failed process #A, information indicating the size of recovery file #A (or of sub-recovery file #A_1 to sub-recovery file #A_X), and determine the size of the recovery file corresponding to failed process #A according to this information.
As another example, each process may determine the size of its recovery file when generating it, and send information indicating the recovery file size to the management node. The management node may store this information according to the identifier of the process; that is, it may store and index the received size information based on a mapping between the identifier of a process and the information indicating the size of the recovery file from that process. When the management node finds that a process has failed, it can then look up the information indicating the size of the corresponding recovery file based on the identifier of the process.
In addition, at S110, the management node may determine the running state of each recovery node.
In some embodiments, the running state may include the resource usage state of a recovery node. Specifically, a recovery node may report load information, such as its central processing unit (CPU) usage or memory usage, to the management node, either periodically or as instructed by the management node. The management node can then determine the resource usage state of the recovery node according to the load information received from it.
In other embodiments, the running state may include the communication state of a recovery node. Specifically, a recovery node may report to the management node communication state information indicating the state of communication between the recovery node and other nodes in the computer system (for example, normal communication, unable to communicate, or communication delay). The management node can then determine the communication state of the recovery node according to the communication state information received from it.
It should be noted that, in this embodiment of the present invention, the storage nodes may form a storage network (or storage grid) that provides data (for example, recovery files) to the recovery nodes in the computer system through a unified external interface. Each recovery node may communicate with the storage network by means such as a message queue, so that a recovery node does not need to know the specific address, for example, the Internet Protocol (IP) address or the Media Access Control (MAC) address, of the storage node that stores the recovery file it needs.
For example, a recovery node only needs to send the identifier of failed process #A, which is the object of fault handling, to the storage network. The interface device of the storage network can then use a pre-stored mapping between the identifier of each process and the storage nodes that store its recovery file to find the storage node corresponding to the received identifier of failed process #A, and thereby obtain recovery file #A corresponding to failed process #A.
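The lookup performed by the interface device can be sketched as a small in-memory model. All names here (`StorageNetworkInterface`, `store`, `fetch`, `mark_node_down`) are hypothetical, storage nodes are plain dictionaries, and each listed node is assumed to hold a complete copy of the file as in Mode 1.

```python
class StorageNetworkInterface:
    """Unified entry point: callers pass a process identifier, never an address."""

    def __init__(self):
        # node name -> {process identifier -> recovery file}
        self._nodes: dict[str, dict[str, bytes]] = {}
        # process identifier -> names of the storage nodes holding its file
        self._location: dict[str, list[str]] = {}

    def store(self, process_id: str, recovery_file: bytes, node_names: list[str]) -> None:
        """Store a complete copy on every listed node and record the mapping."""
        for name in node_names:
            self._nodes.setdefault(name, {})[process_id] = recovery_file
        self._location[process_id] = list(node_names)

    def mark_node_down(self, name: str) -> None:
        """Simulate the failure of one storage node."""
        self._nodes.pop(name, None)

    def fetch(self, process_id: str) -> bytes:
        """Return the recovery file from the first storage node still available."""
        for name in self._location.get(process_id, []):
            node = self._nodes.get(name)
            if node is not None and process_id in node:
                return node[process_id]
        raise KeyError(f"no available recovery file for process {process_id!r}")


gateway = StorageNetworkInterface()
gateway.store("A", b"recovery-file-A", ["alpha", "beta"])
gateway.mark_node_down("alpha")
file_a = gateway.fetch("A")  # the recovery node only supplies the identifier "A"
```

Because the mapping lives in the interface device, the caller is unaffected when storage node "alpha" becomes unavailable.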
Therefore, the communication state may be the state of communication between a recovery node and the interface device of the storage network.
In addition, in this embodiment of the present invention, multiple processes (for example, the failed process #A above and a process #B) may need to communicate with each other to implement a service function. In this case, the recovery node used to recover failed process #A needs to communicate with the node on which process #B runs (that is, an associated node of the recovery node).
Therefore, the communication state may also be the state of communication between a recovery node and its associated node.
After the sizes of the recovery files and the running states of the recovery nodes are determined as described above, at S120 the recovery node corresponding to each failed process is determined according to the size of the recovery file corresponding to each failed process and the running state of each recovery node.
For example, when N = 1, only one process needs to be recovered (for example, failed process #A above). In this case, a node whose current running state satisfies the running conditions required by failed process #A may be selected as the recovery node corresponding to failed process #A.
Examples of such running conditions include the following:
A. The computing resources or storage resources can satisfy the processing requirements of failed process #A; that is, the idle resources of the node are sufficient to run failed process #A. For example, the memory, CPU, and storage resources of the recovery node satisfy the running requirements of process #A.
B. The communication state can satisfy the processing requirements of failed process #A; that is, the recovery node can communicate with the associated node of failed process #A, or the recovery node can communicate with the storage node of failed process #A to obtain the recovery file of failed process #A.
It should be understood that the method and process listed above for determining the recovery node corresponding to a failed process according to the running states of the recovery nodes and the size of the recovery file are merely examples, and the present invention is not limited thereto.
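Running conditions A and B can be illustrated with a short eligibility check. The data classes, field names, and units below are hypothetical. As a further assumption, the sketch requires both reachability checks in condition B (storage node and associated nodes), which is stricter than the "or" wording of the text.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryNode:
    name: str
    free_cpus: int
    free_memory_mb: int
    reachable: set = field(default_factory=set)  # names of nodes it can reach


@dataclass
class FailedProcess:
    name: str
    cpus_needed: int
    memory_needed_mb: int
    storage_nodes: set                                  # nodes holding its recovery file
    associated_nodes: set = field(default_factory=set)  # peers it must talk to


def satisfies_running_conditions(node: RecoveryNode, proc: FailedProcess) -> bool:
    # Condition A: idle resources are sufficient to run the process.
    has_resources = (node.free_cpus >= proc.cpus_needed
                     and node.free_memory_mb >= proc.memory_needed_mb)
    # Condition B: at least one storage node and every associated node is reachable.
    can_fetch_file = bool(node.reachable & proc.storage_nodes)
    can_reach_peers = proc.associated_nodes <= node.reachable
    return has_resources and can_fetch_file and can_reach_peers


process_a = FailedProcess("A", cpus_needed=2, memory_needed_mb=512,
                          storage_nodes={"alpha", "beta"},
                          associated_nodes={"node-B"})
candidates = [
    RecoveryNode("r1", free_cpus=4, free_memory_mb=256, reachable={"alpha", "node-B"}),
    RecoveryNode("r2", free_cpus=4, free_memory_mb=1024, reachable={"alpha"}),
    RecoveryNode("r3", free_cpus=2, free_memory_mb=1024, reachable={"beta", "node-B"}),
]
eligible = [n.name for n in candidates if satisfies_running_conditions(n, process_a)]
```

In this example "r1" lacks memory and "r2" cannot reach the associated node, so only "r3" qualifies.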
As another example, when N ≥ 2, the management node may determine the recovery order for the failed processes according to the sizes of their recovery files, that is:
Optionally, when N ≥ 2, determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node includes:
determining, according to the running state of each recovery node, the recovery node corresponding to each failed process one by one in descending order of the size of the recovery file corresponding to each failed process.
Specifically, when there are multiple failed processes, suitable recovery nodes need to be found for them while keeping the sum of the recovery times of the failed processes as short as possible.
In this embodiment of the present invention, the recovery nodes corresponding to the failed processes may be determined one by one in descending order of recovery file size, so that the recovery node for the failed process with the largest recovery file is determined first.
The following describes the specific process of determining the recovery node corresponding to each failed process when N ≥ 2.
Step 1: The management node may order the set of failed processes (failed process #1 to failed process #K) by recovery file size, from largest to smallest, such that P_1 ≥ P_2 ≥ … ≥ P_K, where P denotes the recovery file size and K is the number of failed processes to be recovered, and set the counter variable i = 1.
Step 2: The management node may obtain the nodes that can provide recovery processing (in other words, the nodes running virtual machines on which the failed processes to be recovered can be placed). Let the set of these nodes be N, and let the set of nodes that have already been allocated be Nu = ∅.
Step 3: If i > K, go to step 6; otherwise, go to step 4.
Step 4: For failed process #i, suppose the nodes in the set Nu are ordered as N_1, N_2, …, N_j (for example, in the order in which they were added to the set, or by the amount of remaining available resources; this is not specifically limited in the present invention). The management node first tries the nodes in Nu in that order to check whether one of them can satisfy the running conditions of failed process #i. If none can, a node is selected from the set N, and a node that can satisfy the conditions (for example, the node with the smallest index in N that satisfies the running conditions of failed process #i) is added to Nu as N_{j+1}. The management node then performs the update i = i + 1 and returns to step 3.
It should be understood that the method listed above for determining the recovery node corresponding to each failed process is merely an example, and the present invention is not limited thereto. Heuristic algorithms such as first fit, best fit, first fit decreasing, and best fit decreasing may also be used to determine the recovery node corresponding to each failed process.
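Steps 1 to 4 above can be sketched as a greedy, largest-first placement. The sketch makes simplifying assumptions that are not in the text: the "running condition" is reduced to a single free-capacity number per node that must cover the recovery file size, the nodes in Nu are tried in the order in which they were first used, and the function name is hypothetical.

```python
def assign_recovery_nodes(file_sizes: dict, capacities: dict) -> dict:
    """Return {process: node or None} using largest-first greedy placement.

    file_sizes: {process name: recovery file size}
    capacities: {node name: free capacity}, in the same unit as the file sizes
    """
    remaining = dict(capacities)
    used = []                  # the set Nu, in the order nodes were first used
    unused = list(capacities)  # nodes of N that are not yet in Nu
    assignment = {}

    # Step 1: handle failed processes in descending order of recovery file size.
    for proc in sorted(file_sizes, key=file_sizes.get, reverse=True):
        size = file_sizes[proc]
        # Step 4, first part: try the nodes already in Nu, in order.
        chosen = next((n for n in used if remaining[n] >= size), None)
        if chosen is None:
            # Step 4, second part: take the first fitting node from N into Nu.
            chosen = next((n for n in unused if remaining[n] >= size), None)
            if chosen is not None:
                unused.remove(chosen)
                used.append(chosen)
        if chosen is not None:
            remaining[chosen] -= size
        assignment[proc] = chosen  # None means no node could host the process
    return assignment


plan = assign_recovery_nodes(
    {"p1": 6, "p2": 5, "p3": 3, "p4": 2},
    {"n1": 8, "n2": 8, "n3": 8},
)
```

With these inputs the four processes fit on two nodes and "n3" is never opened, which reflects the preference for nodes already in Nu.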
After the recovery node corresponding to each failed process is determined as described above, at S130 the management node may perform recovery processing on each failed process according to the recovery node corresponding to that failed process.
For example, the management node may directly instruct the recovery node to obtain the recovery file of the failed process from the storage node and perform recovery based on the recovery file. The method and process by which the recovery node recovers a process based on the recovery file may be similar to those in the prior art; to avoid repetition, a detailed description is omitted here.
As another example, the management node may estimate the recovery time according to the recovery node corresponding to each failed process, and determine a recovery strategy according to the recovery time, that is:
Optionally, performing fault recovery processing on each failed process according to the recovery node corresponding to each failed process includes:
estimating the recovery time of each failed process according to the running state of the recovery node corresponding to each failed process and the size of the recovery file corresponding to each failed process; and
performing fault recovery processing on each failed process according to the recovery time of each failed process.
Specifically, the management node may estimate the recovery time of a failed process in the following manner.
The management node may use multiple (for example, preset) schemes to match failed processes with recovery nodes. In other words, the management node may use multiple (for example, preset) schemes to simulate placing each failed process on a recovery node, and calculate the recovery time under each scheme.
Each scheme needs to satisfy the following conditions:
Condition 1
The recovery node matched to failed process #K needs to satisfy the CPU processing capability required for the fault recovery of failed process #K. That is, condition 1 can be expressed as the following formula (1):
[Formula (1): equation image PCTCN2016097957-appb-000001]
where [symbol: image PCTCN2016097957-appb-000002] denotes the CPU processing capability required for the fault recovery of failed process #K; a_Ki ∈ {0, 1}, that is, a_Ki = 1 if failed process #K is placed on recovery node #i, and a_Ki = 0 otherwise; and [symbol: image PCTCN2016097957-appb-000003] denotes the CPU processing capability of recovery node #i (for example, the number of virtual CPUs it can provide).
Condition 2
The recovery node matched to failed process #K needs to satisfy the memory required for the fault recovery of failed process #K. That is, condition 2 can be expressed as the following formula (2):
[Formula (2): equation image PCTCN2016097957-appb-000004]
where [symbol: image PCTCN2016097957-appb-000005] denotes the memory required for the fault recovery of failed process #K; a_Ki ∈ {0, 1}, that is, a_Ki = 1 if failed process #K is placed on recovery node #i, and a_Ki = 0 otherwise; and [symbol: image PCTCN2016097957-appb-000006] denotes the memory that recovery node #i can provide.
In addition, the above [symbol: image PCTCN2016097957-appb-000007] can be determined according to the following formula (3):
[Formula (3): equation image PCTCN2016097957-appb-000008]
where m_K denotes the size of the memory requested by failed process #K (in other words, the size of the recovery file of failed process #K), μ and ε are preset coefficients determined experimentally, and v denotes the number of failed processes.
Condition 3
The fault recovery of one failed process is performed on only one recovery node. For example, the CPU processing capability (for example, the number of virtual CPUs that can be provided) of the recovery node that performs fault recovery for failed process #1 needs to satisfy the CPU processing capability required for the fault recovery of failed process #1, and the memory that this recovery node can provide needs to satisfy the memory required for the fault recovery of failed process #1.
The quantitative relationship between the failed-process recovery time T(n_i) on recovery node #i and the size m_K of the recovery file of failed process #K can be expressed by the following formula (4):
[Formula (4): equation image PCTCN2016097957-appb-000009]
where a and β are coefficients determined experimentally.
The recovery time of the j-th placement scheme is taken as the maximum, R_j, of the recovery times of the failed-process sets on all nodes, and can be expressed by the following formula (5):
$R_j = \max_{i} T_j(n_i), \quad i = 1, 2, \ldots, s$            Formula (5)
Thus, based on conditions 1 to 3 above, the matching scheme that attains the objective min(R_j) of the quantitative recovery-time model, that is, the scheme with the smallest R_j, can be determined, together with the recovery time of each failed process under that scheme.
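The selection of the scheme with the smallest R_j can be illustrated by enumerating all placements for a small example. Formula (4) is not recoverable from this text, so the sketch assumes a linear model, T = a × (total recovery file size placed on the node) + β, with made-up coefficients. The CPU and memory requirements are taken as given inputs rather than derived from formula (3), and all names are hypothetical.

```python
from itertools import product


def recovery_time(total_file_size: float, a: float = 0.5, beta: float = 1.0) -> float:
    """Assumed linear stand-in for formula (4); a and beta are made-up values."""
    return a * total_file_size + beta if total_file_size > 0 else 0.0


def best_placement(procs: dict, nodes: dict):
    """Return (R_j, placement) with the smallest R_j, or None if nothing is feasible.

    procs: {process: (cpus needed, memory needed, recovery file size)}
    nodes: {node: (cpus available, memory available)}
    """
    proc_names, node_names = list(procs), list(nodes)
    best = None
    # Condition 3: each process is placed on exactly one node.
    for choice in product(node_names, repeat=len(proc_names)):
        cpu = dict.fromkeys(node_names, 0)
        mem = dict.fromkeys(node_names, 0)
        load = dict.fromkeys(node_names, 0)
        for proc, node in zip(proc_names, choice):
            cpus_needed, mem_needed, file_size = procs[proc]
            cpu[node] += cpus_needed
            mem[node] += mem_needed
            load[node] += file_size
        # Conditions 1 and 2: CPU and memory demand must fit on every node.
        if any(cpu[n] > nodes[n][0] or mem[n] > nodes[n][1] for n in node_names):
            continue
        # Formula (5): the scheme's recovery time is the slowest node's time.
        r_j = max(recovery_time(load[n]) for n in node_names)
        if best is None or r_j < best[0]:
            best = (r_j, dict(zip(proc_names, choice)))
    return best


result = best_placement(
    {"P1": (2, 4, 8), "P2": (1, 2, 4), "P3": (1, 2, 4)},
    {"n1": (2, 8), "n2": (2, 8)},
)
```

Exhaustive enumeration grows exponentially with the number of failed processes, so it only serves to make the objective concrete; a practical system would more likely rely on a heuristic.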
After the recovery time of each failed process is determined as described above, the management node may determine whether the estimated recovery time of each failed process satisfies that process's requirement on recovery time, and perform recovery processing according to the result. For example, if the estimated recovery time is less than or equal to the maximum recovery time allowed for the process, the management node may instruct the recovery node to recover the failed process. As another example, if the estimated recovery time is greater than the maximum recovery time allowed for the process, the management node may perform processing such as troubleshooting on the original node on which the failed process ran.
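The decision rule above amounts to a threshold comparison per failed process. The following sketch uses hypothetical names and made-up times in seconds purely to illustrate it.

```python
def choose_action(estimated_s: float, max_allowed_s: float) -> str:
    """Pick a recovery strategy from the estimated and the allowed recovery time."""
    if estimated_s <= max_allowed_s:
        return "recover_on_recovery_node"
    return "troubleshoot_on_original_node"


# Hypothetical estimates and limits: {process: (estimated, maximum allowed)}.
estimates = {"A": (5.0, 10.0), "B": (12.0, 10.0), "C": (10.0, 10.0)}
actions = {proc: choose_action(est, limit) for proc, (est, limit) in estimates.items()}
```

Process "C" sits exactly on the limit and is still recovered on a recovery node, matching the "less than or equal to" wording.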
Optionally, the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
Specifically, in this embodiment of the present invention, the storage node corresponding to failed process #A is different from the recovery node corresponding to failed process #A. That is, in the computer system of this embodiment, the storage nodes used to store recovery files may be independent of the recovery nodes used to perform recovery processing, which facilitates maintenance and reduces the load on each node.
应理解,以上列举的存储节点与恢复节点的关系仅为示例性说明,一个失效进程所对应的存储节点与恢复节点也可以为同一节点,本发明并未特别限定。It should be understood that the relationship between the storage node and the recovery node enumerated above is only an exemplary description, and the storage node and the recovery node corresponding to one failure process may also be the same node, and the present invention is not particularly limited.
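The optional separation of storage nodes and recovery nodes can be sketched as a simple filter over the candidate recovery nodes. The function name `candidate_recovery_nodes` and the `require_distinct` flag are assumptions made for this example; node identifiers are assumed to be plain strings.

```python
def candidate_recovery_nodes(recovery_nodes, storage_nodes, require_distinct=True):
    """Return the recovery nodes eligible for one failed process.

    recovery_nodes: identifiers of all recovery nodes.
    storage_nodes: identifiers of the nodes storing this process's recovery file.
    require_distinct: if True, nodes that also store the recovery file are excluded.
    """
    if not require_distinct:
        # The storage node and the recovery node may be the same node.
        return list(recovery_nodes)
    storage = set(storage_nodes)
    return [node for node in recovery_nodes if node not in storage]
```

With `require_distinct=False` the sketch reproduces the alternative noted above, in which a storage node may also serve as the recovery node.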
According to the fault recovery method of this embodiment of the present invention, the recovery node that performs fault recovery on a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the running states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it also helps ensure, to a certain extent, that the selected recovery node can actually carry out fault recovery for the failed process, thereby further improving the reliability of fault recovery.
The fault recovery method of the embodiments of the present invention has been described in detail above with reference to FIG. 1 and FIG. 2. The fault recovery apparatus of the present invention is described in detail below with reference to FIG. 3.
FIG. 3 shows a schematic block diagram of a fault recovery apparatus 300 according to an embodiment of the present invention. As shown in FIG. 3, the apparatus 300 includes:
a determining unit 310, configured to determine the size of the recovery file corresponding to each of N failed processes and the running state of each of M recovery nodes, and to determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running states of the M recovery nodes, where the running state includes a resource usage state or a communication state, N ≥ 1, and M ≥ 2; and
a processing unit 320, configured to perform control according to the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process on its corresponding recovery node.
Optionally, the recovery file corresponding to a first failed process among the N failed processes is stored in at least two storage nodes.
Optionally, the recovery files corresponding to the first failed process that are stored in the respective storage nodes are identical.
Optionally, the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
Optionally, when N ≥ 2, the determining unit is specifically configured to determine, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
Optionally, the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
Optionally, the processing unit is specifically configured to estimate the recovery time of each failed process according to the running state of the recovery node corresponding to that failed process and the size of the recovery file corresponding to that failed process, and to perform fault recovery processing for each failed process according to its recovery time.
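The ordering used by the determining unit when N ≥ 2 can be sketched as follows. The text above fixes only the processing order (largest recovery file first) and the inputs (file sizes and node running states); the per-process selection rule in this sketch, which picks the node with the most remaining free capacity and deducts the file size from it, is an assumption made for illustration, as are the names `assign_recovery_nodes`, `file_sizes` and `free_capacity`.

```python
def assign_recovery_nodes(file_sizes, free_capacity):
    """Assign a recovery node to each failed process, largest recovery file first.

    file_sizes: {process_id: recovery file size}
    free_capacity: {node_id: free resource units}, a stand-in for the running state
    Returns {process_id: node_id, or None if no node has enough free capacity}.
    """
    if len(free_capacity) < 2:
        raise ValueError("at least two recovery nodes are required (M >= 2)")
    remaining = dict(free_capacity)  # work on a copy; the caller's state is untouched
    assignment = {}
    # Handle failed processes in descending order of recovery file size.
    ordered = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for process_id, size in ordered:
        # Assumed rule: choose the node that currently has the most free capacity.
        node_id = max(remaining, key=remaining.get)
        if remaining[node_id] >= size:
            assignment[process_id] = node_id
            remaining[node_id] -= size
        else:
            assignment[process_id] = None
    return assignment
```

Processing the largest recovery files first means the processes that are hardest to place are matched while the recovery nodes still have the most free capacity.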
The fault recovery apparatus 300 according to this embodiment of the present invention may correspond to the entity that executes the method of the embodiments of the present invention, for example, the management node. The units (that is, modules) of the apparatus 300 and the other operations and/or functions described above are respectively intended to implement the corresponding procedures of the method 100 in FIG. 1. For brevity, details are not repeated here.
According to the fault recovery apparatus of this embodiment of the present invention, the recovery node that performs fault recovery on a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the running states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it also helps ensure, to a certain extent, that the selected recovery node can actually carry out fault recovery for the failed process, thereby further improving the reliability of fault recovery.
The fault recovery method of the embodiments of the present invention has been described in detail above with reference to FIG. 1 and FIG. 2. The fault recovery device of the present invention is described in detail below with reference to FIG. 4.
FIG. 4 shows a schematic block diagram of a fault recovery device 400 according to an embodiment of the present invention. As shown in FIG. 4, the device 400 includes:
a bus system 410;
a processor 420 connected to the bus system 410; and
a memory 430 connected to the bus system 410;
where the processor invokes, through the bus, a program stored in the memory, so as to: determine the size of the recovery file corresponding to each of N failed processes, and determine the running state of each of M recovery nodes, where N ≥ 1 and M ≥ 2;
determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node, where the running state includes a resource usage state or a communication state; and
perform control according to the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process on its corresponding recovery node.
Optionally, the recovery file corresponding to a first failed process among the N failed processes is stored in at least two storage nodes corresponding to the first failed process.
Optionally, the recovery files corresponding to the first failed process that are stored in the respective storage nodes are identical.
Optionally, the recovery file corresponding to the first failed process includes at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
Optionally, when N ≥ 2, the processor is specifically configured to determine, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
Optionally, the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
Optionally, the processor is specifically configured to: estimate the recovery time of each failed process according to the running state of the recovery node corresponding to that failed process and the size of the recovery file corresponding to that failed process; and
perform fault recovery processing for each failed process according to the recovery time of that failed process.
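One possible way to estimate a recovery time from the recovery file size and a recovery node's running state is sketched below. The formula is an assumption made for illustration and is not the quantitative model of the embodiments: it adds a transfer term (file size divided by link bandwidth, standing in for the communication state) and a restore term (file size divided by the restore rate left over at the node's current CPU utilization, standing in for the resource usage state). All parameter names are hypothetical.

```python
def estimate_recovery_time(file_size_mb, bandwidth_mb_s, restore_rate_mb_s, cpu_utilization):
    """Estimate the recovery time in seconds for one failed process on one recovery node.

    file_size_mb: size of the recovery file.
    bandwidth_mb_s: bandwidth available for fetching the file (communication state).
    restore_rate_mb_s: restore throughput of an idle node.
    cpu_utilization: current load of the node in [0, 1) (resource usage state).
    """
    if file_size_mb < 0:
        raise ValueError("file size must be non-negative")
    if bandwidth_mb_s <= 0 or restore_rate_mb_s <= 0:
        raise ValueError("bandwidth and restore rate must be positive")
    if not 0.0 <= cpu_utilization < 1.0:
        raise ValueError("cpu_utilization must be in [0, 1)")
    # Time to move the recovery file from the storage node to the recovery node.
    transfer_s = file_size_mb / bandwidth_mb_s
    # Time to load the file, slowed down in proportion to the node's current load.
    restore_s = file_size_mb / (restore_rate_mb_s * (1.0 - cpu_utilization))
    return transfer_s + restore_s
```

An estimate of this kind can then be compared with the maximum recovery time that a failed process tolerates in order to decide how the process is handled.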
The processor may also be referred to as a CPU. The memory may include read-only memory and random access memory, and provides instructions and data to the processor. A portion of the memory may further include non-volatile random access memory (NVRAM). In a specific application, the device 400 may be embedded in, or may itself be, a computer device, in which the bus includes, in addition to a data bus, a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all labeled as the bus system 410 in the figure. In specific products, the decoder may be integrated with the processing unit.
The processor may implement or execute the steps and logical block diagrams disclosed in the method embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, decoder, or the like. The steps of the method disclosed with reference to the embodiments of the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
It should be understood that, in this embodiment of the present invention, the processor 420 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 430 may include read-only memory and random access memory, and provides instructions and data to the processor 420. A portion of the memory 430 may further include non-volatile random access memory. For example, the memory 430 may also store device type information.
In addition to a data bus, the bus system 410 may include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all labeled as the bus system 410 in the figure. It should be noted that, in this embodiment of the present invention, "connected to the bus system 410" may mean either directly connected or indirectly connected.
During implementation, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 420 or by instructions in the form of software. The steps of the method disclosed with reference to the embodiments of the present invention may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory 430; the processor 420 reads the information in the memory 430 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, details are not described here.
The fault recovery device 400 according to this embodiment of the present invention may correspond to the entity that executes the method of the embodiments of the present invention (for example, the management node). The units (that is, modules) of the device 400 and the other operations and/or functions described above are respectively intended to implement the corresponding procedures of the method 100 in FIG. 1. For brevity, details are not repeated here.
According to the fault recovery device of this embodiment of the present invention, the recovery node that performs fault recovery on a failed process is determined from at least two recovery nodes according to the size of the recovery file corresponding to the failed process and the running states of the at least two recovery nodes. This is more reliable than having only one recovery node, and it also helps ensure, to a certain extent, that the selected recovery node can actually carry out fault recovery for the failed process, thereby further improving the reliability of fault recovery.
It should be noted that "A and/or B" as used in the embodiments of the present invention covers three cases: A alone, B alone, and both A and B.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the foregoing processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present invention.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely a description of specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (14)

  1. A fault recovery method, characterized in that the method comprises:
    determining the size of the recovery file corresponding to each of N failed processes, and determining the running state of each of M recovery nodes, wherein N ≥ 1 and M ≥ 2;
    determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node, wherein the running state comprises a resource usage state; and
    controlling the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process on its corresponding recovery node.
  2. The method according to claim 1, characterized in that the recovery file corresponding to a first failed process among the N failed processes is stored in at least two storage nodes.
  3. The method according to claim 2, characterized in that the recovery files corresponding to the first failed process that are stored in the respective storage nodes are identical.
  4. The method according to claim 2, characterized in that the recovery file corresponding to the first failed process comprises at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
  5. The method according to any one of claims 1 to 4, characterized in that, when N ≥ 2, the determining the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running state of each recovery node comprises:
    determining, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
  6. The method according to any one of claims 1 to 5, characterized in that the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  7. The method according to any one of claims 1 to 6, characterized in that the performing control according to the recovery node corresponding to each failed process comprises:
    estimating the recovery time of each failed process according to the running state of the recovery node corresponding to that failed process and the size of the recovery file corresponding to that failed process; and
    performing control according to the recovery time of each failed process.
  8. A fault recovery apparatus, characterized in that the apparatus comprises:
    a determining unit, configured to determine the size of the recovery file corresponding to each of N failed processes and the running state of each of M recovery nodes, and to determine the recovery node corresponding to each failed process according to the size of the recovery file corresponding to each failed process and the running states of the M recovery nodes, wherein the running state comprises a resource usage state, N ≥ 1, and M ≥ 2; and
    a processing unit, configured to control the recovery node corresponding to each failed process, so that fault recovery is performed for each failed process on its corresponding recovery node.
  9. The apparatus according to claim 8, characterized in that the recovery file corresponding to a first failed process among the N failed processes is stored in at least two storage nodes.
  10. The apparatus according to claim 9, characterized in that the recovery files corresponding to the first failed process that are stored in the respective storage nodes are identical.
  11. The apparatus according to claim 9, characterized in that the recovery file corresponding to the first failed process comprises at least two sub-recovery files, and the sub-recovery files stored in the respective storage nodes are different.
  12. The apparatus according to any one of claims 8 to 11, characterized in that, when N ≥ 2, the determining unit is specifically configured to determine, according to the running state of each recovery node, the recovery node corresponding to each failed process in turn, in descending order of the size of the recovery file corresponding to each failed process.
  13. The apparatus according to any one of claims 8 to 12, characterized in that the recovery node corresponding to a failed process is different from the storage node corresponding to the same failed process.
  14. The apparatus according to any one of claims 8 to 13, characterized in that the processing unit is specifically configured to estimate the recovery time of each failed process according to the running state of the recovery node corresponding to that failed process and the size of the recovery file corresponding to that failed process, and to perform control according to the recovery time of each failed process.
PCT/CN2016/097957 2015-09-10 2016-09-02 Method and apparatus for recovering fault WO2017041671A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510573922.6 2015-09-10
CN201510573922.6A CN106528324A (en) 2015-09-10 2015-09-10 Fault recovery method and apparatus

Publications (1)

Publication Number Publication Date
WO2017041671A1 true WO2017041671A1 (en) 2017-03-16

Family

ID=58239099

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/097957 WO2017041671A1 (en) 2015-09-10 2016-09-02 Method and apparatus for recovering fault

Country Status (2)

Country Link
CN (1) CN106528324A (en)
WO (1) WO2017041671A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111045845B (en) * 2019-11-29 2021-09-17 苏州浪潮智能科技有限公司 Data returning method, device, equipment and computer readable storage medium
CN111338848B (en) * 2020-02-24 2021-11-19 深圳华锐金融技术股份有限公司 Failure application copy processing method and device, computer equipment and storage medium
CN114697328A (en) * 2022-03-25 2022-07-01 浪潮云信息技术股份公司 Method and system for realizing NiFi high-availability cluster mode
CN114780296A (en) * 2022-05-09 2022-07-22 马上消费金融股份有限公司 Data backup method, device and system for database cluster

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070180288A1 (en) * 2005-12-22 2007-08-02 International Business Machines Corporation Method, system and program for securing redundancy in parallel computing sytem
CN102411520A (en) * 2011-09-21 2012-04-11 电子科技大学 Data-unit-based disaster recovery method for seismic data
CN103019889A (en) * 2012-12-21 2013-04-03 曙光信息产业(北京)有限公司 Distributed file system and failure processing method thereof
CN103067229A (en) * 2013-01-22 2013-04-24 浪潮(北京)电子信息产业有限公司 Method, control center, computational node and system of automatic management of computing resource
CN103853634A (en) * 2014-02-26 2014-06-11 北京优炫软件股份有限公司 Disaster recovery system and disaster recovery method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103795759B (en) * 2012-10-31 2018-02-09 北京搜狐新媒体信息技术有限公司 The dispatching method and system of a kind of virtual machine image file
CN103440111B (en) * 2013-08-05 2016-08-10 北京京东尚科信息技术有限公司 The extended method in magnetic disk of virtual machine space, host and platform

Also Published As

Publication number Publication date
CN106528324A (en) 2017-03-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 16843609; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 16843609; Country of ref document: EP; Kind code of ref document: A1