CN109634769B - Fault-tolerant processing method, device, equipment and storage medium in data storage - Google Patents

Publication number
CN109634769B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811528482.2A
Other languages
Chinese (zh)
Other versions
CN109634769A (en)
Inventor
鲍明通
张在理
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-12-13
Filing date
2018-12-13
Publication date
2021-11-09
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811528482.2A
Publication of CN109634769A
Application granted
Publication of CN109634769B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G06F11/0751 Error or fault detection not based on redundancy

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The application discloses a fault-tolerant processing method in data storage, comprising: calling a fault-tolerant processing process when the storage main process runs, wherein the fault-tolerant processing process and the storage main process run relatively independently and in parallel; monitoring IO fault data and command fault data generated during the running of the storage main process; when IO fault data or command fault data is detected, searching a preset fault handling scheme library for a corresponding fault handling scheme; and executing the fault handling scheme. The method reduces the coupling between the fault-tolerant processing mechanism and the storage main process, improves the operating efficiency of the data storage service, makes the storage system easier to upgrade and maintain, makes fault-tolerant processing more centralized and efficient, and improves the user experience. The application also provides a fault-tolerant processing apparatus, a device, and a computer-readable storage medium in data storage, which have the same beneficial effects.

Description

Fault-tolerant processing method, device, equipment and storage medium in data storage
Technical Field
The present application relates to the field of fault tolerance technologies, and in particular, to a fault tolerance processing method, apparatus, device, and computer readable storage medium in data storage.
Background
Storage systems in the prior art are generally designed with a fault tolerance mechanism for data storage. The system errors handled by the fault tolerance mechanism fall into two types: command data errors and IO data errors. While the main storage service is running, the host upper-layer control module issues command data, and after some intermediate processing (such as protocol conversion or parsing), the storage bottom-layer control module performs the low-level processing that writes the data to disk. However, the fault-tolerant processing program in the prior art is coupled into the intermediate processing of the main storage service, so data exchanged between the host upper-layer control module and the bottom-layer control module is forwarded only after it has been checked by the fault-tolerant processing module. This seriously reduces the operating efficiency of the main storage service, and the fault-tolerant processing program also bloats the executables and auxiliary libraries of the storage system software, which hinders system upgrade and maintenance. In view of this, providing a solution to the above technical problems is an important need for those skilled in the art.
Disclosure of Invention
The present application aims to provide a fault-tolerant processing method, apparatus, device, and computer-readable storage medium in data storage that effectively reduce the coupling between the fault-tolerant processing mechanism and the storage main process, thereby improving the operating efficiency of the storage service and simplifying the storage system.
In order to solve the above technical problem, in a first aspect, the present application discloses a fault-tolerant processing method in data storage, including:
calling a fault-tolerant processing process when a storage main process runs, wherein the fault-tolerant processing process and the storage main process run relatively independently and in parallel;
monitoring IO fault data and command fault data generated during the running of the storage main process;
when the IO fault data or the command fault data is detected, searching a preset fault handling scheme library for a corresponding fault handling scheme;
executing the fault handling scheme.
Optionally, the fault-tolerant processing process is deployed locally or in the cloud.
Optionally, the monitoring of IO fault data and command fault data generated during the running of the storage main process includes:
receiving the IO fault data through a preset IO interrupt interface, and periodically searching a preset shared storage area for the command fault data; the preset shared storage area stores the command data sent by the host upper-layer control module to the disk bottom-layer control module during the running of the storage main process.
Optionally, the command data is stored in the preset shared storage area in the form of a queue.
Optionally, the fault handling scheme includes any one or any combination of the following:
command data retransmission, command data state modification, and generation of correction data for the disk bottom-layer control module.
Optionally, after the executing the fault handling scheme, the method further includes:
generating and storing fault-tolerant processing log data.
In a second aspect, the present application further discloses a fault-tolerant processing apparatus in data storage, including:
the calling unit is used for calling a fault-tolerant processing process when the storage main process runs, and the fault-tolerant processing process and the storage main process run independently and in parallel;
the monitoring unit is used for monitoring IO fault data and command fault data generated during the running of the storage main process;
the searching unit is used for searching a preset fault handling scheme library for a corresponding fault handling scheme when the monitoring unit detects the IO fault data or the command fault data;
a processing unit for executing the fault handling scheme.
Optionally, a log generating unit is further included, configured to generate and store fault-tolerant processing log data after the processing unit executes the fault processing scheme.
In a third aspect, the present application further discloses a fault-tolerant processing device in data storage, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of any of the fault-tolerant processing methods in data storage described above.
In a fourth aspect, the present application also discloses a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the fault-tolerant processing method in data storage described above.
The fault-tolerant processing method in data storage provided by the present application includes: calling a fault-tolerant processing process when a storage main process runs, wherein the fault-tolerant processing process and the storage main process run relatively independently and in parallel; monitoring IO fault data and command fault data generated during the running of the storage main process; when the IO fault data or the command fault data is detected, searching a preset fault handling scheme library for a corresponding fault handling scheme; and executing the fault handling scheme. By extracting the fault-tolerant processing from the prior-art storage main process into a separate process that runs in parallel, the present application effectively reduces the coupling between the fault-tolerant processing mechanism and the storage main process, simplifies the data processing flow of the storage main process, and reduces the size of the storage system's program files. This improves the operating efficiency of the data storage service, makes the storage system easier to upgrade and maintain, makes fault-tolerant processing more centralized and efficient, and improves the user experience. The fault-tolerant processing apparatus, device, and computer-readable storage medium in data storage provided by the present application can implement the above method and have the same beneficial effects.
Drawings
In order to illustrate the technical solutions of the prior art and of the embodiments of the present application more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort, and such drawings also fall within the protection scope of the present application.
FIG. 1 is a flow chart of a fault tolerance processing method in a data storage according to the present application;
FIG. 2 is a block diagram of a fault tolerant processing apparatus in a data storage according to the present application;
FIG. 3 is a block diagram of another fault tolerant processing apparatus in a data storage according to the present application.
Detailed Description
The core of the application is to provide a fault-tolerant processing method, a device, equipment and a computer readable storage medium in data storage, so as to effectively reduce the coupling degree of a fault-tolerant processing mechanism and a storage main process, thereby improving the operating efficiency of storage services and simplifying a storage system.
In order to more clearly and completely describe the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
An embodiment of the present application discloses a fault-tolerant processing method in data storage which, as shown in FIG. 1, mainly includes the following steps:
S1: Call a fault-tolerant processing process when the storage main process runs, wherein the fault-tolerant processing process and the storage main process run relatively independently and in parallel.
Specifically, unlike the prior art, the present application extracts the fault-tolerant processing from the prior-art storage main process to form a fault-tolerant processing process that is relatively independent of the storage main process and can run in parallel with it; this process monitors and handles system errors. The storage main process, with the fault-tolerant processing removed, executes only the main data storage service: after the host upper-layer control module generates command data, only basic communication operations such as protocol conversion or parsing are needed before the command is handed to the disk bottom-layer control module for execution, without the detection, judgment, and waiting steps of fault-tolerant processing, so the whole data storage flow completes quickly. The fault-tolerant processing process runs concurrently with the storage main process, and because the two are separate processes, the fault-tolerant processing process does not occupy the resources of the storage main process.
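Step S1 can be illustrated with a minimal sketch in which the main storage path starts a separate operating-system process for fault handling and then continues without any fault checks of its own. Everything named here is an assumption made for illustration: the function `run_storage_main`, the child script `FAULT_TOLERANT_PROCESS`, the use of a pipe as the monitoring channel, and `str.upper()` as a stand-in for protocol conversion are not taken from the patent.

```python
import subprocess
import sys

# Hypothetical body of the fault-tolerant process: it reads one fault
# record per line from stdin and reports each record as handled.
FAULT_TOLERANT_PROCESS = r"""
import sys
for line in sys.stdin:
    fault = line.strip()
    if fault:
        print("handled:" + fault, flush=True)
"""


def run_storage_main(commands, simulated_faults):
    """Run the main storage path next to a separate fault-tolerant process."""
    # S1: start the fault-tolerant process. It is a separate OS process, so
    # it runs in parallel and has its own address space.
    ft = subprocess.Popen(
        [sys.executable, "-c", FAULT_TOLERANT_PROCESS],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Simulated monitoring channel (assumption): a real system would let the
    # fault-tolerant process observe faults itself.
    for fault in simulated_faults:
        ft.stdin.write(fault + "\n")
    ft.stdin.flush()

    # Main path: "protocol conversion" and hand-off only, no fault checks.
    completed = [cmd.upper() for cmd in commands]

    # Close the channel and collect what the fault-tolerant process reported.
    out, _ = ft.communicate()
    return completed, out.split()
```

Calling `run_storage_main(["write-a", "write-b"], ["disk0-removed"])` returns the converted commands together with the list of faults the child process reported as handled; the main loop itself never inspects a fault.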
S2: Monitor IO fault data and command fault data generated during the running of the storage main process.
As mentioned above, most storage systems separate the data (IO) flow from the control flow, so the system errors targeted by the fault-tolerant processing mechanism generally fall into two types: IO-flow errors and control-flow errors. An IO-flow error is a fault at the bottom IO read/write layer, such as a hard disk being pulled out or a disk drive failure; a control-flow error is a problem with the command data issued by an upper-layer component such as the CPU, or with the state of that command data.
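The two error classes described above can be sketched as a small classification routine. The event names in `IO_EVENTS` and `CONTROL_EVENTS` are hypothetical labels chosen to mirror the examples in the text (disk removal, drive failure, bad command data or state); the patent does not define concrete identifiers.

```python
from enum import Enum


class FaultClass(Enum):
    IO_FLOW = "io"            # faults at the bottom IO read/write layer
    CONTROL_FLOW = "control"  # faults in command data or its state


# Hypothetical event names, used only for illustration.
IO_EVENTS = {"disk_removed", "disk_drive_fault"}
CONTROL_EVENTS = {"bad_command_data", "bad_command_state"}


def classify(event: str) -> FaultClass:
    """Map a raw event name to one of the two fault classes."""
    if event in IO_EVENTS:
        return FaultClass.IO_FLOW
    if event in CONTROL_EVENTS:
        return FaultClass.CONTROL_FLOW
    raise ValueError(f"unknown fault event: {event}")
```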
S3: When IO fault data or command fault data is detected, search a preset fault handling scheme library for a corresponding fault handling scheme.
S4: Execute the fault handling scheme.
Specifically, a fault handling scheme library is set up in advance and stores fault handling schemes for the various kinds of faults. Once the fault-tolerant processing process detects IO fault data or command fault data, it looks the fault up in the preset fault handling scheme library, then determines and executes the corresponding fault handling scheme, so that system faults are resolved promptly and the storage system and storage service keep operating normally.
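Steps S3 and S4 amount to a lookup followed by a dispatch. The sketch below models the preset scheme library as a dictionary from fault kind to handler; the fault kinds, the handler names `resend_command` and `mark_disk_offline`, and the dictionary-based fault record are all hypothetical and chosen only to show the pattern.

```python
from typing import Callable, Dict, Optional


def resend_command(fault: dict) -> str:
    # Hypothetical handler for a command-level fault.
    return f"resend command {fault['target']}"


def mark_disk_offline(fault: dict) -> str:
    # Hypothetical handler for an IO-level fault.
    return f"mark disk {fault['target']} offline"


# Preset scheme library: fault kind -> handler (illustrative entries only).
SCHEME_LIBRARY: Dict[str, Callable[[dict], str]] = {
    "command_timeout": resend_command,
    "disk_removed": mark_disk_offline,
}


def find_scheme(fault: dict) -> Optional[Callable[[dict], str]]:
    """S3: look up the scheme that matches the detected fault."""
    return SCHEME_LIBRARY.get(fault["kind"])


def handle(fault: dict) -> str:
    """S3 followed by S4: find the scheme, then execute it."""
    scheme = find_scheme(fault)
    if scheme is None:
        return f"no scheme for {fault['kind']}"
    return scheme(fault)
```

Keeping the library as data rather than branching logic is one way to make new schemes addable without touching the dispatch code.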
As a preferred embodiment, the fault handling scheme may include, without limitation, any one or any combination of the following: command data retransmission, command data state modification, and generation of correction data for the disk bottom-layer control module.
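The phrase "any one or any combination" can be read as a scheme being an ordered list of elementary actions. The following sketch assumes that reading; the `Command` record, its fields, and the three action functions are hypothetical stand-ins for retransmission, state modification, and correction data generation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Command:
    cmd_id: int
    state: str = "failed"
    sends: int = 1
    corrections: List[str] = field(default_factory=list)


def retransmit(cmd: Command) -> None:
    # Command data retransmission: count one more send.
    cmd.sends += 1


def modify_state(cmd: Command) -> None:
    # Command data state modification: mark the command as pending again.
    cmd.state = "pending"


def generate_correction(cmd: Command) -> None:
    # Correction data aimed at the disk bottom-layer control module.
    cmd.corrections.append(f"reset-{cmd.cmd_id}")


def apply_scheme(cmd: Command, actions: List[Callable[[Command], None]]) -> Command:
    """Apply a scheme expressed as an ordered combination of actions."""
    for action in actions:
        action(cmd)
    return cmd
```

For example, `apply_scheme(Command(3), [modify_state, retransmit])` combines two of the three actions and leaves the correction list empty.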
Of course, those skilled in the art may also select and configure other fault handling schemes according to the actual application, which is not limited in the present application. It should also be added that, because the fault-tolerant processing process provided by the present application is dedicated to fault-tolerant processing, the processing as a whole is more centralized and systematic, and the repeated multi-layer processing that occurs within the storage main process in the prior art is avoided. The fault handling schemes stored in the preset fault handling scheme library may be simplified, common schemes designed after summarizing, classifying, and compiling statistics on fault problems, which further facilitates upgrading the fault handling system and improves its fault handling capability.
In the fault-tolerant processing method in data storage provided by the embodiments of the present application, a fault-tolerant processing process is called when the storage main process runs, and the two processes run relatively independently and in parallel; IO fault data and command fault data generated during the running of the storage main process are monitored; when IO fault data or command fault data is detected, a preset fault handling scheme library is searched for a corresponding fault handling scheme; and the fault handling scheme is executed. By extracting the fault-tolerant processing from the prior-art storage main process into a separate process that runs in parallel, the method effectively reduces the coupling between the fault-tolerant processing mechanism and the storage main process, simplifies the data processing flow of the storage main process, and reduces the size of the storage system's program files. This improves the operating efficiency of the data storage service, makes the storage system easier to upgrade and maintain, makes fault-tolerant processing more centralized and efficient, and improves the user experience.
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing method in data storage provided by the present application, the fault-tolerant processing process is deployed locally or in the cloud.
Specifically, the fault-tolerant processing process provided by the present application may be deployed on the local storage device or in the cloud; in the latter case, those skilled in the art may implement remote invocation by means of an RPC (Remote Procedure Call) service.
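The patent does not say which RPC technology is used. As one possible illustration, the sketch below uses XML-RPC from the Python standard library: a fault-handling service listens on the loopback interface and a caller invokes it by name. The function names `handle_fault`, `start_fault_service`, and `request_fault_handling`, as well as the returned string format, are assumptions for this example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def handle_fault(kind):
    # Hypothetical remote fault handler; returns the name of a scheme.
    return f"scheme-for-{kind}"


def start_fault_service():
    """Start an XML-RPC fault service on a free loopback port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(handle_fault, "handle_fault")
    # Serve in a background thread so the caller is not blocked.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_fault_handling(port, kind):
    """Invoke the remote handler the way a storage node might."""
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.handle_fault(kind)
```

In a cloud deployment the address would point at the remote service instead of `127.0.0.1`; the port chosen by the operating system is available as `server.server_address[1]`.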
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing method in data storage provided by the present application, monitoring IO fault data and command fault data generated during the running of the storage main process includes:
receiving IO fault data through a preset IO interrupt interface, and periodically searching a preset shared storage area for command fault data; the preset shared storage area stores the command data sent by the host upper-layer control module to the disk bottom-layer control module during the running of the storage main process.
Specifically, when a problem in the bottom-layer IO read/write process of the storage device produces IO fault data, an interrupt is triggered through the preset IO interrupt interface, and the corresponding fault handling scheme is looked up in the preset fault handling scheme library and executed.
For control-flow errors, on the other hand, a shared storage area can be set up in advance, and the command data generated by the host upper-layer control module is stored there so that the fault-tolerant processing process can query it periodically. By checking the command data in the preset shared storage area at regular intervals, faulty command data, i.e., the command fault data, can be detected; the corresponding fault handling scheme is then likewise looked up in the preset fault handling scheme library and executed.
As a preferred embodiment, the command data is stored in the preset shared storage area in the form of a queue.
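The two detection paths described above, push-style IO interrupts and periodic polling of queued command data, can be sketched as follows. This is a simplified single-process model: a `collections.deque` stands in for the preset shared storage area, `on_io_interrupt` stands in for a callback on the IO interrupt interface, and the `state` and `seen` fields of each command record are hypothetical.

```python
from collections import deque


class FaultMonitor:
    def __init__(self, shared_queue):
        # Stand-in for the preset shared storage area holding command data.
        self.shared_queue = shared_queue
        self.detected = []

    def on_io_interrupt(self, io_fault):
        """Callback for the (hypothetical) IO interrupt interface."""
        self.detected.append(("io", io_fault))

    def poll_once(self):
        """One periodic scan of the queued command data for faulty entries."""
        for cmd in self.shared_queue:
            if cmd["state"] == "error" and not cmd.get("seen"):
                cmd["seen"] = True  # avoid reporting the same command twice
                self.detected.append(("command", cmd["id"]))
```

A timer or scheduler would call `poll_once()` at the chosen interval; IO faults arrive whenever the interrupt fires, independently of the polling cycle.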
On the basis of the above, as a preferred embodiment, the fault-tolerant processing method in data storage provided by the present application further includes, after executing the fault handling scheme:
generating and storing fault-tolerant processing log data.
Specifically, after a fault occurs, fault-tolerant processing log data can be generated to record the details of the fault for subsequent analysis by the relevant technicians.
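Generating and storing log data can be as simple as writing one structured record per handled fault. The patent does not specify a log format; the JSON-lines layout, the field names, and the helper functions `make_log_record` and `store_log` below are assumptions for illustration.

```python
import io
import json
import time


def make_log_record(fault, scheme, outcome, timestamp=None):
    """Build one log line describing a handled fault (fields are illustrative)."""
    record = {
        "time": time.time() if timestamp is None else timestamp,
        "fault": fault,
        "scheme": scheme,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


def store_log(stream, line):
    """Append one record to any writable text stream, such as an open file."""
    stream.write(line + "\n")


# Usage: an in-memory stream stands in for a log file on disk.
log_file = io.StringIO()
store_log(
    log_file,
    make_log_record("disk_removed", "state_modification", "resolved", timestamp=0),
)
```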
The following describes a fault-tolerant processing apparatus in data storage provided by the present application.
Referring to FIG. 2, FIG. 2 is a block diagram of a fault-tolerant processing apparatus in data storage provided by the present application, including:
a calling unit 1, configured to call a fault-tolerant processing process when the storage main process runs, wherein the fault-tolerant processing process and the storage main process run independently and in parallel;
a monitoring unit 2, configured to monitor IO fault data and command fault data generated during the running of the storage main process;
a searching unit 3, configured to search a preset fault handling scheme library for a corresponding fault handling scheme when the monitoring unit 2 detects IO fault data or command fault data;
a processing unit 4, configured to execute the fault handling scheme.
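One way to picture how the units cooperate is a small class that wires a monitoring callable and a searching callable together and executes whatever scheme is found. This is only a structural sketch: the class name `FaultTolerantApparatus`, the use of plain callables for units 2 and 3, and the `run_once` method for unit 4 are assumptions, and the calling unit 1 is represented simply by whatever code constructs the object and invokes `run_once`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

Scheme = Callable[[str], str]


@dataclass
class FaultTolerantApparatus:
    monitor: Callable[[], Iterable[str]]        # role of monitoring unit 2
    lookup: Callable[[str], Optional[Scheme]]   # role of searching unit 3

    def run_once(self) -> List[str]:
        """Role of processing unit 4: execute the scheme found for each fault."""
        results = []
        for fault in self.monitor():
            scheme = self.lookup(fault)
            if scheme is not None:
                results.append(scheme(fault))
        return results
```

Because the units are injected, each can be replaced or tested on its own, which matches the decoupling goal stated in the text.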
Thus, by extracting the fault-tolerant processing from the prior-art storage main process into a separate process that can run in parallel, the fault-tolerant processing apparatus in data storage provided by the present application effectively reduces the coupling between the fault-tolerant processing mechanism and the storage main process, simplifies the data processing flow of the storage main process, and reduces the size of the storage system's program files. This improves the operating efficiency of the data storage service, makes the storage system easier to upgrade and maintain, makes fault-tolerant processing more centralized and efficient, and improves the user experience.
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing apparatus in data storage provided by the present application, the fault-tolerant processing process is deployed locally or in the cloud.
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing apparatus in data storage provided by the present application, the monitoring unit 2 is specifically configured to: receive IO fault data through a preset IO interrupt interface, and periodically search a preset shared storage area for command fault data; the preset shared storage area stores the command data sent by the host upper-layer control module to the disk bottom-layer control module during the running of the storage main process.
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing apparatus in data storage provided by the present application, the command data is stored in the preset shared storage area in the form of a queue.
On the basis of the above, as a preferred embodiment, in the fault-tolerant processing apparatus in data storage provided by the present application, the fault handling scheme includes any one or any combination of the following: command data retransmission, command data state modification, and generation of correction data for the disk bottom-layer control module.
Referring to FIG. 3, FIG. 3 is a block diagram of another fault-tolerant processing apparatus in data storage provided by the present application. On the basis of the above, as a preferred embodiment, the fault-tolerant processing apparatus in data storage provided by the present application further includes:
a log generating unit 5, configured to generate and store fault-tolerant processing log data after the processing unit 4 executes the fault handling scheme.
Further, the present application also discloses a fault-tolerant processing device in data storage, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of any of the fault-tolerant processing methods in data storage described above.
Further, the present application also discloses a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the fault-tolerant processing methods in data storage described above.
For specific details of the fault-tolerant processing apparatus, device, and computer-readable storage medium in data storage provided by the present application, refer to the corresponding description of the fault-tolerant processing method in data storage above; they are not repeated here.
The embodiments in this application are described in a progressive manner; each embodiment focuses on its differences from the others, and identical or similar parts can be cross-referenced among the embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
It is further noted that, throughout this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The technical solutions provided by the present application are described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that, for those skilled in the art, without departing from the principle of the present application, several improvements and modifications can be made to the present application, and these improvements and modifications also fall into the protection scope of the present application.

Claims (10)

1. A fault-tolerant processing method in data storage is characterized by comprising the following steps:
calling a fault-tolerant processing process when a storage main process runs, wherein the fault-tolerant processing process and the storage main process run relatively independently and in parallel;
monitoring IO fault data and command fault data generated during the running of the storage main process;
when the IO fault data or the command fault data is detected, searching a preset fault handling scheme library for a corresponding fault handling scheme;
executing the fault handling scheme.
2. The fault-tolerant processing method according to claim 1, wherein the fault-tolerant processing process is deployed locally or in a cloud.
3. The fault-tolerant processing method according to claim 2, wherein the monitoring of IO fault data and command fault data generated during the running of the storage main process comprises:
receiving the IO fault data through a preset IO interrupt interface, and periodically searching a preset shared storage area for the command fault data; the preset shared storage area stores the command data sent by the host upper-layer control module to the disk bottom-layer control module during the running of the storage main process.
4. The fault-tolerant processing method according to claim 3, wherein the command data is stored in the preset shared storage area in the form of a queue.
5. The fault-tolerant processing method according to claim 4, wherein the fault handling scheme comprises any one or any combination of the following:
command data retransmission, command data state modification, and generation of correction data for the disk bottom-layer control module.
6. The fault-tolerant processing method according to any one of claims 1 to 5, further comprising, after the executing the fault handling scheme:
generating and storing fault-tolerant processing log data.
7. A fault-tolerant processing apparatus in data storage, comprising:
a calling unit, configured to call a fault-tolerant processing process when the storage main process runs, wherein the fault-tolerant processing process and the storage main process run independently and in parallel;
a monitoring unit, configured to monitor IO fault data and command fault data generated during the running of the storage main process;
a searching unit, configured to search a preset fault handling scheme library for a corresponding fault handling scheme when the monitoring unit detects the IO fault data or the command fault data;
a processing unit, configured to execute the fault handling scheme.
8. The fault-tolerant processing apparatus according to claim 7, further comprising:
a log generating unit, configured to generate and store fault-tolerant processing log data after the processing unit executes the fault handling scheme.
9. A fault-tolerant processing device in data storage, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the fault-tolerant processing method in data storage according to any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the fault-tolerant processing method in data storage according to any one of claims 1 to 6.
CN201811528482.2A 2018-12-13 2018-12-13 Fault-tolerant processing method, device, equipment and storage medium in data storage Active CN109634769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811528482.2A CN109634769B (en) 2018-12-13 2018-12-13 Fault-tolerant processing method, device, equipment and storage medium in data storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811528482.2A CN109634769B (en) 2018-12-13 2018-12-13 Fault-tolerant processing method, device, equipment and storage medium in data storage

Publications (2)

Publication Number Publication Date
CN109634769A (en) 2019-04-16
CN109634769B (en) 2021-11-09

Family

ID=66073840

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201811528482.2A Active CN109634769B (en) 2018-12-13 2018-12-13 Fault-tolerant processing method, device, equipment and storage medium in data storage

Country Status (1)

Country Link
CN (1) CN109634769B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0590866A2 (en) * 1992-09-30 1994-04-06 AT&T Corp. Apparatus and methods for fault-tolerant computing
CN101777020A (en) * 2009-12-25 2010-07-14 北京讯鸟软件有限公司 Fault tolerance method and system used for distributed program
CN103593251A (en) * 2013-11-07 2014-02-19 浪潮电子信息产业股份有限公司 Fault-tolerant system based on process redundancy and design method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0590866A2 (en) * 1992-09-30 1994-04-06 AT&T Corp. Apparatus and methods for fault-tolerant computing
US5748882A (en) * 1992-09-30 1998-05-05 Lucent Technologies Inc. Apparatus and method for fault-tolerant computing
CN101777020A (en) * 2009-12-25 2010-07-14 北京讯鸟软件有限公司 Fault tolerance method and system used for distributed program
CN103593251A (en) * 2013-11-07 2014-02-19 浪潮电子信息产业股份有限公司 Fault-tolerant system based on process redundancy and design method thereof

Also Published As

Publication number Publication date
CN109634769A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
US20210004262A1 (en) Managed orchestration of virtual machine instance migration
CN103152419B (en) High availability cluster management method of cloud computing platform
US20190007290A1 (en) Automatic recovery engine with continuous recovery state machine and remote workflows
US10545807B2 (en) Method and system for acquiring parameter sets at a preset time interval and matching parameters to obtain a fault scenario type
US20130073894A1 (en) Techniques for achieving high availability with multi-tenant storage when a partial fault occurs or when more than two complete faults occur
CN111314125A (en) System and method for fault tolerant communication
US8898520B1 (en) Method of assessing restart approach to minimize recovery time
US10924326B2 (en) Method and system for clustered real-time correlation of trace data fragments describing distributed transaction executions
CN108804215B (en) Task processing method and device and electronic equipment
CN106789141B (en) Gateway equipment fault processing method and device
EP3591485B1 (en) Method and device for monitoring for equipment failure
JP2005346331A (en) Failure recovery apparatus, method for restoring fault, manager apparatus, and program
JP2014067089A (en) Distributed system, server computer, distributed management server and failure occurrence prevention method
US10228969B1 (en) Optimistic locking in virtual machine instance migration
CN109286529A (en) Method and system for restoring RabbitMQ network partition
CN111880906A (en) Virtual machine high-availability management method, system and storage medium
US7373542B2 (en) Automatic startup of a cluster system after occurrence of a recoverable error
CN111865695A (en) Method and system for automatic fault handling in cloud environment
CN112988433A (en) Method, apparatus and computer program product for fault management
US20180188713A1 (en) Method and Apparatus for Automatically Maintaining Very Large Scale of Machines
US10042670B2 (en) Providing automatic retry of transactions with diagnostics
CN113573344A (en) SMF session detection method based on 5G and terminal
CN109634769B (en) Fault-tolerant processing method, device, equipment and storage medium in data storage
CN102221995A (en) Break restoration method of seismic data processing work
US8090997B2 (en) Run-time fault resolution from development-time fault and fault resolution path identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant