CN111309524A - Distributed storage system fault recovery method, device, terminal and storage medium - Google Patents

Distributed storage system fault recovery method, device, terminal and storage medium

Info

Publication number
CN111309524A
Authority
CN
China
Prior art keywords
cluster
state
osd
group
deleting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010095163.8A
Other languages
Chinese (zh)
Inventor
任洪亮
李景要
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010095163.8A
Publication of CN111309524A
Legal status: Withdrawn (current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1479Generic software techniques for error detection or fault masking
    • G06F11/1482Generic software techniques for error detection or fault masking by means of middleware or OS functionality

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The embodiment of the application provides a distributed storage system fault recovery method, device, terminal and storage medium, and the method comprises the following steps: finding the placement group in which the lost object is located; deleting the placement group; deleting the osd process data corresponding to the placement group and marking the osd process as complete; and checking the cluster state and verifying the cluster fault recovery result according to the cluster state. For a cluster whose failure has exceeded the fault domain, the invention enables the cluster to continue providing external services without damaging the existing cluster, and under certain conditions preserves the cyclic overwrite writing and the playback scenarios required for video.

Description

Distributed storage system fault recovery method, device, terminal and storage medium
Technical Field
The invention relates to the technical field of storage, and in particular to a distributed storage system fault recovery method, device, terminal and storage medium.
Background
The distributed storage system is a reliable, autonomous distributed object store that can simultaneously provide file system storage, object storage and block storage. Within the fault domain, when a node or a hard disk fails, for example because of downtime or power loss, its services can be taken over by other standby nodes, so that services continue to be provided normally, the business is not affected, and the storage remains available.
However, if a disk hardware fault in the cluster cannot be repaired, and another node suffers an unrecoverable hard disk fault while the failed disk is being replaced, the cluster exceeds its fault domain and partial data loss occurs. When the front-end service reads or modifies a lost object, the osd either becomes abnormal or returns EIO, and the front-end service is interrupted. In the video scenario, a large number of video files are created in the cluster according to service requirements and are continuously written in a cyclic overwrite manner, so video playback cannot be performed while the osd is abnormal.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a distributed storage system fault recovery method, device, terminal and storage medium, so as to solve the above technical problems.
In a first aspect, an embodiment of the present application provides a distributed storage system failure recovery method, where the method includes:
finding the placement group in which the lost object is located;
deleting the placement group;
deleting the osd process data corresponding to the placement group and marking the osd process as complete;
and checking the cluster state, and verifying the cluster fault recovery result according to the cluster state.
Further, the deleting of the placement group includes:
stopping the osd service;
calling the icfs-objectstore-tool to delete the placement group;
reloading the system configuration file;
and starting the osd service.
Further, the deleting of the osd process data corresponding to the placement group and the marking of the osd process as complete include:
stopping the osd service;
calling the icfs-objectstore-tool to delete the data of the osd process corresponding to the placement group;
marking the corresponding osd process as complete;
reloading the system configuration file;
and starting the osd service.
Further, the checking of the cluster state and the verifying of the cluster fault recovery result according to the cluster state include:
calling the icfs -s tool to check the cluster state;
judging whether the cluster state is the active+clean state:
if so, determining that the cluster fault recovery succeeded;
if not, determining that the cluster fault recovery failed.
In a second aspect, an embodiment of the present application provides a distributed storage system failure recovery apparatus, where the apparatus includes:
a group searching unit, configured to find the placement group in which the lost object is located;
a group deleting unit, configured to delete the placement group;
a process deleting unit, configured to delete the osd process data corresponding to the placement group and mark the osd process as complete;
and a state verification unit, configured to check the cluster state and verify the cluster fault recovery result according to the cluster state.
Further, the group deleting unit includes:
a first stopping module configured to stop the osd service;
the group deleting module is configured to call an icfs-objectstore-tool to delete the placement group;
the first loading module is configured to reload the system configuration file;
a first initiation module configured to initiate the osd service.
Further, the process deleting unit includes:
a second stopping module configured to stop the osd service;
the process deleting module is configured to call the icfs-objectstore-tool to delete the data in the osd process corresponding to the placement group;
a process marking module configured to mark the corresponding osd process as a completed state;
a second loading module configured to reload the system configuration file;
a second starting module configured to start the osd service.
Further, the state verifying unit includes:
a state viewing module configured to call the icfs -s tool to view the cluster state;
a state judging module configured to judge whether the cluster state is the active+clean state;
a success determining module configured to determine that the cluster fault recovery succeeded if the cluster state is the active+clean state;
and a failure determining module configured to determine that the cluster fault recovery failed if the cluster state is not the active+clean state.
In a third aspect, a terminal is provided, including:
a processor, a memory, wherein,
the memory is used for storing a computer program which,
the processor is configured to call and run the computer program from the memory, so that the terminal performs the method described above.
In a fourth aspect, a computer storage medium is provided having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
The beneficial effects of the invention are as follows:
according to the distributed storage system fault recovery method, device, terminal and storage medium provided by the invention, the placement group in which the lost object is located is first located, the placement group and the osd process data of the placement group are deleted, the osd process is marked as complete, and the cluster state is then checked to verify the fault recovery result. For a cluster whose failure has exceeded the fault domain, the invention enables the cluster to continue providing external services without damaging the existing cluster, and under certain conditions preserves the cyclic overwrite writing and the playback scenarios required for video.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It will be obvious to those skilled in the art that other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flow chart diagram of a method of one embodiment of the present application.
FIG. 2 is a schematic block diagram of an apparatus of one embodiment of the present application.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The following explains key terms appearing in the present application.
Placement Group (PG) and Placement Group for Placement purpose (PGP)
The PG count specifies the number of directories into which the objects of a storage pool are placed;
the PGP count specifies the number of OSD distribution combinations for the PGs of the storage pool.
BlueStore object storage engine
An object storage engine built directly on raw disk devices, with extensive optimization for new storage devices such as SSDs.
OSD (Object Storage Device)
The process responsible for responding to client requests and returning the requested data; each osd corresponds to one hard disk in the cluster.
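To make the PG/PGP distinction concrete, the following is a minimal shell sketch of setting both counts when creating a pool. It assumes the icfs command line mirrors the upstream ceph commands it is derived from; the pool name and the counts are hypothetical examples, not values prescribed by this application.

    # Create a pool with 128 placement groups (pg_num) and 128 placement
    # combinations (pgp_num): objects are hashed into the PGs, and the PGP
    # count controls how many OSD placement combinations the PGs are spread across.
    icfs osd pool create video_pool 128 128
    # Inspect the effective values.
    icfs osd pool get video_pool pg_num
    icfs osd pool get video_pool pgp_num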
FIG. 1 is a schematic flow chart diagram of a method of one embodiment of the present application. The method of fig. 1 may be executed by a distributed storage system fault recovery apparatus.
As shown in fig. 1, the method 100 includes:
step 110, finding the placement group in which the lost object is located;
step 120, deleting the placement group;
step 130, deleting the osd process data corresponding to the placement group and marking the osd process as complete;
step 140, checking the cluster state, and verifying the cluster fault recovery result according to the cluster state.
To facilitate understanding of the present invention, the distributed storage system fault recovery method is further described below with reference to its underlying principle and to the process of performing fault recovery on a distributed storage system in an embodiment.
Specifically, the distributed storage system fault recovery method includes:
and S1, searching the placement group of the lost object.
Calling a tool icfs health detail | grep underfound # to search a pg group with object loss, and acquiring an osd number in the group.
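A minimal sketch of this step in script form is shown below. It assumes the icfs command line behaves like the upstream ceph CLI it is based on (in particular that icfs pg <pgid> query exists and reports the acting set); the grep/awk parsing and the loop variable are illustrative assumptions rather than part of the claimed method.

    #!/bin/bash
    # List the placement groups that report unfound (lost) objects.
    icfs health detail | grep -i unfound

    # For each such pg id (second field of lines such as "pg 1.2f has 3 unfound objects"),
    # query the pg to see which osd numbers currently host it.
    for pg in $(icfs health detail | grep -i unfound | awk '/^ *pg /{print $2}'); do
        echo "pg ${pg} is hosted by:"
        icfs pg "${pg}" query | grep -A 4 '"acting"'   # acting set = osd ids used in S2/S3
    done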
S2, deleting the placement group.
The osd service is stopped first by calling the command systemctl stop icfs-osd@$i.
The icfs-objectstore-tool is then called to delete the pg found in step S1. The icfs-objectstore-tool is a tool provided with ceph for operating on objects and pgs; it can be used to view, modify and delete the data of objects and pgs on the ObjectStore, and to dump the OSD journal. The call command of the icfs-objectstore-tool is: icfs-objectstore-tool --data-path /var/lib/icfs/osd/icfs-$i --journal-path /var/lib/icfs/osd/icfs-$i/journal --type filestore --op remove --pgid ${pg_num}s$num. When the cluster fault exceeds the fault domain, the data cannot be recovered by rollback; if the cluster is still required to provide external services, the lost pg must be deleted so that the failed OSD can come back up quickly.
The systemctl daemon-reload command is called to reload the configuration files of the distributed file system, so that the information of the deleted placement group is removed from the reloaded configuration.
The systemctl start icfs-osd@$i command is called to start the osd service.
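An end-to-end sketch of this step for a single osd and pg is given below. The variable values are hypothetical examples, and the option names assume that icfs-objectstore-tool accepts the same options as the upstream ceph-objectstore-tool with a filestore backend.

    #!/bin/bash
    i=3                # osd number obtained in step S1 (example value)
    pgid="1.2fs0"      # pg id of the lost placement group, with shard suffix (example value)

    systemctl stop icfs-osd@$i          # the tool must not run against a live osd

    icfs-objectstore-tool \
        --data-path /var/lib/icfs/osd/icfs-$i \
        --journal-path /var/lib/icfs/osd/icfs-$i/journal \
        --type filestore \
        --op remove --pgid "$pgid"      # delete the lost pg from this osd's object store

    systemctl daemon-reload             # reload configuration after the change
    systemctl start icfs-osd@$i         # bring the osd back online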
S3, deleting the osd process data corresponding to the placement group and marking the osd process as complete.
First, the systemctl stop icfs-osd@$i command is called to stop the osd service.
The icfs-objectstore-tool is called to delete the data in the OSD journal corresponding to the placement group deleted in step S2. The specific call is: icfs-objectstore-tool --data-path /var/lib/icfs/osd/icfs-$i --journal-path /var/lib/icfs/osd/icfs-$i/journal --type filestore --op rm-past-intervals --pgid ${pg_num}s$num.
The osd process is then marked as complete; after the marking, BlueStore considers that the pg can be listed as authoritative data. The specific marking command is: icfs-objectstore-tool --data-path /var/lib/icfs/osd/icfs-$i --journal-path /var/lib/icfs/osd/icfs-$i/journal --type filestore --op mark-complete --pgid ${pg_num}s$num.
The systemctl daemon-reload command is called to reload the configuration files, and the process information of the deleted placement group is updated in the reloaded configuration so that the stale process data does not interfere with the distributed file system.
The systemctl start icfs-osd@$i command is called to start the osd service.
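A sketch of this step on the same osd is shown below, again assuming icfs-objectstore-tool mirrors the ceph-objectstore-tool option set; the variable values are illustrative.

    #!/bin/bash
    i=3
    pgid="1.2fs0"
    OPTS="--data-path /var/lib/icfs/osd/icfs-$i --journal-path /var/lib/icfs/osd/icfs-$i/journal --type filestore"

    systemctl stop icfs-osd@$i

    # Drop the stale past-interval records kept by this osd for the deleted pg.
    icfs-objectstore-tool $OPTS --op rm-past-intervals --pgid "$pgid"

    # Mark the pg complete on this osd so its (now empty) copy is treated as authoritative.
    icfs-objectstore-tool $OPTS --op mark-complete --pgid "$pgid"

    systemctl daemon-reload
    systemctl start icfs-osd@$i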
S4, checking the cluster state, and verifying the cluster fault recovery result according to the cluster state.
The icfs -s command is invoked to view the cluster state.
The cluster state mainly includes the following categories:
HEALTH_WARN indicates the cluster warning state; the normal state is HEALTH_OK.
Several pg states are explained below:
waiting: synchronizing state; the PG is performing synchronization.
degraded: degraded state; after peering completes, the PG finds that a PG instance contains objects that are inconsistent (and need to be synchronized or repaired), or the current Acting Set is smaller than the storage pool replica count.
recovering: recovering state; the cluster is migrating or synchronizing objects and their replicas.
recovery_wait: waiting for the reservation of recovery resources.
stuck unclean: non-clean state; the PG cannot recover from the last failure.
active: active state; the PG can normally handle read and write requests from clients.
The normal PG state is 100% active+clean, which means that all PGs are accessible and every PG has all of its replicas available. Therefore, after the cluster state is obtained, it is checked whether the state is 100% active+clean: if so, the cluster fault recovery succeeded; if not, the cluster fault recovery failed.
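The check can be scripted as in the hedged sketch below. It assumes icfs -s prints a pg summary in the same form as ceph -s (for example "256 pgs" and "256 active+clean"); the parsing is an illustrative assumption.

    #!/bin/bash
    # Recovery is considered successful only when every pg is active+clean.
    status=$(icfs -s)
    total_pgs=$(echo "$status" | grep -oE '[0-9]+ pgs' | head -n1 | awk '{print $1}')
    clean_pgs=$(echo "$status" | grep -oE '[0-9]+ active\+clean' | head -n1 | awk '{print $1}')

    if [ -n "$total_pgs" ] && [ "$total_pgs" = "$clean_pgs" ]; then
        echo "cluster fault recovery succeeded (100% active+clean)"
    else
        echo "cluster fault recovery failed: ${clean_pgs:-0}/${total_pgs:-?} pgs active+clean"
        exit 1
    fi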
As shown in fig. 2, the apparatus 200 includes:
a group search unit 210 configured to search a placement group in which the lost object is located;
a group deleting unit 220 configured to delete the placement group;
a process deleting unit 230 configured to delete the osd process data corresponding to the placement group and mark the osd process as completed;
and the state verification unit 240 is configured to check the cluster state and verify a cluster failure recovery result according to the cluster state.
Optionally, as an embodiment of the present application, the group deleting unit includes:
a first stopping module configured to stop the osd service;
the group deleting module is configured to call an icfs-objectstore-tool to delete the placement group;
the first loading module is configured to reload the system configuration file;
a first initiation module configured to initiate the osd service.
Optionally, as an embodiment of the present application, the process deleting unit includes:
a second stopping module configured to stop the osd service;
the process deleting module is configured to call the icfs-objectstore-tool to delete the data in the osd process corresponding to the placement group;
a process marking module configured to mark the corresponding osd process as a completed state;
a second loading module configured to reload the system configuration file;
a second starting module configured to start the osd service.
Optionally, as an embodiment of the present application, the state verification unit includes:
a state viewing module configured to call the icfs -s tool to view the cluster state;
a state judging module configured to judge whether the cluster state is the active+clean state;
a success determining module configured to determine that the cluster fault recovery succeeded if the cluster state is the active+clean state;
and a failure determining module configured to determine that the cluster fault recovery failed if the cluster state is not the active+clean state.
Fig. 3 is a schematic structural diagram of a terminal device 300 according to an embodiment of the present invention, where the terminal device 300 may be used to execute the method for recovering from a failure in a distributed storage system according to the embodiment of the present application.
The terminal device 300 may include a processor 310, a memory 320 and a communication unit 330. These components communicate via one or more buses. Those skilled in the art will appreciate that the structure shown in the figure does not limit the application: it may be a bus structure or a star structure, and the terminal may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
The memory 320 may be used to store instructions executed by the processor 310. The memory 320 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk or an optical disk. When the executable instructions in the memory 320 are executed by the processor 310, the terminal 300 is enabled to perform some or all of the steps of the method embodiments of the present application.
The processor 310 is the control center of the storage terminal. It connects the various parts of the entire electronic terminal using various interfaces and lines, and performs the various functions of the electronic terminal and/or processes data by running or executing the software programs and/or modules stored in the memory 320 and calling the data stored in the memory. The processor may consist of integrated circuits (ICs), for example a single packaged IC, or several packaged ICs with the same or different functions connected together. For example, the processor 310 may include only a central processing unit (CPU). In the embodiments of the present application, the CPU may have a single computing core or multiple computing cores.
The communication unit 330 is configured to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to other terminals.
The present application also provides a computer storage medium, wherein the computer storage medium may store a program, and the program may include some or all of the steps in the embodiments provided in the present application when executed. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a Random Access Memory (RAM).
Therefore, by locating the placement group in which the lost object resides, deleting that placement group and its osd process data, marking the osd process as complete, and checking the cluster state, the fault recovery result is verified. For a cluster whose failure has exceeded the fault domain, the invention enables the cluster to continue providing external services without damaging the existing cluster, and under certain conditions preserves the cyclic overwrite writing and the playback scenarios required for video.
Those skilled in the art will clearly understand that the techniques in the embodiments of the present application may be implemented by software plus a necessary general hardware platform. Based on this understanding, the technical solutions in the embodiments of the present application may be embodied in the form of a software product. The computer software product is stored in a storage medium, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and includes several instructions for enabling a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the terminal embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the description in the method embodiment.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
Although the present invention has been described in detail with reference to the drawings and the preferred embodiments, the present invention is not limited thereto. Various equivalent modifications or substitutions can be made to the embodiments of the present invention by those skilled in the art without departing from the spirit and scope of the present invention, and any change or substitution that can be easily conceived by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A distributed storage system failure recovery method, the method comprising:
finding the placement group in which the lost object is located;
deleting the placement group;
deleting the osd process data corresponding to the placement group and marking the osd process as complete;
and checking the cluster state, and verifying the cluster fault recovery result according to the cluster state.
2. The method according to claim 1, wherein the deleting of the placement group comprises:
stopping the osd service;
calling the icfs-objectstore-tool to delete the placement group;
reloading the system configuration file;
and starting the osd service.
3. The method according to claim 1, wherein the deleting of the osd process data corresponding to the placement group and the marking of the osd process as complete comprise:
stopping the osd service;
calling the icfs-objectstore-tool to delete the data of the osd process corresponding to the placement group;
marking the corresponding osd process as complete;
reloading the system configuration file;
and starting the osd service.
4. The method according to claim 1, wherein the checking of the cluster state and the verifying of the cluster fault recovery result according to the cluster state comprise:
calling the icfs -s tool to check the cluster state;
judging whether the cluster state is the active+clean state:
if so, determining that the cluster fault recovery succeeded;
if not, determining that the cluster fault recovery failed.
5. A distributed storage system fault recovery system, the system comprising:
a group searching unit, configured to find the placement group in which the lost object is located;
a group deleting unit, configured to delete the placement group;
a process deleting unit, configured to delete the osd process data corresponding to the placement group and mark the osd process as complete;
and a state verification unit, configured to check the cluster state and verify the cluster fault recovery result according to the cluster state.
6. The system of claim 5, wherein the group deletion unit comprises:
a first stopping module configured to stop the osd service;
the group deleting module is configured to call an icfs-objectstore-tool to delete the placement group;
the first loading module is configured to reload the system configuration file;
a first initiation module configured to initiate the osd service.
7. The system of claim 5, wherein the process deletion unit comprises:
a second stopping module configured to stop the osd service;
the process deleting module is configured to call the icfs-objectstore-tool to delete the data in the osd process corresponding to the placement group;
a process marking module configured to mark the corresponding osd process as a completed state;
a second loading module configured to reload the system configuration file;
a second starting module configured to start the osd service.
8. The system of claim 5, wherein the state verification unit comprises:
a state viewing module configured to call the icfs -s tool to view the cluster state;
a state judging module configured to judge whether the cluster state is the active+clean state;
a success determining module configured to determine that the cluster fault recovery succeeded if the cluster state is the active+clean state;
and a failure determining module configured to determine that the cluster fault recovery failed if the cluster state is not the active+clean state.
9. A terminal, comprising:
a processor;
a memory for storing instructions for execution by the processor;
wherein the processor is configured to perform the method of any one of claims 1-4.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
CN202010095163.8A 2020-02-14 2020-02-14 Distributed storage system fault recovery method, device, terminal and storage medium Withdrawn CN111309524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010095163.8A CN111309524A (en) 2020-02-14 2020-02-14 Distributed storage system fault recovery method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010095163.8A CN111309524A (en) 2020-02-14 2020-02-14 Distributed storage system fault recovery method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN111309524A true CN111309524A (en) 2020-06-19

Family

ID=71158499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010095163.8A Withdrawn CN111309524A (en) 2020-02-14 2020-02-14 Distributed storage system fault recovery method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN111309524A (en)


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111813604B (en) * 2020-07-17 2022-06-10 济南浪潮数据技术有限公司 Data recovery method, system and related device of fault storage equipment
CN111813604A (en) * 2020-07-17 2020-10-23 济南浪潮数据技术有限公司 Data recovery method, system and related device of fault storage equipment
CN111984470A (en) * 2020-08-07 2020-11-24 苏州浪潮智能科技有限公司 Storage cluster system fault recovery automatic detection method and device
CN111984470B (en) * 2020-08-07 2022-12-20 苏州浪潮智能科技有限公司 Storage cluster system fault recovery automatic detection method and device
CN112162883A (en) * 2020-09-27 2021-01-01 北京浪潮数据技术有限公司 Duplicate data recovery method and system, electronic equipment and storage medium
CN112486731A (en) * 2020-11-03 2021-03-12 苏州浪潮智能科技有限公司 Method, device, equipment and product for restoring distribution of placement groups
CN112486731B (en) * 2020-11-03 2023-01-10 苏州浪潮智能科技有限公司 Method, device, equipment and product for restoring distribution of placement groups
CN112711497A (en) * 2021-01-05 2021-04-27 浪潮云信息技术股份公司 Recovery method and system for unfolded faults of container deployment Ceph cluster object
CN113535474A (en) * 2021-06-30 2021-10-22 重庆紫光华山智安科技有限公司 Method, system, medium and terminal for automatically repairing heterogeneous cloud storage cluster fault
CN116302673A (en) * 2023-05-26 2023-06-23 四川省华存智谷科技有限责任公司 Method for improving data recovery rate of Ceph storage system
CN116302673B (en) * 2023-05-26 2023-08-22 四川省华存智谷科技有限责任公司 Method for improving data recovery rate of Ceph storage system
CN118093250A (en) * 2024-04-25 2024-05-28 苏州元脑智能科技有限公司 Fault processing method and device, electronic equipment and storage medium
CN118093250B (en) * 2024-04-25 2024-08-02 苏州元脑智能科技有限公司 Fault processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111309524A (en) Distributed storage system fault recovery method, device, terminal and storage medium
CN112596951B (en) NAS data disaster recovery method, device, equipment and storage medium
CN111124755A (en) Cluster node fault recovery method and device, electronic equipment and storage medium
CN111858468B (en) Method, system, terminal and storage medium for verifying metadata of distributed file system
US8612799B2 (en) Method and apparatus of backing up subversion repository
CN113626256A (en) Virtual machine disk data backup method, device, terminal and storage medium
CN107506295A (en) Method of testing, equipment and the computer-readable recording medium of virtual machine backup
CN103064759B (en) The method of data restore and device
CN108733808B (en) Big data software system switching method, system, terminal equipment and storage medium
JP2778798B2 (en) Queue structure management processing method for control data
CN112540873B (en) Disaster tolerance method and device, electronic equipment and disaster tolerance system
CN113806309A (en) Metadata deleting method, system, terminal and storage medium based on distributed lock
CN114328374A (en) Snapshot method, device, related equipment and database system
CN118377657B (en) Data recovery method and device, storage medium and electronic equipment
CN110572442A (en) Method and system for configuring file path
CN111124787B (en) Method, system and equipment for verifying stability of MCS multi-node concurrent dump
CN113467717B (en) Dual-machine volume mirror image management method, device and equipment and readable storage medium
CN118331516B (en) Data processing method and device
CN112269738B (en) CTF target range debugging method, device, electronic equipment and medium
CN110008114A (en) Configuration information maintaining method, device, equipment and readable storage medium storing program for executing
CN107707402B (en) Management system and management method for service arbitration in distributed system
CN116302696A (en) Archive log generation method of database system, storage medium and computer device
JPH08335206A (en) Automatic transaction restoration system of loosely coupled multicomputer system
CN118377657A (en) Data recovery method and device, storage medium and electronic equipment
CN114880152A (en) Data three-party disaster tolerance abnormity monitoring processing method, system, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200619