WO2019020081A1 - Distributed system, fault recovery method and associated apparatus, product, and storage medium

Info

Publication number
WO2019020081A1
Authority
WO
WIPO (PCT)
Prior art keywords
master node
metadata
redo log
node
distributed system
Prior art date
Application number
PCT/CN2018/097262
Other languages
English (en)
Chinese (zh)
Inventor
褚建辉
卢申朋
刘东辉
王新栋
Original Assignee
广东神马搜索科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东神马搜索科技有限公司
Publication of WO2019020081A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Definitions

  • The present invention relates to the field of distributed technologies, and in particular to a distributed system and to a fault recovery method, apparatus, product, and storage medium for the same.
  • FIG. 1 is a schematic diagram showing the structure of a distributed system employing a master-slave architecture.
  • the distributed system of the master-slave architecture is mostly composed of a master node and a plurality of slave nodes.
  • the master node usually has functions such as metadata storage and query, cluster node state management, decision making and task delivery.
  • The metadata managed by the master node is among the most important data in the system, so the loss of data on that node has a considerable impact on the system.
  • The invention provides a distributed system and a fault recovery method, device, product, and storage medium thereof, which acquire a metadata image of the master node at one or more moments and record the operations of the master node in a redo log, so that when the master node fails, it can be quickly restored to its pre-failure state based on the previously recorded metadata image and redo log.
  • A distributed system comprising a master node for scheduling tasks and managing system states and a plurality of slave nodes for running scheduled tasks, wherein one or more slave nodes and/or the master node acquire and save a metadata image recording the scheduling information and system status on the master node at a certain moment; the master node acquires and saves a redo log recording all operations of the master node after that moment; and the master node calls the metadata image and its corresponding redo log to perform fault recovery when recovering from a fault.
  • In this way, the master node can be quickly restored to its state before the failure, and recovery is more efficient than with an approach that records only log files.
  • one or more slave nodes and/or master nodes perform metadata mirroring acquisition and save operations triggered by the master node and/or external commands. Therefore, different trigger modes can be set according to the characteristics of the distributed system to trigger the acquisition and save operation of the metadata mirror.
  • the master node responds to the slave node's request after each operation is recorded in the redo log and stored. This ensures that the redo log can fully record every operation of the primary node.
  • the one or more slave nodes and/or the master node continuously acquire and save the metadata mirror of the master node at a plurality of different moments, and the master node continuously acquires and saves the redo logs respectively corresponding to the plurality of different moments.
  • During fault recovery, the master node can call the latest metadata image and its corresponding redo log, and, when the latest metadata image and/or its corresponding redo log is unavailable, call the most recent metadata image and corresponding redo log that are available.
  • the fault tolerance rate at the time of failure recovery can be improved.
  • one or more slave nodes and/or master nodes directly acquire and save the memory state of the master node at a certain moment as a metadata mirror.
  • Metadata mirroring can be stored in groups of tasks. Thereby, the corresponding metadata mirror can be efficiently organized according to the grouping at the subsequent recovery.
  • A fault recovery apparatus for a distributed system, the distributed system including a master node for scheduling tasks and managing system states and a plurality of slave nodes for running tasks, the apparatus being used for fault recovery when the master node fails and including: an image acquisition unit, configured to acquire and save a metadata image recording the scheduling information and system status on the master node at a certain moment; a redo log acquisition unit, configured to acquire and save a redo log recording all operations of the master node after that moment; and a fault recovery unit, configured to call the metadata image and its corresponding redo log to perform fault recovery when recovering from a fault.
  • the image acquisition unit performs the acquisition and save operation of the metadata mirroring under the trigger of the master node, the device, and/or the external command.
  • the master node responds to the request of the slave node after each operation thereof is recorded in the redo log by the redo log obtaining unit and stored.
  • The image acquisition unit continuously acquires and saves metadata images of the master node at a plurality of different moments, and the redo log acquisition unit continuously acquires and saves the redo logs respectively corresponding to those moments.
  • the fault recovery unit calls the latest metadata mirror and its corresponding redo log for fault recovery when the fault is recovered.
  • When the latest metadata image and/or its corresponding redo log is unavailable, the fault recovery unit calls the most recent metadata image and corresponding redo log that are available to perform fault recovery.
  • the image acquisition unit directly acquires and saves the memory state of the master node at a certain moment as a metadata image.
  • the image acquisition unit stores the metadata image according to the task group.
  • A fault recovery method for a distributed system, the distributed system comprising a master node for scheduling tasks and managing system states and a plurality of slave nodes for running tasks, the method being used for fault recovery when the master node fails and including: acquiring and saving a metadata image recording the scheduling information and system state at a certain moment; acquiring and saving a redo log recording all scheduling operations after that moment; and calling the metadata image and its corresponding redo log to perform fault recovery when recovering from a fault.
  • the metadata mirroring of the master node at a plurality of different moments is continuously acquired and saved, and the redo logs respectively corresponding to the plurality of different moments are continuously acquired and saved.
  • Calling the metadata image and its corresponding redo log during fault recovery may include: calling the latest metadata image and its corresponding redo log; and, when the latest metadata image and/or its corresponding redo log is unavailable, calling the most recent metadata image and corresponding redo log that are available.
  • the memory state of the master node at a certain moment can be directly obtained and saved as a metadata mirror.
  • the obtaining and saving operation of the metadata mirroring is performed under the trigger of the master node and/or an external command.
  • the master node responds to the request of the slave node after each operation thereof is recorded in the redo log and stored.
  • the metadata image is stored in accordance with a task grouping.
  • A computer program product comprising: a memory; a processor; and a computer program, wherein the computer program is stored in the memory and configured to be executed by the processor to perform the method of the third aspect of the invention and any of its preferred aspects.
  • a fifth aspect of the invention provides a computer readable storage medium comprising: a program, when executed on a computer, causing a computer to perform the method of the third aspect of the invention and any of the preferred aspects thereof.
  • The distributed system, fault recovery method, device, product, and storage medium of the present invention acquire a metadata image of the master node at one or more moments and record the subsequent operations of the master node in the redo log, so that when the master node fails, it can be quickly restored to its pre-fault state based on the previously recorded metadata image and redo log.
  • FIG. 1 is a schematic diagram showing the architecture of a distributed system of a master-slave architecture.
  • FIG. 2 is a schematic flow chart showing a fault recovery method according to an embodiment of the present invention.
  • FIG. 3 is a diagram showing the continuous storage of a plurality of metadata mirrors and redo logs.
  • FIG. 4 is a schematic block diagram showing the structure of a failure recovery device according to an embodiment of the present invention.
  • FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
  • In a scheme that records only log files, the master node operates as follows: before performing an operation, the master node records it in the log file, and it performs the operation (that is, updates the data in memory based on the operation) only after the record succeeds.
  • The recovery process is as follows: the log file is read, and the data in memory is modified in sequence according to the master-node operations recorded in the log file. This approach, which recovers solely from a log of write operations, is simple, but the recovery process takes a very long time.
  • The inventors found that, while the operations of the master node are being recorded in the log file, image files of the master node's in-memory data at certain moments can be interspersed. An image file represents the state data of the master node at the corresponding moment. When the master node fails, the latest image file and the operations recorded in the log file after the moment corresponding to that image file can be called, and the master node can be recovered from the called data.
  • Compared with recording only log files, this can significantly reduce the time required for recovery.
  • the present invention proposes a failure recovery scheme for a primary node in a distributed system, and the failure recovery scheme of the present invention can be implemented by the distributed system shown in FIG. 1.
  • the distributed system of the present invention may include a master node for scheduling tasks and managing system states and a plurality of slave nodes for running scheduled tasks. Both the master node and the slave node can be deployed in the server, and the master node can be deployed in a separate server different from the slave node, or can be deployed in the same server as one of the slave nodes. As a preferred embodiment, different nodes can be deployed in different servers.
  • The distributed system shown in FIG. 1 is composed of a master node and a plurality of slave nodes. It should be understood that the distributed system of the present invention may further include a plurality of master nodes, and may also include nodes or devices other than the master node and the slave nodes, such as backup master nodes and failover databases.
  • FIG. 2 is a schematic flow chart showing a fault recovery method according to an embodiment of the present invention.
  • the method shown in FIG. 2 can be implemented by the distributed system shown in FIG. 1, and in particular, can be implemented by a master node in a distributed system.
  • In step S210, a metadata image recording the scheduling information and system state on the master node at a certain moment is acquired and saved.
  • For a distributed system with a master-slave architecture, the entire distributed system becomes unavailable after the master node crashes. Given the importance of the master node, it usually does not run specific tasks directly; it is only responsible for maintaining the operation of the distributed system and for scheduling and assigning tasks, while specific tasks are performed by the slave nodes. That is, the master node is mainly responsible for parsing task requests, allocating resources, and locating the target data or nodes according to the metadata, and the specific task is performed by the slave node that the master node specifies.
  • Metadata is data that describes data. In the present invention, metadata refers specifically to the data that the master node is responsible for saving and managing, that is, data recording the scheduling information and system status on the master node at a certain moment. For example, the metadata may be system-related description data, system state data, or current task scheduling and status data, and it may also be data describing information about user data, such as its state and storage location.
  • the obtained metadata mirror of the master node at a certain time may be a mapping of the memory state of the master node at that moment, so that the memory state of the master node at a certain moment can be directly obtained and saved as a metadata mirror.
  • The metadata image of the master node at a certain moment can be obtained by means of a snapshot or a dump (a backup of the file system).
  • the operation of obtaining the metadata image may be performed by the master node, by one or more slave nodes, or by a backup master node in the distributed system.
  • the obtained metadata image can be stored persistently on a local disk or a distributed file system, for example, can be stored persistently in the failover database.
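As an illustration only, the following Python sketch shows one way the in-memory state of a master node could be copied at a given moment and persisted to a local disk. The function names `save_metadata_image` and `load_metadata_image`, the JSON file format, the `image-<seq>.json` naming, and the write-then-rename step are all assumptions made for this example; the source does not prescribe a storage format.

```python
import copy
import json
import os
import tempfile


def save_metadata_image(state: dict, directory: str, seq: int) -> str:
    """Persist a point-in-time copy of the master's in-memory state.

    `seq` is a hypothetical identifier for the moment of the image.
    """
    snapshot = copy.deepcopy(state)  # freeze the state as of this moment
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, f"image-{seq:012d}.json")
    # Write to a temporary file first, then rename, so that a crash
    # during the write does not leave a half-written image in place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(snapshot, f)
        f.flush()
        os.fsync(f.fileno())  # force the bytes to stable storage
    os.replace(tmp_path, final_path)
    return final_path


def load_metadata_image(path: str) -> dict:
    """Read a previously saved metadata image back into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Because the image is a deep copy, later changes to the live state do not alter an image that has already been saved.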
  • When scheduling tasks, the master node may schedule them concurrently by group, so the acquired metadata image may comprise metadata images for multiple groups. The acquired metadata images can therefore be stored by task group, with the metadata images belonging to the same task group stored in the same directory, so that the corresponding metadata image can be located efficiently by group during subsequent recovery.
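The per-group storage layout described above might be sketched as follows. This is a minimal example under stated assumptions: the directory layout `root/<group>/image-<timestamp>.json`, the JSON format, and the helper names `save_group_images` and `latest_image_per_group` are hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path


def save_group_images(root, timestamp, images_by_group):
    """Write one image file per task group under root/<group>/."""
    written = []
    for group, image in images_by_group.items():
        group_dir = Path(root) / group
        group_dir.mkdir(parents=True, exist_ok=True)
        # Zero-padded timestamps make lexicographic order match time order.
        path = group_dir / f"image-{timestamp:012d}.json"
        path.write_text(json.dumps(image), encoding="utf-8")
        written.append(path)
    return written


def latest_image_per_group(root):
    """Return {group name: path of its newest image} by scanning each directory."""
    latest = {}
    for group_dir in sorted(Path(root).iterdir()):
        if group_dir.is_dir():
            images = sorted(group_dir.glob("image-*.json"))
            if images:
                latest[group_dir.name] = images[-1]
    return latest
```

With this layout, recovery code only needs to list one directory per group to find the image it should load for that group.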
  • In step S220, the master node acquires and saves a redo log recording all operations of the master node after that moment.
  • the operations described herein may refer to operations performed by the primary node on metadata or operations performed by the primary node on its in-memory data.
  • Each operation performed by the master node can be recorded in the redo log.
  • the operation information of the master node can be sequentially recorded in the redo log.
  • For each operation that the master node is about to perform, the master node performs the operation only after it has been recorded in the redo log and persisted. In this way, if the master node fails while executing the operation, the operation can be recovered from the data recorded in the redo log. Conversely, if the operation were executed first and recorded afterwards, a failure during execution, or before the operation had been recorded and saved, would leave the operation unrecoverable, and it could only be performed again.
  • For example, when a slave node requests target data, the master node may first record the operation of delivering the target data to the slave node in the redo log and persist it; only after the record has been successfully persisted does the master node send the target data to the slave node in response to the request.
  • That is, for a request from a slave node, the master node responds to the request only after its operation for that request has been recorded in the redo log and persistently stored.
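The record-then-respond ordering can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: the classes `RedoLog` and `Master`, the one-JSON-object-per-line log format, and the `set` operation are assumptions introduced for the example.

```python
import json
import os


class RedoLog:
    """Append-only log; each entry is one JSON object per line (assumed format)."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def append(self, operation: dict) -> None:
        self._file.write(json.dumps(operation) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())  # persist before the caller proceeds

    def close(self) -> None:
        self._file.close()


class Master:
    """Hypothetical master node holding metadata in a plain dict."""

    def __init__(self, log: RedoLog):
        self.log = log
        self.metadata = {}

    def handle_request(self, slave_id, key, value):
        op = {"op": "set", "key": key, "value": value, "from": slave_id}
        self.log.append(op)          # 1. record and persist the operation
        self.metadata[key] = value   # 2. apply it to the in-memory state
        return {"status": "ok", "key": key}  # 3. only now answer the slave
```

If `append` raises, the in-memory state is left untouched and no response is sent, so the log never lags behind what a slave node has been told.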
  • In step S230, the metadata image and its corresponding redo log are called to perform fault recovery when the master node recovers from a fault.
  • A metadata image can be regarded as a mapping of the memory state of the master node at a certain moment, while the redo log records all operations of the master node. Therefore, when the master node fails, fault recovery can be performed based on the metadata image acquired before the failure and on the operations, recorded in the redo log, that the master node performed between the moment corresponding to that metadata image and the failure, thereby restoring the master node to its state before the failure.
  • Taking a metadata image and redo log stored in a file system as an example, recovery can proceed as follows: after the master node restarts, it first traverses the metadata image directory in the file system, finds the most recent metadata image, and loads it into memory; it then loads the redo log recorded after that metadata image and replays it. Once loading and replay are complete, the entire recovery process is complete.
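The restart procedure described above (find the newest image, load it, then replay the redo log recorded after it) could look roughly like the following Python sketch. The file naming convention (`image-<n>.json` paired with `redo-<n>.log`), the JSON encoding, and the `set`/`delete` operation types are assumptions made for the example.

```python
import json
from pathlib import Path


def apply_operation(state: dict, op: dict) -> None:
    """Apply one logged operation to the in-memory state (assumed op types)."""
    if op["op"] == "set":
        state[op["key"]] = op["value"]
    elif op["op"] == "delete":
        state.pop(op["key"], None)


def recover(image_dir, log_dir) -> dict:
    """Load the most recent metadata image, then replay its redo log."""
    images = sorted(Path(image_dir).glob("image-*.json"))
    if not images:
        raise FileNotFoundError("no metadata image found")
    latest = images[-1]
    state = json.loads(latest.read_text(encoding="utf-8"))
    # Assumed convention: redo-<n>.log holds the operations after image-<n>.json.
    suffix = latest.stem.split("-", 1)[1]
    log_path = Path(log_dir) / f"redo-{suffix}.log"
    if log_path.exists():
        for line in log_path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                apply_operation(state, json.loads(line))
    return state
```

Only the operations logged after the newest image are replayed, which is why this is faster than replaying a full history of operations from the beginning.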
  • a plurality of metadata mirrors corresponding to different time instants may be saved.
  • the acquisition operation of the metadata mirror may be performed periodically or in response to satisfying the predetermined trigger condition.
  • the above trigger condition may be, for example, a certain parameter satisfies a predetermined value, reaches a predetermined interval, or directly responds to an external trigger command.
  • the acquisition operation of the metadata mirror may be performed once every predetermined number of operations are recorded in the redo log, or the acquisition operation of the metadata mirror may be performed once every predetermined time.
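A trigger policy combining the conditions mentioned above (a count of logged operations, an elapsed interval, or an external command) might be sketched as follows. The class name `SnapshotTrigger`, its parameters, and the default thresholds are hypothetical values chosen for illustration.

```python
import time


class SnapshotTrigger:
    """Decides when a new metadata image should be taken."""

    def __init__(self, max_ops=1000, max_seconds=60.0, clock=time.monotonic):
        self.max_ops = max_ops            # image after this many logged operations
        self.max_seconds = max_seconds    # or after this much elapsed time
        self._clock = clock               # injectable so the policy can be tested
        self._ops_since_image = 0
        self._last_image_at = clock()

    def record_operation(self) -> None:
        """Call once for every operation appended to the redo log."""
        self._ops_since_image += 1

    def should_snapshot(self, external_command: bool = False) -> bool:
        if external_command:
            return True
        if self._ops_since_image >= self.max_ops:
            return True
        return self._clock() - self._last_image_at >= self.max_seconds

    def mark_snapshot_taken(self) -> None:
        """Reset both counters after an image has been saved."""
        self._ops_since_image = 0
        self._last_image_at = self._clock()
```

A master node using this sketch would call `record_operation()` after each redo-log append and check `should_snapshot()` before deciding whether to acquire a new image.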
  • FIG. 3 is a schematic diagram showing the principle of continuously saving a plurality of metadata mirror files and their corresponding redo logs.
  • The metadata image 1 of the master node at time t1 is acquired first, and the operations of the master node between t1 and t2 are recorded and stored in redo log 1. The metadata image 2 of the master node is acquired again at time t2, and the operations of the master node between t2 and t3 are recorded and stored in redo log 2, and so on. In this way, metadata images corresponding to times t1, t2, and t3 are obtained, together with the redo logs corresponding to each of those moments.
  • During fault recovery, the master node can first call the latest metadata image (that is, the metadata image at time t3) and its corresponding redo log (the redo log for the t3-t4 segment). If the latest metadata image or redo log is unavailable, it can fall back to the next most recent metadata image (that is, the metadata image at time t2) and its redo log (that is, the redo log for the t2-t3 segment), and so on, stepping back until an available pair of data files is found.
  • the fault tolerance rate at the time of failure recovery can be improved.
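The fallback behaviour can be illustrated with a short Python sketch that tries image and redo-log pairs from newest to oldest and uses the first pair that loads cleanly. The file names, the JSON format, the key/value operation shape, and the choice to treat a missing or unparsable file as "unavailable" are assumptions for this example.

```python
import json
from pathlib import Path


def load_pair(image_path, log_path):
    """Load an image and its redo log; raise if either is missing or corrupt."""
    state = json.loads(Path(image_path).read_text(encoding="utf-8"))
    ops = [json.loads(line)
           for line in Path(log_path).read_text(encoding="utf-8").splitlines()
           if line.strip()]
    return state, ops


def recover_with_fallback(directory):
    """Try image/redo-log pairs from newest to oldest; use the first usable pair."""
    images = sorted(Path(directory).glob("image-*.json"), reverse=True)
    for image_path in images:
        suffix = image_path.stem.split("-", 1)[1]
        log_path = Path(directory) / f"redo-{suffix}.log"
        try:
            state, ops = load_pair(image_path, log_path)
        except (OSError, ValueError):  # missing file or invalid JSON
            continue                   # step back to the previous pair
        for op in ops:
            state[op["key"]] = op["value"]
        return image_path.name, state
    raise RuntimeError("no usable metadata image and redo log pair")
```

Note that when an older pair is used, the recovered state corresponds to the end of that pair's log segment rather than to the moment of the failure.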
  • The solution of the present application can trigger the acquisition and storage of the metadata image (for example, saving the state at time t3) under certain conditions or commands, and then start continuous recording of the redo log (that is, recording all operations after t3). After a failure occurs at time t4, the state at time t3 is restored and all operations after t3 are replayed, so that the master node quickly returns to its state at time t4.
  • The metadata image 1 acquired at time t1 may already contain some of the operations recorded in redo log 1 after time t1. In that case, when the master node fails at time t2 and is restored from metadata image 1 and the corresponding redo log 1, the restored state of the master node is likely to be inconsistent with its state before the failure.
  • To avoid this, the time of each operation recorded in the redo log can be recorded in real time, and when the metadata image at a certain moment is acquired, the operations already reflected in that image can be removed from the redo log, so that the metadata image corresponds strictly to the redo log for that moment.
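One way to keep an image and its redo log strictly aligned is sketched below. Note that this sketch substitutes a monotonically increasing sequence number for the recorded operation time mentioned in the text; the field names `seq` and `last_applied_seq`, the increment-style operations, and the helper names are assumptions for illustration.

```python
def take_image(state: dict, last_applied_seq: int) -> dict:
    """Capture the state together with the last operation it already reflects."""
    return {"last_applied_seq": last_applied_seq, "state": dict(state)}


def trim_redo_log(entries: list, image: dict) -> list:
    """Drop log entries whose effect is already contained in the image."""
    cutoff = image["last_applied_seq"]
    return [e for e in entries if e["seq"] > cutoff]


def replay(image: dict, entries: list) -> dict:
    """Rebuild the state from the image plus the not-yet-reflected entries."""
    state = dict(image["state"])
    for e in trim_redo_log(entries, image):
        # Increment operations are not idempotent, so applying an entry
        # that the image already contains would corrupt the result.
        state[e["key"]] = state.get(e["key"], 0) + e["delta"]
    return state
```

For example, if the image was taken after the entry with `seq` 2 and the log holds entries 1 to 3, only entry 3 is replayed; replaying all three would count the first two operations twice.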
  • FIG. 4 is a block diagram showing the structure of a fault recovery apparatus according to an embodiment of the present invention.
  • the functional modules of the fault recovery device 400 may be implemented by hardware, software, or a combination of hardware and software that implements the principles of the present invention.
  • the functional blocks depicted in FIG. 4 can be combined or divided into sub-modules to implement the principles of the above described invention. Accordingly, the description herein may support any possible combination, or division, or further limitation of the functional modules described herein.
  • the fault recovery apparatus 400 shown in FIG. 4 can be used to implement the fault recovery method shown in FIG. 2, and only the functional modules that the fault recovery apparatus 400 can have and the operations that can be performed by the functional modules are briefly described. For details, please refer to the description above in conjunction with FIG. 2, and details are not described herein again. It should be noted that the fault recovery apparatus 400 may be the primary node itself or a backup primary node.
  • the fault recovery apparatus of the present invention may include a mirror acquisition unit 410, a redo log acquisition unit 420, and a failure recovery unit 430.
  • the image obtaining unit 410 can acquire and save the metadata image of the scheduling information and the system state recorded at a certain moment on the master node
  • the redo log obtaining unit 420 can acquire and save the redo log of all the operations of the master node after the recording time.
  • the fault recovery unit 430 can invoke the metadata mirror and its corresponding redo log for fault recovery when the fault is recovered.
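The division of work among the three units can be pictured with the following in-memory Python sketch. The class names mirror the units 410, 420, and 430 described above, but the method names, the integer moments, and the key/value operations are assumptions made purely for illustration.

```python
import copy


class ImageAcquisitionUnit:
    """Sketch of unit 410: keeps point-in-time copies of the master's state."""

    def __init__(self):
        self.images = []  # list of (moment, copied state)

    def acquire(self, moment, master_state):
        self.images.append((moment, copy.deepcopy(master_state)))


class RedoLogAcquisitionUnit:
    """Sketch of unit 420: records every operation with the moment it occurred."""

    def __init__(self):
        self.entries = []  # list of (moment, key, value)

    def record(self, moment, key, value):
        self.entries.append((moment, key, value))


class FaultRecoveryUnit:
    """Sketch of unit 430: rebuilds state from the latest image plus later operations."""

    def __init__(self, image_unit, log_unit):
        self.image_unit = image_unit
        self.log_unit = log_unit

    def recover(self):
        moment, state = self.image_unit.images[-1]
        state = copy.deepcopy(state)
        for op_moment, key, value in self.log_unit.entries:
            if op_moment > moment:  # replay only what the image lacks
                state[key] = value
        return state
```

In this sketch the units share no state with the master node itself, which matches the description that the apparatus may be the master node or a backup master node.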
  • the image acquisition unit 410 can perform the acquisition and save operation of the metadata mirror under the trigger of the master node, the device, and/or the external command.
  • the image obtaining unit 410 can directly acquire and save the memory state of the master node at a certain moment as a metadata mirror. Further, the image obtaining unit 410 may store the metadata image according to the task group.
  • The master node responds to a new request from a slave node only after each of its operations has been recorded in the redo log and stored.
  • the image obtaining unit 410 continuously acquires and saves the metadata mirror of the master node at a plurality of different times
  • the redo log obtaining unit 420 continuously acquires and saves the redo logs respectively corresponding to the plurality of different moments.
  • During fault recovery, the fault recovery unit 430 calls the latest metadata image and its corresponding redo log; when the latest metadata image and/or its corresponding redo log is unavailable, the fault recovery unit 430 can call the most recent metadata image and corresponding redo log that are available.
  • the method according to the invention may also be embodied as a computer program or computer program product comprising computer program code instructions for performing the various steps defined above in the above method of the invention.
  • The invention may be embodied as a computer program product comprising: a memory; a processor; and a computer program, wherein the computer program is stored in the memory and configured to be executed by the processor to perform the above method of the invention.
  • FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
  • the embodiment provides a computer program product, including: at least one processor 51 and a memory 52.
  • One processor 51 is taken as an example in FIG. 5.
  • the processor 51 and the memory 52 are connected by a bus 50.
  • The memory 52 stores instructions executable by the at least one processor 51, the instructions being executed by the at least one processor 51 to cause the at least one processor 51 to perform the above described method of the present invention.
  • The present invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium) having stored thereon executable code (or a computer program, or computer instruction code) which, when executed by a processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above method according to the invention.
  • Each block of the flowchart or block diagram can represent a module, a program segment, or a portion of code that includes one or more executable instructions.
  • the functions noted in the blocks may also occur in a different order than those illustrated in the drawings. For example, two consecutive blocks may be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented in a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The present invention relates to a distributed system, a fault recovery method and associated apparatus, a product, and a storage medium. The method comprises the following steps: a slave node and/or a master node obtains and stores a metadata image recording the scheduling information and system state of the master node at a certain moment; the master node obtains and stores a redo log recording all operations performed by the master node after that moment; and when a fault occurs in the master node, the master node calls the metadata image and its corresponding redo log to perform fault recovery. When a fault occurs, the method enables a master node to return quickly to its pre-fault state by means of a previously recorded metadata image and redo log.
PCT/CN2018/097262 2017-07-28 2018-07-26 Distributed system, fault recovery method and associated apparatus, product, and storage medium WO2019020081A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710630823.6A CN107357688B (zh) 2017-07-28 2017-07-28 Distributed system and fault recovery method and apparatus thereof
CN201710630823.6 2017-07-28

Publications (1)

Publication Number Publication Date
WO2019020081A1 (fr) 2019-01-31

Family

ID=60285161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097262 WO2019020081A1 (fr) 2017-07-28 2018-07-26 Distributed system, fault recovery method and associated apparatus, product, and storage medium

Country Status (2)

Country Link
CN (1) CN107357688B (fr)
WO (1) WO2019020081A1 (fr)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
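CN107357688B (zh) * 2017-07-28 2020-06-12 广东神马搜索科技有限公司 Distributed system and fault recovery method and apparatus thereof
CN108390771B (zh) * 2018-01-25 2021-04-16 中国银联股份有限公司 Network topology reconstruction method and apparatus
CN108427728A (zh) * 2018-02-13 2018-08-21 百度在线网络技术(北京)有限公司 Metadata management method, device, and computer-readable medium
CN109189480B (zh) * 2018-07-02 2021-11-09 新华三技术有限公司成都分公司 File system startup method and apparatus
CN109144792A (zh) * 2018-10-08 2019-01-04 郑州云海信息技术有限公司 Data recovery method, apparatus, system, and computer-readable storage medium
CN109656911B (zh) * 2018-12-11 2023-08-01 江苏瑞中数据股份有限公司 Distributed parallel processing database system and data processing method thereof
CN111104226B (zh) * 2019-12-25 2024-01-26 东北大学 Intelligent management system and method for multi-tenant service resources
CN112379977A (zh) * 2020-07-10 2021-02-19 中国航空工业集团公司西安飞行自动控制研究所 Time-triggered task-level fault handling method
CN111880969B (zh) * 2020-07-30 2024-06-04 上海达梦数据库有限公司 Storage node recovery method, apparatus, device, and storage medium
CN115563028B (zh) * 2022-12-06 2023-03-14 苏州浪潮智能科技有限公司 Data caching method, apparatus, device, and storage medium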

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294701A (zh) * 2012-02-24 2013-09-11 联想(北京)有限公司 Distributed file system and data processing method
CN104216802A (zh) * 2014-09-25 2014-12-17 北京金山安全软件有限公司 In-memory database recovery method and device
US9053123B2 (en) * 2010-09-02 2015-06-09 Microsoft Technology Licensing, Llc Mirroring file data
CN107357688A (zh) * 2017-07-28 2017-11-17 广东神马搜索科技有限公司 Distributed system and fault recovery method and apparatus thereof

Also Published As

Publication number Publication date
CN107357688A (zh) 2017-11-17
CN107357688B (zh) 2020-06-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18837616

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18837616

Country of ref document: EP

Kind code of ref document: A1