CN112068994A - Method, apparatus, device and medium for data persistence during storage cluster runtime - Google Patents
- Publication number
- CN112068994A
- Authority
- CN
- China
- Prior art keywords
- cluster
- event
- commit
- node
- batch processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1471—Saving, restoring, recovering or retrying involving logging of persistent data for recovery
Abstract
The invention provides a method, a device, equipment and a medium for data persistence during the operation of a storage cluster. The method comprises the following steps: the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches; after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data. When the cluster crashes and a node exits the cluster, the exited node, on rejoining the cluster, replays the event batch that was being processed when it exited and, after that processing is finished, exchanges the current event batch processing sequence with the cluster. A device, equipment and a medium for data persistence based on the method are also provided. The invention changes the way configuration data is persisted, from persisting once per event or persisting the cluster state on a timer to persisting per event batch, thereby enhancing the stability of the cluster.
Description
Technical Field
The invention belongs to the technical field of storage systems, and particularly relates to a method, a device, equipment and a medium for data persistence during the operation of a storage cluster.
Background
The cluster system in a storage system is a small distributed cluster of control nodes. As service processes run, configuration data must be kept synchronized among the control nodes at all times by means of events and state machines. As each control node processes events, it needs to persist the updated configuration data into non-volatile storage so that the cluster configuration can be recovered from the hardened data (persistent data) when the cluster crashes.
In the prior art, performing one persistence operation per event reduces the efficiency of cluster operation, because writing to disk is a time-consuming action. The alternative, persisting the cluster state data periodically, makes it difficult to determine from which event the cluster should resume processing when recovering from a crash.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a method, a device, equipment and a medium for persisting data during the operation of a storage cluster, which improve the efficiency of data persistence through event batch processing and add protection to the synchronization of node states within the cluster.
In order to achieve the purpose, the invention adopts the following technical scheme:
A method for storage cluster runtime data persistence, comprising the following steps:
the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches;
after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
Further, the cluster master node itself also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the first event batch processing sequence into the first data.
Further, the cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data.
Further, when the cluster master node sequentially sends commit events to the cluster slave nodes at the preset time, the method further includes: if a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that cluster slave node automatically exits the cluster and rejoins it.
Further, the preset time is the time specified for event batch processing, which is 2 seconds to 3 seconds.
Further, when the cluster crashes and a node exits the cluster, the exited node, on rejoining the cluster, replays the event batch that was being processed when it exited and, after that processing is finished, exchanges the current event batch processing sequence with the cluster.
Further, if the deviation of the current event batch processing sequence is less than or equal to 16 event batches, the cluster initiates synchronization so that the state of the returned node is kept consistent with the cluster state;
if the deviation of the current event batch processing sequence is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
The invention also provides a device for persisting data during the operation of the storage cluster, which comprises:
a sending and generating module, by which the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches;
a persistence execution module, which, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
The invention also provides a processing device for storage cluster runtime data persistence, which comprises:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence of any one of claims 1 to 7 when executing the computer program.
A computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for storage cluster runtime data persistence according to one of claims 1 to 7.
The effects described in this summary are only those of the embodiments, not all the effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
The invention provides a method, a device, equipment and a medium for persisting data during the operation of a storage cluster. The method comprises the following steps: the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches; after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data. The cluster master node also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists it into the first data. The cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data. The invention also provides a device for persisting data during the operation of the storage cluster, a processing device and a computer-readable storage medium. By changing the way configuration data is persisted, from the common approach of persisting once per event or persisting the cluster state on a timer to persisting the cluster state per event batch, the invention allows the cluster state to be synchronized and persisted among nodes in stages, which enhances the stability of the cluster and improves the efficiency of cluster persistence. The granularity of cluster recovery changes from the individual event to the event batch, so that the stage to which a node's state is recovered is more clearly defined. In addition, the invention adds extra protection to the synchronization of node states within the cluster.
Drawings
Fig. 1 is a flowchart of a method for persisting data during runtime of a storage cluster in embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of a device for persisting data during runtime of a storage cluster according to embodiment 2 of the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.
Example 1
The invention provides a method for persisting data during the operation of a storage cluster. Existing storage cluster systems are mostly event-based: an exact state machine driven by events ensures that state stays synchronized among all nodes in the cluster. Each cluster event may, to a greater or lesser extent, change the cluster's critical configuration data, which needs to be persisted to non-volatile storage to guard against an unexpected cluster crash.
Event batch processing (events batch) is the idea of grouping frequent event interactions. When the event interactions reach the time specified for the events batch, a state data persistence action is triggered. After the events in the previous events batch have been processed, the events in the next events batch are not processed immediately. Instead, the boss node (master node) of the cluster sends a commit notification, and the events in the next events batch are processed normally only after every node in the cluster has completed the commit of the events batch. This ensures that the state of the cluster is synchronized in stages. Here, a commit command saves the modifications made by a transaction to the database; it saves all transactions since the last commit or rollback command. A commit event is an event that saves the modifications made by a transaction.
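By way of a non-limiting illustration, the grouping idea described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `EventBatch` and `group_into_batches`, the representation of events as (timestamp, name) pairs, and the use of fixed time windows as batch boundaries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EventBatch:
    """Hypothetical container for the events that fall into one batch window."""
    sequence: int                                  # position of the batch in batch order
    events: List[str] = field(default_factory=list)


def group_into_batches(timed_events: List[Tuple[float, str]],
                       window: float) -> List[EventBatch]:
    """Group (timestamp, event name) pairs into consecutive windows of `window` seconds.

    Assumption: timestamps are non-negative seconds measured from a common start,
    and a commit boundary falls at every multiple of `window`.
    """
    batches: List[EventBatch] = []
    for timestamp, name in sorted(timed_events):
        index = int(timestamp // window)           # window this event belongs to
        while len(batches) <= index:               # create empty batches for quiet windows
            batches.append(EventBatch(sequence=len(batches)))
        batches[index].events.append(name)
    return batches


if __name__ == "__main__":
    demo = group_into_batches([(0.5, "a"), (2.1, "c"), (1.9, "b"), (4.5, "d")], 2.0)
    for batch in demo:
        print(batch.sequence, batch.events)
```

With a 2-second window, events arriving within the same window share one batch, so a single persistence action can cover all of them instead of one disk write per event.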
Fig. 1 shows a flowchart of a method for persisting data during runtime of a storage cluster in embodiment 1 of the present invention.
In step S101, the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time, and generates event batches.
In step S102, after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
The cluster master node also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the first event batch processing sequence into the first data.
The cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data. The purpose is to keep a record of event processing over a certain period of time, so that events can be replayed when the cluster crashes unexpectedly.
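Steps S101 and S102 can be illustrated with a short Python sketch. It is offered only as an assumption-laden example: the class `NodeLog`, its method names, the JSON record layout, and the in-memory list standing in for non-volatile storage are hypothetical and are not taken from the invention.

```python
import json
from typing import List, Optional


class NodeLog:
    """Hypothetical per-node log that persists one record per event batch."""

    def __init__(self) -> None:
        self.pending: List[str] = []          # events seen since the previous commit
        self.persisted: List[str] = []        # stand-in for non-volatile storage
        self.sequence = 0                     # event batch sequence number
        self.last_commit: Optional[int] = None

    def on_event(self, name: str) -> None:
        """Buffer a business event until the next commit event arrives."""
        self.pending.append(name)

    def on_commit(self, commit_id: int) -> None:
        """Handle a commit event sent by the master node.

        The first commit only opens the window. Every later commit closes the
        window: the buffered events and the closing commit itself are written
        as one record, so the record can be replayed after a crash.
        """
        if self.last_commit is not None:
            record = {"sequence": self.sequence,
                      "events": self.pending,
                      "commit": commit_id}
            self.persisted.append(json.dumps(record))
            self.pending = []
            self.sequence += 1
        self.last_commit = commit_id


if __name__ == "__main__":
    log = NodeLog()
    log.on_commit(1)
    log.on_event("create_volume")
    log.on_event("map_host")
    log.on_commit(2)
    print(log.persisted)
```

In this sketch one write covers every event between two commits, which is the efficiency argument made above; a real node would write the record to disk rather than to a Python list.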
In the invention, the time specified for an events batch is known cluster-wide. If a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that slave node automatically exits the cluster and rejoins it. The preset time in the invention is the time specified for event batch processing, generally 2 seconds to 3 seconds; the protection scope of the invention is not limited to the time disclosed in this embodiment.
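The timeout rule above can be sketched as a small watchdog. This is a hedged illustration only: the class name `CommitWatchdog`, its methods, and the choice to pass the current time in explicitly (rather than reading a clock) are assumptions made so the example is deterministic.

```python
class CommitWatchdog:
    """Hypothetical helper a slave node could use to detect a missed commit event."""

    def __init__(self, preset_seconds: float, now: float) -> None:
        self.preset_seconds = preset_seconds   # e.g. 2 to 3 seconds in the embodiment
        self.last_commit_at = now              # time the last commit event was received

    def on_commit(self, now: float) -> None:
        """Record that a commit event from the master node has just arrived."""
        self.last_commit_at = now

    def should_leave(self, now: float) -> bool:
        """Return True when no commit event arrived within the preset time.

        In that case the node would exit the cluster and then rejoin it.
        """
        return now - self.last_commit_at > self.preset_seconds


if __name__ == "__main__":
    watchdog = CommitWatchdog(preset_seconds=3.0, now=0.0)
    print(watchdog.should_leave(now=2.5))
    print(watchdog.should_leave(now=3.5))
```

Passing `now` as a parameter is a design choice for testability; a deployed node would more likely read a monotonic clock.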
While the events of the cluster's current events batch are being processed, new cluster events accumulate but are not processed immediately, because the cluster processes events sequentially in a single thread. After all nodes in the cluster (including the boss node) have received and processed the commit event, the business events in the events batch are formally processed. After all events of the batch have been processed, the commit event inserted by the boss node is processed; the action corresponding to the commit event is to persist the final state of the cluster into hardened data (persistent data).
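The ordering described above (accumulate new events, process the current batch sequentially, then persist the final state on commit) can be sketched as follows. The sketch assumes, purely for illustration, that the cluster state is a key-value dictionary and that each event sets one key; `BatchProcessor`, `submit` and `run_batch` are hypothetical names.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional, Tuple


class BatchProcessor:
    """Hypothetical single-threaded processor for one node's event batches."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}                     # in-memory cluster state
        self.incoming: Deque[Tuple[str, Any]] = deque()     # events waiting for the next batch
        self.hardened: Optional[Dict[str, Any]] = None      # stand-in for persistent data

    def submit(self, key: str, value: Any) -> None:
        """New events only accumulate; they are not applied immediately."""
        self.incoming.append((key, value))

    def run_batch(self) -> int:
        """Process the accumulated events in order, then handle the commit event."""
        batch = list(self.incoming)        # freeze the current batch
        self.incoming.clear()
        for key, value in batch:           # sequential, single-threaded processing
            self.state[key] = value
        self.hardened = dict(self.state)   # commit: persist the final state once
        return len(batch)


if __name__ == "__main__":
    processor = BatchProcessor()
    processor.submit("volume_count", 1)
    processor.submit("volume_count", 2)
    print(processor.run_batch(), processor.hardened)
```

Only the final state of the batch is persisted, so two updates to the same key within one batch cost a single persistence action.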
Events batch proves its value in the scenario of an unexpected cluster crash. When a node exits unexpectedly and later rejoins the cluster, the node replays the events batch that was being processed when it exited and, after that processing is completed, exchanges the current events batch sequence with the cluster.
If the deviation of the current events batch sequence is less than or equal to 16 event batches, the cluster initiates synchronization so that the state of the returned node is kept consistent with the cluster state; if the deviation of the current events batch sequence is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
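The recovery decision above reduces to a threshold comparison, sketched here in Python. The function name `recovery_action`, the string return values, and the interpretation of "deviation" as the absolute difference between the returned node's batch sequence number and the cluster's are assumptions for illustration; only the threshold of 16 comes from the description.

```python
MAX_BATCH_DEVIATION = 16   # threshold stated in the description


def recovery_action(node_sequence: int, cluster_sequence: int) -> str:
    """Choose how a returned node catches up with the cluster.

    Assumption: deviation is the absolute difference between the two
    event batch sequence numbers.
    """
    deviation = abs(cluster_sequence - node_sequence)
    if deviation <= MAX_BATCH_DEVIATION:
        return "synchronize"   # cluster initiates synchronization with the node
    return "replicate"         # cluster state of the other nodes is replicated


if __name__ == "__main__":
    print(recovery_action(100, 116))
    print(recovery_action(100, 117))
```

A small deviation can be closed by synchronization, while a large one is handled by replicating the state held by the other nodes.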
Example 2
Based on the method for persisting data during the storage cluster runtime provided by the present invention, a device for persisting data during the storage cluster runtime is also provided. Fig. 2 shows a schematic diagram of the device for persisting data during the storage cluster runtime.
The sending and generating module is used by the cluster master node to sequentially send commit events to the cluster slave nodes at a preset time and to generate event batches;
the persistence execution module, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
The processing device for storage cluster runtime data persistence comprises:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the method for storage cluster runtime data persistence.
By changing the way configuration data is persisted, from persisting once per event or persisting the cluster state on a timer to persisting per events batch, the invention allows the cluster state to be synchronized and persisted among nodes in stages. This enhances the stability of the cluster and improves the efficiency of cluster persistence. The granularity of cluster recovery changes from the individual event to the events batch, so that the stage to which a node's state is recovered is clearer.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the present invention is not limited thereto. Various modifications and alterations will occur to those skilled in the art based on the foregoing description, and it is neither necessary nor possible to enumerate all embodiments here. On the basis of the technical solution of the invention, any modifications or changes that a person skilled in the art can make without creative effort remain within the protection scope of the invention.
Claims (10)
1. A method for storage cluster runtime data persistence, comprising the steps of:
the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches;
after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
2. The method for data persistence during runtime of a storage cluster as claimed in claim 1, wherein the cluster master node further generates a first event batch processing sequence from the events between the first commit event and the second commit event according to the commit events sent by the cluster master node, and persists the first event batch processing sequence into the first data.
3. The method for storage cluster runtime data persistence as claimed in claim 1 or 2, wherein the cluster slave node or the cluster master node further includes the second commit event in the first event batch processing sequence and persists it into the first data.
4. The method for data persistence during runtime of a storage cluster according to claim 1, wherein, when the cluster master node sequentially sends commit events to the cluster slave nodes at the preset time, the method further comprises: if a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that cluster slave node automatically exits the cluster and rejoins the cluster.
5. The storage cluster runtime data persistence method of claim 1, wherein the preset time is the time specified for event batch processing; the time specified for event batch processing is 2 seconds to 3 seconds.
6. The method of claim 1, wherein, when the cluster crashes and a node exits the cluster, the exited node, on rejoining the cluster, replays the event batch that was being processed when it exited, and exchanges the current event batch processing sequence with the cluster after that processing is complete.
7. The method for data persistence during runtime of a storage cluster according to claim 6, wherein if the deviation of the current event batch processing sequence is less than or equal to 16 event batches, the cluster initiates synchronization to keep the returned node state consistent with the cluster state;
if the deviation of the current event batch processing sequence is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
8. An apparatus for storage cluster runtime data persistence, comprising:
a sending and generating module, by which the cluster master node sequentially sends commit events to the cluster slave nodes at a preset time and generates event batches;
a persistence execution module, which, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
9. A processing device for storage cluster runtime data persistence, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for storage cluster runtime data persistence according to one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010846365.1A CN112068994A (en) | 2020-08-21 | 2020-08-21 | Method, apparatus, device and medium for data persistence during storage cluster runtime |
PCT/CN2021/096686 WO2022037173A1 (en) | 2020-08-21 | 2021-05-28 | Method and apparatus for data persistence in storage cluster runtime, and device and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010846365.1A CN112068994A (en) | 2020-08-21 | 2020-08-21 | Method, apparatus, device and medium for data persistence during storage cluster runtime |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112068994A (en) | 2020-12-11 |
Family
ID=73662555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010846365.1A Withdrawn CN112068994A (en) | 2020-08-21 | 2020-08-21 | Method, apparatus, device and medium for data persistence during storage cluster runtime |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112068994A (en) |
WO (1) | WO2022037173A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022037173A1 (en) * | 2020-08-21 | 2022-02-24 | 苏州浪潮智能科技有限公司 | Method and apparatus for data persistence in storage cluster runtime, and device and medium |
CN114969072A (en) * | 2022-06-06 | 2022-08-30 | 北京友友天宇系统技术有限公司 | Data transmission method, device and equipment based on state machine and data persistence |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8521854B2 (en) * | 2010-08-06 | 2013-08-27 | International Business Machines Corporation | Minimising network resource overhead consumption by reports from one or more agents distributed in an electronic data network of nodes |
CN103412768A (en) * | 2013-07-19 | 2013-11-27 | 蓝盾信息安全技术股份有限公司 | Zookeeper cluster automatic-deployment method based on script program |
CN106528574A (en) * | 2015-09-14 | 2017-03-22 | 阿里巴巴集团控股有限公司 | Data synchronization method and device |
CN110209726B (en) * | 2018-02-12 | 2023-10-20 | 金篆信科有限责任公司 | Distributed database cluster system, data synchronization method and storage medium |
CN111400268B (en) * | 2020-03-13 | 2022-06-17 | 清华大学 | Log management method of distributed persistent memory transaction system |
CN112068994A (en) * | 2020-08-21 | 2020-12-11 | 苏州浪潮智能科技有限公司 | Method, apparatus, device and medium for data persistence during storage cluster runtime |
Also Published As
Publication number | Publication date |
---|---|
WO2022037173A1 (en) | 2022-02-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20090063807A1 (en) | Data redistribution in shared nothing architecture | |
CN112068994A (en) | Method, apparatus, device and medium for data persistence during storage cluster runtime | |
CN110941502B (en) | Message processing method, device, storage medium and equipment | |
CN110413687B (en) | Distributed transaction fault processing method and related equipment based on node interaction verification | |
WO2019020081A1 (en) | Distributed system and fault recovery method and apparatus thereof, product, and storage medium | |
CN105915391B (en) | The distributed key assignments storage method of self-recovering function is submitted and had based on single phase | |
US11748215B2 (en) | Log management method, server, and database system | |
WO2022134876A1 (en) | Data synchronization method and apparatus, and electronic device and storage medium | |
CN106815094B (en) | Method and equipment for realizing transaction submission in master-slave synchronization mode | |
CN113051110A (en) | Cluster switching method, device and equipment | |
EP4275129A1 (en) | Distributed database remote backup | |
CN115617908A (en) | MySQL data synchronization method, device, database terminal, medium and system | |
CN111400086B (en) | Method and system for realizing fault tolerance of virtual machine | |
CN115994053A (en) | Parallel playback method and device of database backup machine, electronic equipment and medium | |
CN115617571A (en) | Data backup method, device, system, equipment and storage medium | |
CN110417882B (en) | Method and device for determining main node and storage medium | |
US20100274758A1 (en) | Data processing method, computer, and data processing program | |
CN108108119B (en) | Configuration method and device for extensible storage cluster things | |
CN111352704A (en) | Distributed global transaction processing system and method based on policy management | |
CN113900788A (en) | Distributed work scheduling method and distributed workflow engine system | |
CN110532069B (en) | Distributed transaction submitting method and device | |
CN115658245A (en) | Transaction submitting system, method and device based on distributed database system | |
CN113312211B (en) | Method for ensuring high availability of distributed learning system | |
CN115328931A (en) | Database cluster data verification method and device, storage medium and electronic equipment | |
CN115202925A (en) | Common identification method and system supporting fine-grained fault tolerance based on RDMA |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20201211 |