CN112068994A - Method, apparatus, device and medium for data persistence during storage cluster runtime - Google Patents


Info

Publication number
CN112068994A
Authority
CN
China
Prior art keywords
cluster
event
commit
node
batch processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010846365.1A
Other languages
Chinese (zh)
Inventor
李国栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010846365.1A
Publication of CN112068994A
Priority to PCT/CN2021/096686 (WO2022037173A1)
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1471: Saving, restoring, recovering or retrying involving logging of persistent data for recovery

Abstract

The invention provides a method, an apparatus, a device and a medium for data persistence during storage cluster runtime. The method comprises the following steps: the cluster master node sequentially sends commit events to the cluster slave nodes according to a preset time and generates event batches; after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event and persists the sequence into the first data. When the cluster crashes and a node exits the cluster, and the exited node later rejoins the cluster, the node replays the event batch it was processing when it exited and, once that processing is finished, exchanges its current event batch processing sequence with the cluster. An apparatus, a device and a medium for data persistence based on the method are also provided. The invention changes the way configuration data is persisted, from persisting once per event or persisting the cluster state at fixed intervals to persisting per event batch, thereby enhancing the stability of the cluster.

Description

Method, apparatus, device and medium for data persistence during storage cluster runtime
Technical Field
The invention belongs to the technical field of storage systems, and particularly relates to a method, an apparatus, a device and a medium for data persistence during storage cluster runtime.
Background
The cluster system in a storage system is a small distributed cluster of control nodes. As service processes run, configuration data must be kept synchronized among the control nodes at all times by means of events and state machines. As each control node processes events, it needs to persist the updated configuration data into non-volatile storage so that the cluster configuration can be recovered from hardendata (persistent data) when the cluster crashes.
In the prior art, performing a persistence operation once per event reduces the efficiency of cluster operation, because writing to disk is a time-consuming action. Alternatively, persisting cluster state data at fixed intervals makes it difficult to determine from which event the cluster should resume processing when recovering from a crash.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a method, an apparatus, a device and a medium for data persistence during storage cluster runtime, which improve the efficiency of data persistence through event batch processing and add protection for the synchronization of node states within the cluster.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method of storage cluster runtime data persistence, comprising the steps of:
the cluster master node sequentially sends commit events to the cluster slave nodes according to preset time and generates event batch processing;
and after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
Further, the cluster master node also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the first event batch processing sequence into the first data.
Further, the cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data.
Further, when the cluster master node sequentially sends commit events to the cluster slave nodes according to the preset time, the method further includes: if a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that slave node automatically exits the cluster and rejoins it.
Further, the preset time is the time specified for an event batch, which is 2 seconds to 3 seconds.
Further, when the cluster crashes and a node exits the cluster, and the exited node later rejoins the cluster, the node replays the event batch it was processing when it exited and, after that processing is finished, exchanges its current event batch processing sequence with the cluster.
Further, if the deviation of the current event batch processing sequence is less than or equal to 16 event batches, the cluster initiates synchronization so that the state of the returned node is brought into line with the cluster state;
if the deviation of the current event batch processing sequence is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
The invention also provides a device for persisting data during the operation of the storage cluster, which comprises the following components:
a sending and generation module, by which the cluster master node sequentially sends commit events to the cluster slave nodes according to a preset time and generates event batches;
and a persistence execution module which, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
A processing device for storage cluster runtime data persistence further comprises:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence of any one of claims 1 to 7 when executing the computer program.
A computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for storage cluster runtime data persistence according to one of claims 1 to 7.
The effects described in this summary are only those of the embodiments, not all the effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
The invention provides a method, an apparatus, a device and a medium for data persistence during storage cluster runtime. The method comprises the following steps: the cluster master node sequentially sends commit events to the cluster slave nodes according to a preset time and generates event batches; after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event and persists the sequence into the first data. The cluster master node likewise generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the sequence into the first data. The cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data. The invention also provides an apparatus, a processing device and a computer-readable storage medium for storage cluster runtime data persistence. By changing how configuration data is persisted, from the common practice of persisting once per event or persisting the cluster state at fixed intervals to persisting the cluster state per event batch, the invention allows the cluster state to be synchronized and persisted among the nodes stage by stage, which enhances the stability of the cluster and improves persistence efficiency. The granularity of cluster recovery changes from individual events to event batches, so the stage to which a node's state is restored is better defined. In addition, the invention adds extra protection to the synchronization of node states within the cluster.
Drawings
Fig. 1 is a flowchart of a method for persisting data during runtime of a storage cluster in embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of an apparatus for persisting data during runtime of a storage cluster according to embodiment 2 of the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and procedures are omitted so as to not unnecessarily limit the invention.
Example 1
The invention provides a method for persisting data during storage cluster runtime. Most existing storage cluster systems are event-based: a precise, event-driven state machine keeps the state of all nodes in the cluster synchronized. Each cluster event may, to a greater or lesser extent, change the cluster's critical configuration data, which must be persisted to non-volatile storage to guard against an unexpected cluster crash.
Event batch processing (events batch) groups frequent event interactions into batches. When event interaction has run for the time specified for an events batch, a state data persistence action is triggered. After the events in the previous events batch have been processed, the events in the next events batch are not processed immediately; instead, the boss node of the cluster sends a commit notification, and the events in the events batch can be processed normally only after every node in the cluster has completed the commit of the events batch. This ensures that the cluster state is synchronized stage by stage. Here, the commit command saves the modifications made by a transaction to the database; it saves all transactions since the last commit or rollback command. A commit event is an event that saves the modifications made by a transaction.
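The commit barrier described above can be pictured with the following minimal Python sketch. It is one possible reading of the paragraph, not the patented implementation: the `CommitBarrier` class, its fields and the node names are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field


@dataclass
class CommitBarrier:
    """Tracks which nodes have completed the commit of one event batch.

    Hypothetical helper: names and structure are illustrative only.
    """
    batch_seq: int
    members: frozenset
    acked: set = field(default_factory=set)

    def ack(self, node_id: str) -> None:
        # Record that a node has completed the commit of this batch.
        if node_id not in self.members:
            raise ValueError(f"unknown node: {node_id}")
        self.acked.add(node_id)

    def released(self) -> bool:
        # Events may be processed only once every member has acked.
        return self.acked == set(self.members)


barrier = CommitBarrier(batch_seq=7, members=frozenset({"boss", "n1", "n2"}))
barrier.ack("boss")
barrier.ack("n1")
blocked = barrier.released()    # n2 has not committed yet
barrier.ack("n2")
open_now = barrier.released()   # every node has committed
```

Keeping the barrier per batch sequence number means a late acknowledgement for an older batch cannot release a newer one, which is one simple way to keep the stages separate.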
Fig. 1 shows a flowchart of a method for persisting data during runtime of a storage cluster in embodiment 1 of the present invention.
In step S101, the cluster master node sequentially sends commit events to the cluster slave nodes according to a preset time, and generates event batches.
In step S102, after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the sequence into the first data.
The cluster master node also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the first event batch processing sequence into the first data.
The cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data. The purpose is to keep a record of the events processed within a certain period, so that the events can be replayed if the cluster crashes unexpectedly.
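As a rough illustration of steps S101 and S102, the sketch below buffers events until a commit event arrives, then writes the whole batch (including the closing commit event) to disk in a single operation. The `BatchRecorder` class, the `COMMIT` marker, the JSON file layout and the event names are all assumptions made for this example; the patent does not specify a storage format.

```python
import json
import os
import tempfile

COMMIT = "commit"  # hypothetical marker for a commit event


class BatchRecorder:
    def __init__(self, directory):
        self.directory = directory
        self.pending = []      # events seen since the previous commit event
        self.next_seq = 1      # sequence number for the next batch

    def on_event(self, event):
        """Buffer an event; on a commit event, persist the closed batch."""
        self.pending.append(event)
        if event != COMMIT:
            return None
        record = {"seq": self.next_seq, "events": self.pending}
        path = os.path.join(self.directory, f"batch-{self.next_seq:08d}.json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(record, fh)
            fh.flush()
            os.fsync(fh.fileno())  # force the record onto stable storage
        self.pending = []
        self.next_seq += 1
        return path


workdir = tempfile.mkdtemp()
recorder = BatchRecorder(workdir)
paths = [recorder.on_event(e) for e in ["add_volume", "map_host", COMMIT]]
with open(paths[-1], encoding="utf-8") as fh:
    saved = json.load(fh)
```

The point of the design is that the disk is touched once per batch rather than once per event, while the stored sequence number still identifies exactly which batch a recovering node last completed.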
In the invention, the time specified for an events batch is known throughout the cluster. If a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that slave node automatically exits the cluster and rejoins it. The preset time in the invention is the time specified for event batch processing, generally 2 seconds to 3 seconds; the protection scope of the invention is not limited to the time disclosed in this embodiment.
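The timeout rule can be sketched as a small slave-side watchdog. This is only an illustration under stated assumptions: the `CommitWatchdog` class is hypothetical, timestamps are passed in explicitly to keep the example deterministic (a real node would more likely read a monotonic clock such as `time.monotonic()`), and the 3-second value is simply the upper end of the range given above.

```python
class CommitWatchdog:
    """Slave-side check for an overdue commit event (illustrative only)."""

    def __init__(self, batch_time_s: float, now: float):
        self.batch_time_s = batch_time_s   # cluster-wide event batch time
        self.last_commit_at = now          # when the last commit event arrived

    def on_commit(self, now: float) -> None:
        # Called whenever a commit event from the master node is received.
        self.last_commit_at = now

    def must_leave(self, now: float) -> bool:
        # True when no commit event arrived within the preset time.
        return now - self.last_commit_at > self.batch_time_s


watchdog = CommitWatchdog(batch_time_s=3.0, now=0.0)
watchdog.on_commit(now=2.5)
on_time = watchdog.must_leave(now=5.0)   # 2.5 s since the last commit event
overdue = watchdog.must_leave(now=6.0)   # 3.5 s since the last commit event
```

When `must_leave` returns true, the node would exit the cluster and start its rejoin procedure; that procedure is outside the scope of this sketch.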
While the current events batch of the cluster is being processed, new cluster events accumulate but are not processed immediately, because the cluster processes events sequentially in a single thread. After all nodes in the cluster (including the boss node) have received and processed the commit event, the business events in the events batch are formally processed. After that batch of events has been completely processed, the commit event inserted by the boss node is processed; the action corresponding to the commit event is to persist the final state of the cluster into hardendata (persistent data).
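The ordering described above (business events first, then one persistence action for the commit event, with newly arriving events left in a queue) might look like the following sketch. Modelling events as `(key, value)` configuration updates and using a Python list as a stand-in for non-volatile storage are assumptions made purely for illustration.

```python
from collections import deque


def run_batch(state, batch, persist):
    """Apply a batch's business events in order, then handle its commit event."""
    for name, value in batch:          # sequential, single-threaded processing
        state[name] = value
    persist(dict(state))               # commit action: persist the final state
    return state


incoming = deque()                     # events arriving while a batch is running
snapshots = []                         # stands in for non-volatile storage

incoming.append(("volume_count", 4))   # accumulated, not processed yet
state = run_batch({}, [("volume_count", 3), ("host_count", 1)], snapshots.append)
```

Only the final state of the batch is written, so intermediate states inside a batch are never visible in persistent storage.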
Events batch plays its role when the cluster crashes unexpectedly. When a node exits unexpectedly and later rejoins the cluster, the node replays (reapplies) the events batch it was processing when it exited and, after that processing is completed, exchanges its current event batch sequence with the cluster.
If the deviation of the current event batch sequence is less than or equal to 16 event batches, the cluster initiates synchronization so that the state of the returned node is brought into line with the cluster state; if the deviation is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
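The recovery decision reduces to a comparison against the threshold of 16. The sketch below assumes the deviation is the difference between the cluster's sequence number and the returning node's sequence number, and treats a node that is ahead of the cluster as an error; neither detail is stated in the source, and the function and constant names are hypothetical.

```python
MAX_BATCH_LAG = 16  # threshold stated in the description


def recovery_action(node_seq: int, cluster_seq: int) -> str:
    """Choose how a returning node catches up with the cluster."""
    deviation = cluster_seq - node_seq
    if deviation < 0:
        # Assumption: a node ahead of the cluster is not a valid state.
        raise ValueError("returning node is ahead of the cluster")
    # Small lag: synchronize; large lag: replicate the other nodes' state.
    return "sync" if deviation <= MAX_BATCH_LAG else "replicate"


at_limit = recovery_action(node_seq=100, cluster_seq=116)    # lag of 16
past_limit = recovery_action(node_seq=100, cluster_seq=117)  # lag of 17
```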
Example 2
Based on the method for persisting data during storage cluster runtime provided by the present invention, an apparatus for persisting data during storage cluster runtime is also provided. Fig. 2 shows a schematic diagram of the apparatus.
The sending and generation module is used by the cluster master node to sequentially send commit events to the cluster slave nodes according to a preset time and to generate event batches;
and the persistence execution module, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
A processing device for storage cluster runtime data persistence further comprises:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of data persistence when a storage cluster is run.
By changing how configuration data is persisted, from persisting once per event or persisting the cluster state at fixed intervals to persisting per events batch, the invention allows the cluster state to be synchronized and persisted among the nodes stage by stage. This enhances the stability of the cluster and improves its persistence efficiency. The granularity of cluster recovery changes from individual events to events batches, so the stage to which a node's state is restored is clearer.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the present invention is not limited thereto. Various modifications and alterations will occur to those skilled in the art based on the foregoing description. It is neither necessary nor possible to enumerate all embodiments here. On the basis of the technical scheme of the invention, various modifications or changes that a person skilled in the art can make without creative effort remain within the protection scope of the invention.

Claims (10)

1. A method for storage cluster runtime data persistence, comprising the steps of:
the cluster master node sequentially sends commit events to the cluster slave nodes according to preset time and generates event batch processing;
and after a cluster slave node receives the commit event, it generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
2. The method for data persistence during runtime of a storage cluster as claimed in claim 1, wherein the cluster master node also generates a first event batch processing sequence from the events between the first commit event and the second commit event, according to the commit events it has sent, and persists the first event batch processing sequence into the first data.
3. The method for storage cluster runtime data persistence as claimed in claim 1 or 2, wherein the cluster slave node or the cluster master node also includes the second commit event in the first event batch processing sequence and persists it into the first data.
4. The method for data persistence during runtime of a storage cluster according to claim 1, wherein, when the cluster master node sequentially sends commit events to the cluster slave nodes according to the preset time, the method further comprises: if a cluster slave node does not receive the commit event sent by the cluster master node within the preset time, that slave node automatically exits the cluster and rejoins it.
5. The storage cluster runtime data persistence method of claim 1, wherein the preset time is the time specified for an event batch, and that time is 2 seconds to 3 seconds.
6. The method of claim 1, wherein when the cluster crashes and a node exits the cluster, and the exited node later rejoins the cluster, the node replays the event batch it was processing when it exited and, after that processing is complete, exchanges its current event batch processing sequence with the cluster.
7. The method for data persistence during runtime of a storage cluster according to claim 6, wherein if the deviation of the current event batch processing sequence is less than or equal to 16 event batches, the cluster initiates synchronization so that the state of the returned node is brought into line with the cluster state;
if the deviation of the current event batch processing sequence is more than 16 event batches, the cluster state of the nodes other than the returned node is replicated.
8. An apparatus for storage cluster runtime data persistence, comprising:
a sending and generation module, by which the cluster master node sequentially sends commit events to the cluster slave nodes according to a preset time and generates event batches;
and a persistence execution module which, after a cluster slave node receives the commit event, generates a first event batch processing sequence from the events between the first commit event and the second commit event, and persists the first event batch processing sequence into the first data.
9. Processing device for storage cluster runtime data persistence, further comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method for storage cluster runtime data persistence of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for storage cluster runtime data persistence according to one of claims 1 to 7.
CN202010846365.1A 2020-08-21 2020-08-21 Method, apparatus, device and medium for data persistence during storage cluster runtime Withdrawn CN112068994A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010846365.1A CN112068994A (en) 2020-08-21 2020-08-21 Method, apparatus, device and medium for data persistence during storage cluster runtime
PCT/CN2021/096686 WO2022037173A1 (en) 2020-08-21 2021-05-28 Method and apparatus for data persistence in storage cluster runtime, and device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010846365.1A CN112068994A (en) 2020-08-21 2020-08-21 Method, apparatus, device and medium for data persistence during storage cluster runtime

Publications (1)

Publication Number Publication Date
CN112068994A true CN112068994A (en) 2020-12-11

Family

ID=73662555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010846365.1A Withdrawn CN112068994A (en) 2020-08-21 2020-08-21 Method, apparatus, device and medium for data persistence during storage cluster runtime

Country Status (2)

Country Link
CN (1) CN112068994A (en)
WO (1) WO2022037173A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022037173A1 (en) * 2020-08-21 2022-02-24 苏州浪潮智能科技有限公司 Method and apparatus for data persistence in storage cluster runtime, and device and medium
CN114969072A (en) * 2022-06-06 2022-08-30 北京友友天宇系统技术有限公司 Data transmission method, device and equipment based on state machine and data persistence

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8521854B2 (en) * 2010-08-06 2013-08-27 International Business Machines Corporation Minimising network resource overhead consumption by reports from one or more agents distributed in an electronic data network of nodes
CN103412768A (en) * 2013-07-19 2013-11-27 蓝盾信息安全技术股份有限公司 Zookeeper cluster automatic-deployment method based on script program
CN106528574A (en) * 2015-09-14 2017-03-22 阿里巴巴集团控股有限公司 Data synchronization method and device
CN110209726B (en) * 2018-02-12 2023-10-20 金篆信科有限责任公司 Distributed database cluster system, data synchronization method and storage medium
CN111400268B (en) * 2020-03-13 2022-06-17 清华大学 Log management method of distributed persistent memory transaction system
CN112068994A (en) * 2020-08-21 2020-12-11 苏州浪潮智能科技有限公司 Method, apparatus, device and medium for data persistence during storage cluster runtime

Also Published As

Publication number Publication date
WO2022037173A1 (en) 2022-02-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201211