CN118331496A - Arbitration method and device for storage cluster, computer equipment and storage medium - Google Patents

Publication number
CN118331496A
Authority
CN
China
Prior art keywords
arbitration
cluster
node
storage
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410451312.8A
Other languages
Chinese (zh)
Inventor
周昭飞
穆向东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Jinan data Technology Co ltd
Original Assignee
Inspur Jinan data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Jinan data Technology Co ltd filed Critical Inspur Jinan data Technology Co ltd
Priority to CN202410451312.8A priority Critical patent/CN118331496A/en
Publication of CN118331496A publication Critical patent/CN118331496A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of storage clusters and discloses an arbitration method and device for a storage cluster, computer equipment, and a storage medium. The method comprises: selecting at least one node from the local storage cluster as a local arbitration node, and determining at least one remote arbitration node and at least one third-party arbitration node; creating an arbitration cluster comprising the local arbitration node, the remote arbitration node, and the third-party arbitration node, wherein the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster; initiating an arbitration request to the arbitration cluster when an abnormality exists; and determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster. Because arbitration is performed by an arbitration cluster, arbitration reliability is high, and high availability is provided for both the storage system and the arbitration service.

Description

Arbitration method and device for storage cluster, computer equipment and storage medium
Technical Field
The present invention relates to the field of storage clusters, and in particular, to a storage cluster arbitration method, a storage cluster arbitration device, a computer device, and a storage medium.
Background
A storage system generally stores key service data, and data backup is generally performed to ensure the stability and reliability of that data. To improve the utilization of storage resources, a dual-active scheme for distributed block storage is generally adopted, in which two data centers back each other up.
When the link between the two data centers fails, or one of the data centers fails, the two data centers cannot be synchronized in real time and only one end can continue to provide service. To ensure data consistency, the dual-active scheme uses arbitration to decide which data center continues to provide service.
Current arbitration generally uses either a static priority mode or an independent arbitration mode. Both modes have limitations and make it difficult to use storage resources effectively.
Disclosure of Invention
In view of the above, the present invention provides a method, apparatus, computer device and storage medium for arbitration of a storage cluster, so as to solve the problem that the arbitration mode of a dual-active storage system is difficult to effectively utilize storage resources.
In a first aspect, the present invention provides a method for arbitrating a storage cluster, including:
selecting at least one node from the local storage cluster as a local arbitration node, and determining at least one remote arbitration node and at least one third-party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster;
creating an arbitration cluster comprising the local arbitration node, the remote arbitration node, and the third-party arbitration node; in the arbitration cluster, the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster;
when an abnormality exists, initiating an arbitration request to the arbitration cluster to instruct the arbitration cluster to arbitrate between the local storage cluster and the remote storage cluster; and
determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
In some alternative embodiments, the method further comprises:
acquiring a configuration file of the arbitration cluster; the configuration file includes configuration information of a plurality of arbitration nodes in the arbitration cluster, the configuration information including at least one of: node name, node address, port number, and data storage directory; and
storing the configuration file in the storage nodes that are used for storing data in a storage cluster, so that each storage node connects to the arbitration cluster based on the configuration file.
In some alternative embodiments, the storage node is further configured to perform read-write operations on the data in the arbitration cluster after connecting to the arbitration cluster based on the configuration file.
In some alternative embodiments, the method further comprises:
Generating an editing instruction for editing the arbitration cluster, and sending the editing instruction to the arbitration cluster to instruct the arbitration cluster to edit the corresponding arbitration node according to the editing instruction; the editing instruction comprises at least one of adding arbitration nodes, updating arbitration nodes and deleting arbitration nodes;
And/or generating a deleting instruction for deleting the arbitration cluster, and deleting the arbitration cluster according to the deleting instruction.
In some alternative embodiments, one arbitration node in the arbitration cluster is the currently elected main arbitration node, and the other arbitration nodes are standby arbitration nodes; the main arbitration node is used for feeding back arbitration results.
Initiating an arbitration request to the arbitration cluster includes:
sending the arbitration request to the main arbitration node;
or sending the arbitration request to a standby arbitration node, which redirects the arbitration request to the main arbitration node.
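The two request paths described above can be illustrated with a minimal in-memory sketch. The class name `ArbitrationNode`, its fields, the node names, and the result string are all hypothetical and chosen only for illustration; the source does not specify any interface or transport.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArbitrationNode:
    """Hypothetical in-memory stand-in for one arbitration node."""
    name: str
    is_main: bool = False
    main: Optional["ArbitrationNode"] = None      # main node known to a standby
    handled: list = field(default_factory=list)   # requests this node answered

    def handle(self, request: str) -> str:
        if self.is_main:
            # Only the main arbitration node feeds back an arbitration result.
            self.handled.append(request)
            return f"result-from-{self.name}"
        if self.main is None:
            raise RuntimeError("no main arbitration node is currently known")
        # A standby arbitration node redirects the request to the main node.
        return self.main.handle(request)


main = ArbitrationNode("arb-3-1", is_main=True)
standby = ArbitrationNode("arb-1-3", main=main)
```

In this sketch, `main.handle(...)` and `standby.handle(...)` return the same result, because the standby node only forwards the request.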
In some alternative embodiments, the arbitration result of the arbitration cluster is generated by:
selecting one arbitration node from the arbitration cluster as the main arbitration node;
the main arbitration node periodically sends heartbeat messages to the other, standby arbitration nodes;
if at least one standby arbitration node has not received a heartbeat message within a preset time, that standby arbitration node initiates a new round of election and instructs the other standby arbitration nodes to vote;
according to the votes of the standby arbitration nodes, one of the at least one standby arbitration node is elected as the new main arbitration node;
after the main arbitration node is determined, the main arbitration node acquires an arbitration request in key-value form initiated by the local storage cluster; the arbitration request includes a key and a corresponding first value;
the main arbitration node determines, in its local storage, a second value corresponding to the key of the arbitration request, and compares whether the first value is the same as the second value;
if the first value is different from the second value, the main arbitration node attempts to correct the first value to the second value; and
if the correction fails, it is determined that the local storage cluster is abnormal.
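The heartbeat timeout check and the key-value comparison above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the function names, the `try_correct` callback, the treatment of a missing key as `None`, and the example key and values are all hypothetical, and the election vote itself is not modeled.

```python
def heartbeat_timed_out(last_heartbeat: float, now: float, timeout: float) -> bool:
    """A standby node would start a new election round when this is True."""
    return now - last_heartbeat > timeout


def arbitrate(main_store: dict, key: str, first_value, try_correct) -> bool:
    """Return True if the requesting storage cluster is judged normal.

    main_store  : key-value data held locally by the main arbitration node
    first_value : value carried in the arbitration request
    try_correct : hypothetical callback that tries to replace the requester's
                  value with the main node's value; returns True on success
    """
    second_value = main_store.get(key)  # assumption: a missing key reads as None
    if first_value == second_value:
        return True
    # Values differ: attempt correction; a failed correction marks the
    # requesting cluster as abnormal.
    return bool(try_correct(key, second_value))
```

For example, with `main_store = {"volume-7": "cluster-A"}`, a request carrying `"cluster-B"` is judged normal only if the correction callback succeeds.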
In some alternative embodiments, the total number of arbitration nodes in the arbitration cluster is an odd number.
In a second aspect, the present invention provides an arbitration device for a storage cluster, including:
a selecting module, configured to select at least one node from the local storage cluster as a local arbitration node, and to determine at least one remote arbitration node and at least one third-party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster;
an arbitration cluster creation module, configured to create an arbitration cluster including the local arbitration node, the remote arbitration node, and the third-party arbitration node; in the arbitration cluster, the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster;
a request module, configured to initiate an arbitration request to the arbitration cluster when an abnormality exists, so as to instruct the arbitration cluster to arbitrate between the local storage cluster and the remote storage cluster; and
a processing module, configured to determine whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
In a third aspect, the present invention provides a computer device comprising a memory and a processor that are communicatively connected to each other; the memory stores computer instructions, and the processor executes the computer instructions so as to perform the arbitration method of the storage cluster of the first aspect or of any implementation corresponding to the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions for causing a computer to perform the arbitration method of a storage cluster according to the first aspect or any one of the embodiments corresponding thereto.
According to the arbitration method and device, computer equipment, and storage medium provided by the invention, a certain number of nodes are selected from the local storage cluster, the remote storage cluster, and the third-party nodes to serve as arbitration nodes, and an arbitration cluster is built from them. Because the number of arbitration nodes of each type is less than half of the total number of arbitration nodes, the resource utilization of the storage system is preserved, and the arbitration cluster can still provide the arbitration service when the arbitration nodes of any one type are abnormal. The arbitration function is therefore unaffected, arbitration reliability is high, and high availability is provided for both the storage system and the arbitration service. A time identifier is set for the data, so reads and writes do not need to interact with all arbitration nodes, which improves read-write efficiency. The arbitration cluster can elect a suitable main arbitration node and arbitrate through it, which keeps the processing simple.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the related art, the drawings that are required to be used in the description of the embodiments or the related art will be briefly described, and it is apparent that the drawings in the description below are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
FIG. 1 is a flow chart of a method of arbitration for a storage cluster according to an embodiment of the invention;
FIG. 2 is a schematic diagram of creating an arbitration cluster, according to an embodiment of the invention;
FIG. 3 is a flow chart of another method of arbitration for a storage cluster according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a management arbitration cluster, according to an embodiment of the invention;
FIG. 5 is a block diagram of an arbitration device for a storage cluster according to an embodiment of the invention;
FIG. 6 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
With the rapid development of information technology and the advance of digitization in every industry, data has gradually become the operational core of enterprises. Storage systems play an increasingly important role in the key business of many industries, and users place ever higher stability requirements on the storage systems that carry their data. Especially in fields such as communication, finance, medical care, e-commerce, logistics, and government, an interruption of storage service can lead to the loss of important data and cause huge economic loss. Ensuring business continuity is therefore key to the construction of storage systems.
A traditional disaster recovery system comprises a production data center (DC) and a disaster recovery data center. The disaster recovery center is normally inactive and is started only when a disaster occurs and the production center is paralyzed. Such disaster recovery systems face the following challenges: when the production center fails, for example because of a power failure, fire, flood, or earthquake, services must be switched manually to the disaster recovery center, the service interruption is long, and continuous operation of the business cannot be ensured. In addition, under normal conditions the disaster recovery center is used only for data backup, so its resource utilization is low.
In a block-level dual-active storage system, the two data centers back each other up and both are in the running state. When equipment in one data center fails, or even the whole data center fails, the system automatically switches to the other data center, which solves the problems that a conventional disaster recovery center cannot carry services and that services cannot be switched automatically. This provides users with a high level of data reliability and service continuity and improves the resource utilization of the storage system. Active-active (AA, dual-active) read-write access is built on two high-availability (HA) clusters. If the data center of either cluster fails, no data is lost and the system automatically switches to the other cluster. The recovery point objective (RPO) is 0, meaning that no data is lost after fault recovery and the data can be restored to the latest available backup point in time; the recovery time objective (RTO) is approximately 0, meaning that service is restored quickly when a fault occurs, so the continuity of upper-layer services is ensured.
Specifically, a cross-site virtual volume is virtually formed based on two sets of storage clusters, the data of the virtual volume are synchronized in real time between the two storage clusters, and the two sets of storage clusters can process I/O (input/output) read-write requests of a computing node at the same time, so that the virtual volume has good elastic expansion capability; for large-scale application clients, each storage cluster can be configured with a plurality of nodes, and each node can share the load of data synchronization so as to meet the subsequent service growth.
The dual-active scheme supports two arbitration modes, static priority arbitration and independent server arbitration (the static priority mode and the independent arbitration mode), and can switch between them automatically. When either data center fails, the system automatically detects the failure and initiates arbitration, and services are switched automatically to the other data center, so service continuity is ensured and the switch is transparent to the services. For example, the remote storage cluster is kept as a real-time backup; after the service data in the data center of the local storage cluster fails, service continues through the remote storage cluster, which avoids losses to the customer caused by the data failure of a single site. Because services switch automatically when equipment in a data center fails, or even when the whole data center fails, the long service interruption caused by manual switching after a site failure is avoided. Both data centers can provide services at the same time, which improves resource utilization.
However, the arbitration modes currently in use have limitations. In the static priority mode, the two ends of a dual-active volume are designated as the preferred end and the non-preferred end. If the non-preferred end fails, the preferred end provides service; if the preferred end fails, neither end provides service. This wastes the resources of the non-preferred end, because its volume is normal and could carry services. In the independent arbitration mode, an independent arbitration server is deployed. When one end of the dual-active volume fails, both ends initiate arbitration, and the arbitration result determines which end's volume wins and continues to provide service, while the other end stops providing service. However, if the node hosting the arbitration server is abnormal (for example, it is down or powered off) and one end of the dual-active volume then fails, no arbitration result can be obtained (or the system falls back to the static priority mode), which affects normal service operation.
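The limitations of the two existing modes can be made concrete with a small sketch. The function names and the string labels for the two ends are hypothetical, and the optional fallback from the independent mode to the static priority mode mentioned above is not modeled.

```python
from typing import Optional


def static_priority_survivor(failed_end: str) -> Optional[str]:
    """Which end keeps serving under static priority after one end fails."""
    if failed_end == "non_preferred":
        return "preferred"
    if failed_end == "preferred":
        # Neither end serves, although the non-preferred volume is healthy.
        return None
    raise ValueError("failed_end must be 'preferred' or 'non_preferred'")


def independent_survivor(failed_end: str, arbiter_up: bool) -> Optional[str]:
    """Which end keeps serving when a single arbitration server decides."""
    if failed_end not in ("preferred", "non_preferred"):
        raise ValueError("failed_end must be 'preferred' or 'non_preferred'")
    if not arbiter_up:
        # Single point of failure: no arbitration result can be obtained.
        return None
    return "non_preferred" if failed_end == "preferred" else "preferred"
```

In both modes there is a failure combination in which a healthy end is left unused, which is the situation the arbitration cluster is meant to avoid.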
The embodiment of the invention provides an arbitration method for a storage cluster, which is characterized in that arbitration nodes are selected from a dual-activity storage cluster and a third party node to form an arbitration cluster, the arbitration for the dual-activity storage cluster is realized based on the arbitration cluster, the high availability of a dual-activity system is guaranteed to a greater extent, and the arbitration capability of the dual-activity system is improved.
According to an embodiment of the present invention, there is provided an arbitration method embodiment for a storage cluster, it being noted that the steps shown in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order other than that shown or described herein.
In this embodiment, an arbitration method for a storage cluster is provided, which may be applied to a certain storage cluster in a dual-active storage cluster, and in particular, may be executed by management software of the storage cluster. Fig. 1 is a flowchart of a method of arbitration of a storage cluster according to an embodiment of the present invention, as shown in fig. 1, the flowchart including the following steps.
Step S101, selecting at least one node from a local storage cluster as a local arbitration node, and determining at least one remote arbitration node and at least one third-party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster.
In this embodiment, two storage clusters form a cross-site dual-active storage cluster. Two volumes of the same size, one in the storage cluster at each end, can form a dual-active volume whose data is synchronized in real time. Both ends can process read-write requests from an application server at the same time, giving the application server undifferentiated active-active (AA) parallel access. The dual-active storage cluster comprises two storage clusters; for convenience of description, the storage cluster performing the method is referred to as the local storage cluster, and the other storage cluster is referred to as the remote storage cluster.
The local storage cluster and the remote storage cluster each comprise a plurality of nodes, each of which is originally a storage node used for storing data. In this embodiment, one or more nodes are selected from the local storage cluster to serve as arbitration nodes, i.e., local arbitration nodes, and one or more nodes are likewise selected from the remote storage cluster to serve as arbitration nodes, i.e., remote arbitration nodes. In addition, one or more nodes are selected from the third-party nodes as arbitration nodes, i.e., third-party arbitration nodes.
It is understood that the local storage cluster and the remote storage cluster are relative concepts. For example, the dual-active storage cluster includes a storage cluster a and a storage cluster B, where the storage cluster a may be a local storage cluster and the storage cluster B is a remote storage cluster; or the storage cluster B may be a local storage cluster, and the storage cluster a is a remote storage cluster. In other words, any one of the dual active storage clusters may be used as the local storage cluster to execute the method provided in this embodiment.
Step S102, creating an arbitration cluster comprising the local arbitration node, the remote arbitration node, and the third-party arbitration node; in the arbitration cluster, the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster.
In this embodiment, after determining the local arbitration node, the remote arbitration node, and the third party arbitration node, the corresponding arbitration cluster may be created by combining these arbitration nodes.
FIG. 2 illustrates a schematic diagram of creating an arbitration cluster. As shown in FIG. 2, the local storage cluster includes three nodes, namely storage node 1-1, storage node 1-2, and storage node 1-3, some of which may be selected as local arbitration nodes; FIG. 2 takes storage node 1-3 as the local arbitration node. The remote storage cluster also includes three nodes, namely storage node 2-1, storage node 2-2, and storage node 2-3, some of which may be selected as remote arbitration nodes; FIG. 2 takes storage node 2-3 as the remote arbitration node.
In addition, at least one third-party node is selected from a third-party site as a third-party arbitration node; FIG. 2 takes third-party node 3-1 as an example. As shown in FIG. 2, an arbitration cluster may be constructed from storage node 1-3, storage node 2-3, and third-party node 3-1.
In this embodiment, in the arbitration cluster, the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster. That is, if the number of local arbitration nodes is N1, the number of remote arbitration nodes is N2, the number of third-party arbitration nodes is N3, and the total number of arbitration nodes in the arbitration cluster is N, then N = N1 + N2 + N3, and N1, N2, and N3 are each smaller than N/2.
If the number of third-party arbitration nodes is 1, the number of local arbitration nodes and the number of remote arbitration nodes are the same: with N3 = 1, the condition N1 < N/2 gives N1 < N2 + 1, i.e., N1 ≤ N2, and by symmetry N2 ≤ N1.
Optionally, the total number of arbitration nodes N in the arbitration cluster is odd to maximize utilization of the arbitration nodes.
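The size constraint on the arbitration cluster can be checked mechanically. The following is a minimal sketch; the function name is hypothetical, and the check for at least one node of each type reflects the "at least one" wording of step S101.

```python
def is_valid_arbitration_cluster(n_local: int, n_remote: int, n_third: int) -> bool:
    """Return True if every type holds fewer than half of all arbitration nodes."""
    counts = (n_local, n_remote, n_third)
    if any(c < 1 for c in counts):
        return False  # the method requires at least one node of each type
    total = sum(counts)
    # Compare 2 * c < total instead of c < total / 2 to stay in integers.
    return all(2 * c < total for c in counts)
```

For instance, the configuration of FIG. 2 (one node of each type) passes the check, while two local, one remote, and one third-party node fails it, because the two local nodes make up exactly half of the four arbitration nodes.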
Step S103, in case of abnormality, an arbitration request is initiated to the arbitration cluster to instruct the arbitration cluster to arbitrate the local storage cluster and the remote storage cluster.
In this embodiment, if the local arbitration node determines that there is an abnormality, for example an abnormality of a volume created at the local end (such as a read-write abnormality) or a link abnormality, it may initiate an arbitration request to the arbitration cluster, so that the arbitration cluster arbitrates between the local storage cluster and the remote storage cluster and determines whether each of them can continue to provide the storage service.
In this embodiment, because arbitration is performed by the arbitration cluster, no priority distinction needs to be made between the local storage cluster and the remote storage cluster; that is, the two have the same priority. If the arbitration cluster judges that both the local storage cluster and the remote storage cluster can work normally, both continue to provide service in dual-active mode. If one of them is abnormal, the arbitration cluster can decide that the other storage cluster continues to provide service, so the resource utilization of the storage system is preserved.
In addition, because the number of each type of arbitration node (comprising three types of arbitration nodes, namely a local arbitration node, a remote arbitration node and a third party arbitration node) is less than half of the total number of arbitration nodes, when a certain storage cluster fails, the arbitration process of the arbitration cluster can be determined not to be affected based on the half principle; and when the third party arbitrating node is in fault, the arbitrating cluster can work normally and provide the arbitrating function normally, so that the reliability of the arbitrating can be ensured.
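The majority argument above can be verified numerically. The sketch below assumes that the arbitration cluster needs a strict majority of its nodes (more than half) to keep working, which is how the "half principle" is read here; the function name and the group labels are hypothetical.

```python
def arbitration_survives(counts: dict) -> dict:
    """For each node type, report whether a majority remains if that type fails."""
    total = sum(counts.values())
    majority = total // 2 + 1  # assumption: quorum is a strict majority
    return {group: total - n >= majority for group, n in counts.items()}
```

When every type holds fewer than N/2 nodes, losing any one type leaves more than N/2 nodes, so every entry is `True`. A cluster that violates the constraint, such as two local nodes out of four, loses its majority when the local storage cluster fails.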
Step S104, determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
In this embodiment, the local storage cluster may determine whether it can continue to provide the storage service based on the arbitration result of the arbitration cluster. For example, if the arbitration result indicates that the local storage cluster has not failed, the local storage cluster continues to provide service. Conversely, if the arbitration result indicates that the local storage cluster has failed, the local storage cluster suspends service and only the remote storage cluster provides service; if the remote storage cluster is also abnormal, the entire dual-active storage cluster suspends operation.
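The outcome described in this step can be summarized in a short sketch. The function name and the boolean inputs are hypothetical simplifications of the arbitration result, whose actual format the source does not specify.

```python
def serving_clusters(local_ok: bool, remote_ok: bool) -> set:
    """Return the set of storage clusters that keep providing service."""
    serving = set()
    if local_ok:
        serving.add("local")   # local cluster judged normal: keeps serving
    if remote_ok:
        serving.add("remote")  # remote cluster judged normal: keeps serving
    return serving             # empty set: the whole dual-active cluster pauses
```

Unlike the static priority mode, either end can continue alone here, because neither end is designated as preferred.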
According to the arbitration method for a storage cluster provided by this embodiment, a certain number of nodes are selected from the local storage cluster, the remote storage cluster, and the third-party nodes to serve as arbitration nodes, and an arbitration cluster is built from them. Because the number of arbitration nodes of each type is less than half of the total number of arbitration nodes, the resource utilization of the storage system is preserved, and the arbitration cluster can still provide the arbitration service when the arbitration nodes of any one type are abnormal. The arbitration function is therefore unaffected, arbitration reliability is high, and high availability is provided for both the storage system and the arbitration service.
In this embodiment, an arbitration method for a storage cluster is provided, which may be applied to a certain storage cluster in a dual-active storage cluster, and fig. 3 is a flowchart of an arbitration method for a storage cluster according to an embodiment of the present invention, and as shown in fig. 3, the flowchart includes the following steps.
Step S301, selecting at least one node from the local storage cluster as a local arbitration node, and determining at least one remote arbitration node and at least one third-party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster.
For details, refer to step S101 of the embodiment shown in FIG. 1; they are not repeated here.
Step S302, creating an arbitration cluster comprising the local arbitration node, the remote arbitration node, and the third-party arbitration node; in the arbitration cluster, the number of local arbitration nodes, the number of remote arbitration nodes, and the number of third-party arbitration nodes are each less than half of the total number of arbitration nodes in the arbitration cluster.
For details, refer to step S102 of the embodiment shown in FIG. 1; they are not repeated here.
Step S303, obtaining a configuration file of an arbitration cluster; the configuration file includes configuration information for a plurality of arbitration nodes within an arbitration cluster, the configuration information including: at least one of node name, node address, port number, data storage directory.
In this embodiment, after creating the arbitration cluster, each arbitration node may form corresponding configuration information, which includes at least one of a node name, a node address (e.g., IP address), a port number, and a data storage directory of the arbitration node; wherein the node name is generally unique. The configuration information of the arbitration nodes is combined into a configuration file, and the configuration file is issued to each storage node of the storage clusters (including the local storage cluster and the remote storage cluster). The storage node is a node in the storage cluster other than the arbitration node for storing data.
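A configuration file of this kind could look like the following sketch. The source does not specify a file format, so the use of JSON, the field names, the example addresses, port, and directory, and the strict uniqueness check on node names are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ArbiterConfig:
    """Configuration information of one arbitration node (hypothetical schema)."""
    name: str
    address: str
    port: int
    data_dir: str


def build_config_file(nodes: list) -> str:
    """Combine per-node configuration information into one configuration file."""
    names = [n.name for n in nodes]
    if len(set(names)) != len(names):
        raise ValueError("arbitration node names must be unique")
    return json.dumps({"arbitration_nodes": [asdict(n) for n in nodes]}, indent=2)


def load_config_file(text: str) -> list:
    """Parse a configuration file as a storage node would before connecting."""
    return [ArbiterConfig(**entry) for entry in json.loads(text)["arbitration_nodes"]]


nodes = [
    ArbiterConfig("arb-1-3", "192.0.2.13", 7000, "/var/lib/arbiter"),
    ArbiterConfig("arb-2-3", "198.51.100.23", 7000, "/var/lib/arbiter"),
    ArbiterConfig("arb-3-1", "203.0.113.31", 7000, "/var/lib/arbiter"),
]
config_text = build_config_file(nodes)
```

Each storage node that receives `config_text` can call `load_config_file` to obtain the addresses and ports of the arbitration nodes it should connect to.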
It will be appreciated that editing operations on the arbitration cluster update the configuration file. For example, the configuration file is updated when an arbitration node is added or deleted, or when the configuration information of an arbitration node is modified.
Step S304, the configuration file is saved in a storage node used for storing data in the storage cluster, so that the storage node is connected with the arbitration cluster based on the configuration file.
After a storage node of the storage cluster acquires the configuration file, it can connect to the corresponding arbitration nodes in the arbitration cluster based on the configuration file, so as to interact with the arbitration cluster. For example, after a storage node connects to the arbitration cluster, it may initiate an arbitration request to the arbitration cluster.
In some alternative embodiments, the interaction procedure of the storage node with the arbitration cluster may comprise the following step A1.
Step A1, after connecting to the arbitration cluster based on the configuration file, performing read-write operations on data in the arbitration cluster.
In this embodiment, each storage node has a configuration file of an arbitration cluster, and the storage node is connected to the arbitration cluster through the configuration file, so that the storage node and the arbitration cluster may perform a certain read-write interaction operation, which may specifically include: the storage node writes data to the arbitration cluster, reads data from the arbitration cluster, modifies the data of the arbitration cluster, deletes the data of the arbitration cluster, and the like; in addition, if the write operation of the storage node has a certain constraint condition, the storage node may modify the data of the arbitration cluster according to the constraint condition, and delete the data of the arbitration cluster according to the constraint condition.
Optionally, the step A1 "performing a read/write operation on the data in the arbitration cluster" may include the following steps a11 to a12.
Step A11, performing a write operation on at least W arbitration nodes in the arbitration cluster to write the corresponding data; the written data includes a data body and a time identifier indicating the write order.
Step A12, performing a read operation on at least R arbitration nodes in the arbitration cluster, and taking, among the data read, the data with the latest time identifier as the valid data. Here R is greater than or equal to N-W+1, and N is the total number of arbitration nodes in the arbitration cluster.
The data of the arbitration nodes in the arbitration cluster is kept synchronized, so reading or writing data in the arbitration cluster would generally require a full operation, that is, a read or write on every arbitration node. In this embodiment, reads and writes may instead be performed on only some of the arbitration nodes.
Specifically, when the storage cluster writes data to the arbitration cluster, it only needs to write to W arbitration nodes in the arbitration cluster, where W is smaller than N; in addition to the data body, the written data includes a time identifier indicating the write order, which may specifically be a timestamp, a version number, or the like.
When the storage cluster reads from the arbitration cluster, it only needs to read R arbitration nodes in the arbitration cluster, where R is smaller than N and not less than N-W+1; typically R=N-W+1. After a write, W arbitration nodes hold the latest data (the data with the latest time identifier), so at most N-W arbitration nodes hold older data. Reading at least N-W+1 (i.e. R) arbitration nodes therefore always includes at least one node that holds the latest data; the data with the latest time identifier is the valid data, and the valid data can subsequently be stored.
In this embodiment, a time identifier is attached to the data, so that data can be read and written by operating on only some of the arbitration nodes; interaction with all arbitration nodes is not required, which can improve read-write efficiency.
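The partial read-write scheme of steps A11 and A12 can be sketched as follows. This is a simplified in-memory Python model under stated assumptions: the class and function names are hypothetical, the time identifier is modeled as an integer version number, and the write and read sets are chosen so that they overlap in exactly one node, which is the worst case for R = N - W + 1.

```python
from dataclasses import dataclass


@dataclass
class Record:
    body: str
    version: int  # time identifier: a larger value means a later write


class ArbNode:
    """In-memory stand-in for one arbitration node."""

    def __init__(self):
        self.store = {}


def quorum_write(nodes, w, key, body, version):
    """Write the record to w nodes (here simply the first w)."""
    if not 1 <= w <= len(nodes):
        raise ValueError("w must be between 1 and the number of nodes")
    for node in nodes[:w]:
        node.store[key] = Record(body, version)


def quorum_read(nodes, w, key):
    """Read R = N - W + 1 nodes and return the record with the latest version."""
    r = len(nodes) - w + 1
    chosen = nodes[-r:]  # the last r nodes: overlaps the write set in one node
    found = [node.store[key] for node in chosen if key in node.store]
    return max(found, key=lambda rec: rec.version) if found else None


nodes = [ArbNode() for _ in range(5)]          # N = 5
quorum_write(nodes, 5, "state", "old", 1)      # earlier full write
quorum_write(nodes, 3, "state", "new", 2)      # partial write with W = 3
print(quorum_read(nodes, 3, "state").body)     # new
```

Here R = 5 - 3 + 1 = 3, and the read set shares only one node with the write set, yet the record with the latest version is still returned.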
Further optionally, the local storage cluster may edit or delete the arbitration cluster in addition to creating the arbitration cluster. Specifically, the method may further include the following step B1 and/or step B2.
Step B1, generating an editing instruction for editing the arbitration cluster, and sending the editing instruction to the arbitration cluster to instruct the arbitration cluster to edit the corresponding arbitration node according to the editing instruction; the edit instruction includes at least one of adding an arbitration node, updating an arbitration node, and deleting an arbitration node.
Step B2, generating a deletion instruction for deleting the arbitration cluster, and deleting the arbitration cluster according to the deletion instruction.
In this embodiment, as shown in fig. 2, the management of the arbitration cluster may be implemented by management software of the storage clusters (including the local storage cluster and the remote storage cluster). The management software may be provided in a single node of the storage cluster, or each storage node may be provided with the management software, which is not limited in this embodiment.
FIG. 4 illustrates a schematic diagram of managing an arbitration cluster. As shown in fig. 4, the management software of the storage cluster may connect with the arbitration cluster, and may perform a management operation on the arbitration cluster, where the management operation specifically includes: creating an arbitration cluster, editing arbitration nodes (specifically including adding arbitration nodes, updating arbitration nodes, deleting arbitration nodes and the like), and deleting the arbitration cluster; and the management software can also be used for inquiring the state of the arbitration cluster.
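The editing operations on arbitration nodes (adding, updating and deleting) can be modeled as instructions applied to a table of node configurations. The Python sketch below is illustrative only; the instruction format with the fields op, name and config is an assumption and does not appear in the source.

```python
def apply_edit(nodes, instruction):
    """Return a new node table after applying one edit instruction.

    nodes maps a node name to its configuration dict; instruction is a dict
    whose "op" is "add", "update" or "delete" (hypothetical format).
    """
    op, name = instruction["op"], instruction["name"]
    updated = dict(nodes)  # leave the caller's table untouched
    if op == "add":
        if name in updated:
            raise ValueError(f"arbitration node {name!r} already exists")
        updated[name] = dict(instruction["config"])
    elif op == "update":
        if name not in updated:
            raise KeyError(name)
        updated[name] = {**updated[name], **instruction["config"]}
    elif op == "delete":
        if name not in updated:
            raise KeyError(name)
        del updated[name]
    else:
        raise ValueError(f"unknown edit operation {op!r}")
    return updated


table = {"arb-1": {"address": "192.0.2.11", "port": 7000}}
table = apply_edit(table, {"op": "add", "name": "arb-2",
                           "config": {"address": "198.51.100.11", "port": 7000}})
table = apply_edit(table, {"op": "update", "name": "arb-2", "config": {"port": 7001}})
table = apply_edit(table, {"op": "delete", "name": "arb-1"})
print(table)
```

Returning a new table instead of mutating the old one makes it straightforward to regenerate the configuration file only after an edit has been applied successfully.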
Step S305, in the case of an abnormality, initiating an arbitration request to the arbitration cluster to instruct the arbitration cluster to arbitrate between the local storage cluster and the remote storage cluster.
For details, refer to step S103 in the embodiment shown in fig. 1; they are not repeated here.
In this embodiment, the arbitration cluster currently has an elected master arbitration node, and the master arbitration node provides an arbitration function, i.e. feeds back an arbitration result to the storage cluster. Accordingly, when the local storage cluster is abnormal, the local storage cluster can send an arbitration request to the master arbitration node. Specifically, the above step S305 "initiate an arbitration request to an arbitration cluster" may include the following step C1 or step C2.
Step C1, sending the arbitration request to the master arbitration node. That is, the local storage cluster may send arbitration requests directly to the master arbitration node.
Step C2, sending the arbitration request to the standby arbitration node, and redirecting the arbitration request to the main arbitration node by the standby arbitration node. That is, the local storage cluster may also utilize the backup arbitration node to indirectly send an arbitration request to the master arbitration node so that the master arbitration node may perform an arbitration flow.
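Steps C1 and C2 can be sketched as a request handler that either serves the request or forwards it. In the following Python sketch the class, its attributes and the shape of the returned dictionary are assumptions made for illustration; the source states only that a standby node redirects the request to the master.

```python
class ArbitrationNode:
    """Minimal model of an arbitration node that may redirect requests."""

    def __init__(self, name, is_master=False):
        self.name = name
        self.is_master = is_master
        self.master = None  # on a standby node: reference to the current master

    def handle(self, request):
        if self.is_master:
            # Step C1: the master handles the arbitration request itself.
            return {"handled_by": self.name, "request": request}
        if self.master is None:
            raise RuntimeError("no master arbitration node is known")
        # Step C2: a standby node redirects the request to the master.
        return self.master.handle(request)


master = ArbitrationNode("arb-third-1", is_master=True)
standby = ArbitrationNode("arb-local-1")
standby.master = master

request = {"key": "cluster-state", "value": "active"}
print(master.handle(request)["handled_by"])   # arb-third-1
print(standby.handle(request)["handled_by"])  # arb-third-1
```

Either way the request reaches the master, so a storage node does not need to know in advance which arbitration node currently holds that role.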
Optionally, one arbitration node in the arbitration cluster is a currently elected main arbitration node, and the rest arbitration nodes are standby arbitration nodes. And, the process of generating the arbitration result by the arbitration cluster specifically includes the following steps D1 to D8.
Step D1, selecting an arbitration node from the arbitration cluster as a master arbitration node.
Initially, a master arbitration node may be selected from an arbitration cluster. For example, one of the third party nodes may be the master arbitration node.
Step D2, the main arbitration node sends heartbeat messages to other standby arbitration nodes at regular time.
After the main arbitration node is elected, the main arbitration node needs to send heartbeat messages to other standby arbitration nodes at regular time to inform the standby arbitration nodes, and the current main arbitration node can work normally.
Step D3, in the case that at least one standby arbitration node has not received the heartbeat message for longer than a preset duration, the at least one standby arbitration node initiates a new round of election requests, instructing the other standby arbitration nodes to hold an election.
If a standby arbitration node has not received the heartbeat message sent by the master arbitration node for a long time, that is, the time elapsed without a heartbeat message exceeds the preset duration, the current master arbitration node can be considered abnormal, and a new master arbitration node needs to be elected. Specifically, a standby arbitration node that determines that the current master arbitration node is abnormal may initiate a new round of election requests so that a master arbitration node can be reselected.
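The timeout check that a standby arbitration node performs on heartbeat messages can be sketched as below. This Python sketch uses hypothetical names and takes the current time as an argument so that the example is deterministic; a real node would more likely read a monotonic clock such as time.monotonic().

```python
class HeartbeatMonitor:
    """Tracks heartbeat messages from the master on one standby node."""

    def __init__(self, timeout, now):
        self.timeout = timeout        # preset duration, in seconds
        self.last_heartbeat = now     # time of the most recent heartbeat

    def on_heartbeat(self, now):
        """Record that a heartbeat message arrived at time `now`."""
        self.last_heartbeat = now

    def master_suspected(self, now):
        """True once no heartbeat has arrived for longer than the timeout."""
        return now - self.last_heartbeat > self.timeout


monitor = HeartbeatMonitor(timeout=3.0, now=0.0)
monitor.on_heartbeat(now=2.0)
print(monitor.master_suspected(now=4.0))  # False: only 2.0 s since the last heartbeat
print(monitor.master_suspected(now=5.5))  # True: 3.5 s exceeds the 3.0 s timeout
```

A standby node for which master_suspected returns True would then initiate the election request described in step D3.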
If a plurality of standby arbitration nodes determine that the main arbitration node is abnormal, the standby arbitration nodes can initiate election requests, but the election requests are election requests of the same round, namely only one main arbitration node is elected at the moment. For example, each arbitration node maintains a field of an election round, which is initially 1 (or 0, etc., depending on the actual situation), and is incremented by one each time the master arbitration node is redetermined; for a specific election round, no matter how many standby arbitration nodes initiate an election request, only one main arbitration node is selected finally, and after the main arbitration node is selected, the election rounds of all arbitration nodes are added by one.
Step D4, electing one of the at least one standby arbitration node as the new master arbitration node according to the election results of the standby arbitration nodes.
In this embodiment, during the election each standby arbitration node casts a vote for the master arbitration node of its choice; finally, based on the votes of all standby arbitration nodes, a new master arbitration node is elected by majority rule, that is, the minority defers to the majority.
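Majority voting with an election round counter can be sketched as follows. The Python code is a simplified model with hypothetical names; in particular, requiring a strict majority of a given electorate size is an assumption, since the source states only that the election follows majority rule.

```python
from collections import Counter


def elect_master(votes, electorate_size, current_round):
    """Tally one election round.

    votes maps each voting node to the candidate it chose. Returns
    (new_master, next_round); if no candidate has a strict majority of
    electorate_size, returns (None, current_round) so the round can be retried.
    """
    if not votes:
        return None, current_round
    candidate, count = Counter(votes.values()).most_common(1)[0]
    if 2 * count > electorate_size:
        # A master was chosen, so every node advances its election round by one.
        return candidate, current_round + 1
    return None, current_round


votes = {"arb-1": "arb-1", "arb-2": "arb-1", "arb-3": "arb-2"}
print(elect_master(votes, electorate_size=3, current_round=1))  # ('arb-1', 2)
```

Because at most one candidate can hold a strict majority, each round yields at most one master no matter how many standby nodes initiated the election.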
Step D5, after determining the main arbitration node, the main arbitration node acquires an arbitration request in the form of a key value pair initiated by the local storage cluster; the arbitration request includes a corresponding key and a first value.
In this embodiment, after determining the master arbitration node in step D1 or step D4, the local storage cluster may send an arbitration request to the arbitration cluster in due time, for example, based on the foregoing step C1 or step C2, etc., to send the arbitration request to the master arbitration node. Wherein the arbitration request is in the form of a key-value pair, i.e. the arbitration request includes a corresponding key (key) and a value (value), and for convenience of description, the value in the arbitration request is referred to as a first value. The key value pair may specifically represent the state of the local storage cluster, the stored data, and the like.
Step D6, the master arbitration node determines a second value corresponding to the key of the arbitration request in its local storage, and compares whether the first value is the same as the second value.
In this embodiment, corresponding data is also stored in the arbitration node, and is stored in the form of key value pairs. After the master arbitration node obtains the arbitration request, the master arbitration node can query the local storage based on the key (key) of the arbitration request to determine a value corresponding to the key (key), namely a second value, in the local storage, and then can compare the first value with the second value.
Step D7, when the first value is different from the second value, attempting to correct the first value to the second value.
Step D8, determining that the local storage cluster is abnormal if the correction fails.
In this embodiment, if the first value is the same as the second value, the master arbitration node may determine that the local storage cluster is normal, that is, the local storage cluster can still provide the storage service. If the first value is different from the second value, the local storage cluster may have a fault; in this embodiment, the state of the local storage cluster is then further determined by checking whether its first value can be corrected. Specifically, the master arbitration node attempts to modify the first value of the local storage cluster to the second value. If the modification succeeds, the local storage cluster is still considered normal; if the modification fails, the master arbitration node determines that the local storage cluster is abnormal, and the local storage cluster subsequently cannot continue to provide the storage service.
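The decision logic of steps D5 to D8 can be sketched as a single function. In the Python sketch below, the function name, the result strings and the try_correct callback are assumptions: the callback stands in for whatever mechanism the master arbitration node uses to modify the value held by the local storage cluster, which the source does not specify. The source also does not cover a key that is missing from the master's storage; this sketch lets that case raise KeyError.

```python
def arbitrate(master_store, key, first_value, try_correct):
    """Return "normal" or "abnormal" for the requesting storage cluster.

    master_store: key-value data held by the master arbitration node.
    try_correct(key, value): hypothetical callback that tries to set the
    requester's value to `value` and returns True on success.
    """
    second_value = master_store[key]  # KeyError if the key is unknown (assumption)
    if first_value == second_value:
        return "normal"
    # Step D7: the values differ, so attempt to correct the first value.
    if try_correct(key, second_value):
        return "normal"
    # Step D8: the correction failed.
    return "abnormal"


store = {"cluster-state": "v7"}
print(arbitrate(store, "cluster-state", "v7", lambda k, v: False))  # normal
print(arbitrate(store, "cluster-state", "v6", lambda k, v: True))   # normal
print(arbitrate(store, "cluster-state", "v6", lambda k, v: False))  # abnormal
```

Only the third case, a mismatch that cannot be corrected, leads to the local storage cluster being treated as abnormal.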
Step S306, determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
For details, refer to step S104 in the embodiment shown in fig. 1; they are not repeated here.
According to the arbitration method for the storage clusters, a certain number of nodes are selected from the local storage clusters, the remote storage clusters and the third party nodes to serve as arbitration nodes, arbitration clusters are built, arbitration is carried out based on the arbitration clusters, and arbitration reliability is high. The time mark is set for the data, interaction with all arbitration nodes is not needed, and the reading and writing efficiency can be improved; the arbitration cluster can select a proper main arbitration node, and can realize arbitration based on the main arbitration node, and the processing mode is simple.
In this embodiment, an arbitration device for a storage cluster is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, and will not be described in detail. As used below, the term "module" may be a combination of software and/or hardware that implements the intended function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
The present embodiment provides an arbitration device for a storage cluster, as shown in fig. 5, including:
A selecting module 501, configured to select at least one node from the local storage cluster as a local arbitration node, and determine at least one remote arbitration node and at least one third party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster;
An arbitration cluster creation module 502, configured to create an arbitration cluster including the local arbitration node, the remote arbitration node, and the third party arbitration node; in the arbitration cluster, the number of the local arbitration nodes, the number of the remote arbitration nodes and the number of the third party arbitration nodes are all less than half of the total number of arbitration nodes in the arbitration cluster;
A request module 503, configured to initiate an arbitration request to the arbitration cluster in the case of an abnormality, so as to instruct the arbitration cluster to arbitrate between the local storage cluster and the remote storage cluster;
A processing module 504, configured to determine whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
In some alternative embodiments, the apparatus further comprises a configuration module for:
acquiring a configuration file of the arbitration cluster; the configuration file includes configuration information of a plurality of arbitration nodes in the arbitration cluster, the configuration information including: at least one of node name, node address, port number, data storage directory;
And storing the configuration file into a storage node used for storing data in a storage cluster, so that the storage node is connected with the arbitration cluster based on the configuration file.
In some alternative embodiments, the storage node is further configured to:
And after the arbitration cluster is connected based on the configuration file, performing read-write operation on the data in the arbitration cluster.
In some alternative embodiments, the apparatus further comprises an instruction generation module for:
Generating an editing instruction for editing the arbitration cluster, and sending the editing instruction to the arbitration cluster to instruct the arbitration cluster to edit the corresponding arbitration node according to the editing instruction; the editing instruction comprises at least one of adding arbitration nodes, updating arbitration nodes and deleting arbitration nodes;
And/or generating a deleting instruction for deleting the arbitration cluster, and deleting the arbitration cluster according to the deleting instruction.
In some optional embodiments, one arbitration node in the arbitration cluster is a currently elected main arbitration node, and the other arbitration nodes are standby arbitration nodes; the main arbitration node is used for feeding back arbitration results;
the request module 503 initiates an arbitration request to the arbitration cluster, including:
transmitting an arbitration request to the master arbitration node;
or sending an arbitration request to the standby arbitration node, which redirects the arbitration request to the primary arbitration node.
In some alternative embodiments, the arbitration result of the arbitration cluster is generated by:
Selecting an arbitration node from the arbitration cluster as a main arbitration node;
the main arbitration node sends heartbeat messages to other standby arbitration nodes at regular time;
under the condition that at least one standby arbitration node does not receive the heartbeat message after exceeding a preset time length, the at least one standby arbitration node initiates a new round of election request, and other standby arbitration nodes are instructed to elect;
According to the election result of the standby arbitration node, electing one from the at least one standby arbitration node as a new main arbitration node;
after determining a main arbitration node, the main arbitration node acquires an arbitration request in the form of a key value pair initiated by the local storage cluster; the arbitration request includes a corresponding key and a first value;
The master arbitration node determining a second value corresponding to a key of the arbitration request in local storage and comparing whether the first value is the same as the second value;
attempting to correct the first value to the second value if the first value is different from the second value;
And under the condition of failure correction, determining that the local storage cluster is abnormal.
In some alternative embodiments, the total number of arbitration nodes in the arbitration cluster is an odd number.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The arbitration device of the storage cluster in this embodiment is presented in the form of functional units, where a unit may be an ASIC (Application Specific Integrated Circuit), a processor and memory executing one or more software or firmware programs, and/or another device that can provide the above functions.
The embodiment of the invention also provides computer equipment, which is provided with the arbitration device of the storage cluster shown in the figure 5.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention, as shown in fig. 6, the computer device includes: one or more processors 10, memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the computer device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 6.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
The memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform the methods shown in the above embodiments.
The memory 20 may include a storage program area that may store an operating system, at least one application program required for functions, and a storage data area; the storage data area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, memory 20 may optionally include memory located remotely from processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
The embodiments of the present invention also provide a computer readable storage medium. The methods according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code that can be recorded on a storage medium, or as computer code that is originally stored on a remote storage medium or a non-transitory machine readable storage medium, downloaded through a network and stored on a local storage medium, so that the methods described herein can be carried out by such software stored on a storage medium using a general purpose computer, a special purpose processor, or programmable or special purpose hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk or the like; further, the storage medium may also comprise a combination of memories of the kinds described above. It will be appreciated that a computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (10)

1. An arbitration method for a storage cluster, applied to a local storage cluster, the method comprising:
Selecting at least one node from the local storage cluster as a local arbitration node, and determining at least one remote arbitration node and at least one third party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster;
Creating an arbitration cluster comprising the local arbitration node, the remote arbitration node and the third party arbitration node; in the arbitration cluster, the number of the local arbitration nodes, the number of the remote arbitration nodes and the number of the third party arbitration nodes are all less than half of the total number of arbitration nodes in the arbitration cluster;
In case of abnormality, initiating an arbitration request to the arbitration cluster to instruct the arbitration cluster to arbitrate the local storage cluster and the remote storage cluster;
and determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
2. The method as recited in claim 1, further comprising:
acquiring a configuration file of the arbitration cluster; the configuration file includes configuration information of a plurality of arbitration nodes in the arbitration cluster, the configuration information including: at least one of node name, node address, port number, data storage directory;
And storing the configuration file into a storage node used for storing data in a storage cluster, so that the storage node is connected with the arbitration cluster based on the configuration file.
3. The method of claim 2, wherein the storage node is further configured to:
And after the arbitration cluster is connected based on the configuration file, performing read-write operation on the data in the arbitration cluster.
4. The method as recited in claim 1, further comprising:
Generating an editing instruction for editing the arbitration cluster, and sending the editing instruction to the arbitration cluster to instruct the arbitration cluster to edit the corresponding arbitration node according to the editing instruction; the editing instruction comprises at least one of adding arbitration nodes, updating arbitration nodes and deleting arbitration nodes;
And/or generating a deleting instruction for deleting the arbitration cluster, and deleting the arbitration cluster according to the deleting instruction.
5. The method of claim 1, wherein one arbitration node in the arbitration cluster is a currently elected master arbitration node, and the remaining arbitration nodes are standby arbitration nodes; the main arbitration node is used for feeding back arbitration results;
The initiating an arbitration request to the arbitration cluster includes:
transmitting an arbitration request to the master arbitration node;
or sending an arbitration request to the standby arbitration node, which redirects the arbitration request to the primary arbitration node.
6. The method of claim 5, wherein the arbitration result for the arbitration cluster is generated by:
Selecting an arbitration node from the arbitration cluster as a main arbitration node;
the main arbitration node sends heartbeat messages to other standby arbitration nodes at regular time;
under the condition that at least one standby arbitration node does not receive the heartbeat message after exceeding a preset time length, the at least one standby arbitration node initiates a new round of election request, and other standby arbitration nodes are instructed to elect;
According to the election result of the standby arbitration node, electing one from the at least one standby arbitration node as a new main arbitration node;
after determining a main arbitration node, the main arbitration node acquires an arbitration request in the form of a key value pair initiated by the local storage cluster; the arbitration request includes a corresponding key and a first value;
The master arbitration node determining a second value corresponding to a key of the arbitration request in local storage and comparing whether the first value is the same as the second value;
attempting to correct the first value to the second value if the first value is different from the second value;
And under the condition of failure correction, determining that the local storage cluster is abnormal.
7. The method of claim 1, wherein the total number of arbitration nodes in the arbitration cluster is an odd number.
8. An arbitration device for a storage cluster, the device comprising:
The selecting module is used for selecting at least one node from the local storage cluster as a local arbitration node and determining at least one remote arbitration node and at least one third party arbitration node; the remote arbitration node is a node selected from a remote storage cluster, and the local storage cluster and the remote storage cluster form a dual-active storage cluster;
an arbitration cluster creation module, configured to create an arbitration cluster including the local arbitration node, the remote arbitration node, and the third party arbitration node; in the arbitration cluster, the number of the local arbitration nodes, the number of the remote arbitration nodes and the number of the third party arbitration nodes are all less than half of the total number of arbitration nodes in the arbitration cluster;
the request module is used for initiating an arbitration request to the arbitration cluster under the condition that an abnormality exists, so as to instruct the arbitration cluster to arbitrate the local storage cluster and the remote storage cluster;
and the processing module is used for determining whether to continue providing the storage service according to the arbitration result of the arbitration cluster.
9. A computer device, comprising:
A memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of arbitration for a storage cluster according to any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer instructions for causing a computer to perform the arbitration method of a storage cluster according to any of claims 1 to 7.
CN202410451312.8A 2024-04-15 2024-04-15 Arbitration method and device for storage cluster, computer equipment and storage medium Pending CN118331496A (en)

Publications (1)

Publication Number Publication Date
CN118331496A (en) 2024-07-12

Family

ID=91778864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410451312.8A Pending CN118331496A (en) 2024-04-15 2024-04-15 Arbitration method and device for storage cluster, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118331496A (en)

Similar Documents

Publication Title
US11360854B2 (en) Storage cluster configuration change method, storage cluster, and computer system
US11397647B2 (en) Hot backup system, hot backup method, and computer device
US11307776B2 (en) Method for accessing distributed storage system, related apparatus, and related system
CN107919977B (en) Online capacity expansion and online capacity reduction method and device based on Paxos protocol
GB2484086A (en) Reliability and performance modes in a distributed storage system
CN113010496B (en) Data migration method, device, equipment and storage medium
CN105069152B (en) data processing method and device
WO2017097006A1 (en) Real-time data fault-tolerance processing method and system
CN115794499B (en) Method and system for dual-activity replication data among distributed block storage clusters
WO2024103594A1 (en) Container disaster recovery method, system, apparatus and device, and computer-readable storage medium
CN113821168A (en) Shared storage migration system and method, electronic equipment and storage medium
CN113254275A (en) MySQL high-availability architecture method based on distributed block device
CN112600690B (en) Configuration data synchronization method, device, equipment and storage medium
CN107943615B (en) Data processing method and system based on distributed cluster
CN110351122B (en) Disaster recovery method, device, system and electronic equipment
CN109445984B (en) Service recovery method, device, arbitration server and storage system
US11010266B1 (en) Dual isolation recovery for primary-secondary server architectures
CN118331496A (en) Arbitration method and device for storage cluster, computer equipment and storage medium
CN115470041A (en) Data disaster recovery management method and device
US20240028611A1 (en) Granular Replica Healing for Distributed Databases
CN110413686B (en) Data writing method, device, equipment and storage medium
CN113467717B (en) Dual-machine volume mirror image management method, device and equipment and readable storage medium
CN113708960B (en) Deployment method, device and equipment of Zookeeper cluster
CN112463669B (en) Storage arbitration management method, system, terminal and storage medium
CN118295964A (en) Method, device, computer equipment and storage medium for creating double live volume snapshot

Legal Events

Date Code Title Description
PB01 Publication