CN111212127A - Storage cluster, service data maintenance method, device and storage medium - Google Patents
- Publication number
- CN111212127A (application CN201911386440.4A)
- Authority
- CN
- China
- Prior art keywords
- network
- storage
- storage node
- service
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
- H04L41/0663—Performing the actions predefined by failover planning, e.g. switching to standby network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1044—Group management mechanisms
Abstract
The application discloses a storage cluster, a service data maintenance method, a device, and a computer-readable storage medium. The method is applied to each storage node in the storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB. The method comprises the following steps: invoking a network card monitoring service preset in the CTDB to periodically detect the network state of the storage node; if the storage node's network is abnormal, generating a kill command based on the CTDB to terminate residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data; continuing to invoke the network card monitoring service to periodically detect the network state of the storage node; and, after the storage node's network is recovered, restarting the network-attached storage service on the storage node. The method and device effectively prevent cached data held by residual processes from making the storage node's data inconsistent, and thereby improve the data storage reliability of the storage cluster.
Description
Technical Field
The present application relates to the field of cluster storage technologies, and in particular, to a method and an apparatus for maintaining a storage cluster and service data, and a computer-readable storage medium.
Background
Today, with the rise of cloud computing and big data, the amount of data generated each day grows exponentially. Traditional storage can no longer meet this demand, which has given rise to distributed mass storage that supports dynamic capacity expansion.
CTDB (Cluster Trivial Database) is a lightweight clustered database implementation. It is the cluster database component of clustered Samba and is commonly used to handle Samba cross-node messages. CTDB is a TDB database implemented in a distributed manner across cluster nodes and is effective for ensuring high availability of a storage service; in particular, it can be applied to a Network Attached Storage (NAS) service.
A network-attached storage service requires, among other conditions, that the storage node's network remain unobstructed. In the prior art, a storage node therefore often becomes inconsistent with client data because of a network anomaly. In view of the above, providing a solution to this technical problem is an important need for those skilled in the art.
Disclosure of Invention
The application aims to provide a storage cluster, a service data maintenance method, a service data maintenance device and a computer readable storage medium, so that the problem of inconsistent service data is effectively solved, and the data storage reliability of the storage cluster is improved.
In order to solve the above technical problem, in a first aspect, the present application discloses a service data maintenance method, which is applied to each storage node in a storage cluster, where each storage node operates a network-attached storage service based on a deployed CTDB; the method comprises the following steps:
calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node;
if the storage node network is abnormal, generating a kill command based on the CTDB to terminate residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data;
continuing to call the network card monitoring service to periodically detect the network state of the storage node;
and after the storage node network is recovered, restarting the network attached storage service on the storage node.
Optionally, the determining process of the storage node network anomaly includes:
and if the periodic network state detection results of the continuous preset number of times are all abnormal, judging that the storage node network is abnormal.
Optionally, after the invoking a network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further includes:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
Optionally, after the invoking a network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further includes:
if the storage node network is normal, sending heartbeat signals to other storage nodes at regular time; after other storage nodes detect the interruption of the heartbeat signal of the storage node, the proxy node with normal network state is elected by triggering the preset service switching process in the CTDB, and the task of the network-attached storage service of the storage node is replaced and executed by the proxy node.
Optionally, after the network of the storage node is recovered, and the network attached storage service is restarted on the storage node, the method further includes:
continuously sending heartbeat signals to other storage nodes at regular time; and the proxy node switches the task of the network-attached storage service, which is executed by replacing the storage node, to the storage node by triggering the service switching process again after other storage nodes detect that the heartbeat signal of the storage node is recovered.
Optionally, after the network of the storage node is abnormal, the method further includes:
and modifying the network state identifier of the storage node from a normal state to a fault state.
In a second aspect, the present application further discloses a service data maintenance apparatus, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the device comprises:
the network card monitoring module is used for calling network card monitoring service which is preset in the CTDB so as to periodically detect the network state of the storage node;
the cache clearing module is used for generating a kill command based on the CTDB to end a residual process of network-attached storage service running on the storage node when the network of the storage node is abnormal so as to clear current cached business data; the network card monitoring module continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
Optionally, the method further comprises:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the network of the storage node is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
Optionally, the network card monitoring module is specifically configured to:
and if the periodic network state detection results of the continuous preset number of times are all abnormal, judging that the storage node network is abnormal.
Optionally, the method further comprises:
the heartbeat signal module is used for sending heartbeat signals to other storage nodes at regular time if the network of the storage node is normal after the service monitoring module periodically detects the network state of the storage node; after other storage nodes detect the interruption of the heartbeat signal of the storage node, the proxy node with normal network state is elected by triggering the preset service switching process in the CTDB, and the task of the network-attached storage service of the storage node is replaced and executed by the proxy node.
Optionally, the heartbeat signal module is further configured to:
after the storage node network is recovered, after network-attached storage service is started on the storage node again, the heartbeat signal is continuously sent to other storage nodes at regular time; and the proxy node switches the task of the network-attached storage service, which is executed by replacing the storage node, to the storage node by triggering the service switching process again after other storage nodes detect that the heartbeat signal of the storage node is recovered.
Optionally, the method further comprises:
the state identification module is used for modifying the network state identification of the storage node from a normal state to a fault state after the network of the storage node is abnormal; and after the storage node network is recovered, modifying the network state identifier of the storage node from the fault state to the normal state.
In a third aspect, the present application further discloses a storage cluster, including a plurality of storage nodes, where each of the storage nodes includes:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of any of the service data maintenance methods described above.
In a fourth aspect, the present application further discloses a computer-readable storage medium, in which a computer program is stored, and the computer program is used to implement the steps of any one of the service data maintenance methods described above when executed by a processor.
The service data maintenance method provided by the application is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on the deployed CTDB. The method comprises the following steps: invoking a network card monitoring service preset in the CTDB to periodically detect the network state of the storage node; if the storage node's network is abnormal, generating a kill command based on the CTDB to terminate residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data; continuing to invoke the network card monitoring service to periodically detect the network state of the storage node; and, after the storage node's network is recovered, restarting the network-attached storage service on the storage node.
Thus, the network card monitoring service preset in the CTDB can effectively monitor the network state of the storage node and promptly clean up residual processes of the network-attached storage service after a network anomaly. This effectively avoids the problem of residual-process cache data making the storage node's data inconsistent, and improves the data storage reliability of the storage cluster. The service data maintenance device, storage cluster, and computer-readable storage medium provided by the application offer the same beneficial effects.
Drawings
In order to more clearly illustrate the technical solutions in the prior art and the embodiments of the present application, the drawings that are needed to be used in the description of the prior art and the embodiments of the present application will be briefly described below. Of course, the following description of the drawings related to the embodiments of the present application is only a part of the embodiments of the present application, and it will be obvious to those skilled in the art that other drawings can be obtained from the provided drawings without any creative effort, and the obtained other drawings also belong to the protection scope of the present application.
Fig. 1 is a flowchart of a service data maintenance method disclosed in an embodiment of the present application;
fig. 2 is a schematic diagram of a service switching process after a storage node network is abnormal according to an embodiment of the present application;
fig. 3 is a schematic diagram of a service switching process after a storage node network is restored according to an embodiment of the present disclosure;
fig. 4 is a block diagram of a service data maintenance apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of a storage node in a storage cluster according to an embodiment of the present disclosure.
Detailed Description
The core of the application is to provide a storage cluster, a service data maintenance method, a service data maintenance device and a computer-readable storage medium, so as to effectively solve the problem of inconsistent service data and improve the data storage reliability of the storage cluster.
In order to more clearly and completely describe the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Currently, with the rise of cloud computing and big data, the amount of data generated each day is exponentially growing; the traditional storage can not meet the requirement, and the distributed mass storage supporting dynamic capacity expansion is produced.
CTDB (Cluster Trivial Database) is a lightweight clustered database implementation. It is the cluster database component of clustered Samba and is commonly used to handle Samba cross-node messages. CTDB is a distributed TDB database implemented on cluster nodes and may be effectively used to ensure high availability of storage services, for example by performing node monitoring, node switching, IP switching, and the like. In particular, it is applicable to a Network Attached Storage (NAS) service.
A network-attached storage service requires, among other conditions, that the storage node's network remain unobstructed. Therefore, in the prior art, a storage node often becomes inconsistent with client data due to a network anomaly.
Specifically, after a client requests data storage from a storage node, if the storage node suffers only a network failure (hardware such as the power supply and network card is normal), the node cannot run the network-attached storage service normally but may still retain some residual processes of the service holding certain cached data. After the node's network is restored and the client initiates a data storage request to it again, the request is affected by the previously cached data, and at this point the node's data is inconsistent with the data requested by the client. In view of this, the present application provides a service data maintenance scheme that can effectively solve the above problems.
Referring to fig. 1, an embodiment of the present application discloses a service data maintenance method, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the method comprises the following steps:
s101: and calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node.
Specifically, the service data maintenance method provided in the embodiment of the present application may be specifically applied to each storage node deployed with a CTDB. Because the network attached storage service running on the storage node needs the support of the network, and the data inconsistency of the storage node is likely to occur due to network interruption and other abnormalities, the embodiment of the application specifically sets the network card monitoring service in the CTDB in advance so as to detect the network state of the storage node.
Specifically, the detection of the network state may be repeated periodically. Based on the network card monitoring service, the network status is checked once per detection period; the period can be set reasonably according to the actual situation, for example to 2 s.
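The periodic check described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the injectable `check_link` probe and `sleep` hook are assumptions made so the loop stays self-contained, and the 2 s default merely mirrors the example period in the text.

```python
import time

def monitor_network(check_link, interval_s=2.0, cycles=3, sleep=time.sleep):
    """Periodically probe the storage node's network link.

    check_link: a callable returning True when the link is up (hypothetical
    probe; a real deployment might shell out to a ping or ethtool check).
    Returns the list of observed states, one per detection period.
    """
    states = []
    for _ in range(cycles):
        states.append(bool(check_link()))
        sleep(interval_s)
    return states
```

For example, `monitor_network(lambda: True, cycles=2, sleep=lambda s: None)` returns `[True, True]` without actually waiting.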
S102: if the storage node network is abnormal, generating a kill command based on the CTDB to end a residual process of the network attached storage service operated on the storage node so as to clear the currently cached business data.
Specifically, if the network of the storage node is abnormal, for example disconnected or blocked, the network-attached storage service cannot operate normally, leading to problems such as interrupted communication and data failing to be written to disk. Therefore, once a network anomaly is found, the embodiment terminates the residual processes of the network-attached storage service with a kill command, so as to remove their cached data and prevent that cached service data from causing inconsistency the next time the service is restarted.
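The cleanup of residual processes can be sketched as below. The daemon names (`smbd`, `nfsd`) and the use of `SIGKILL` are illustrative assumptions; the application speaks only of "a kill command" and does not name specific NAS processes.

```python
import os
import signal

# Hypothetical names of NAS service daemons; the application does not
# specify which processes constitute the network-attached storage service.
NAS_PROCESS_NAMES = {"smbd", "nfsd"}

def residual_pids(process_table):
    """From a [(pid, name), ...] snapshot, pick the residual NAS processes."""
    return [pid for pid, name in process_table if name in NAS_PROCESS_NAMES]

def kill_residual(process_table, kill=os.kill):
    """Terminate residual NAS processes so their cached data is discarded.

    The injectable `kill` hook stands in for the CTDB-generated kill command.
    """
    for pid in residual_pids(process_table):
        kill(pid, signal.SIGKILL)
```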
S103: and continuing to call the network card monitoring service to periodically detect the network state of the storage node.
Specifically, after the residual process of the network attached storage service of the storage node is terminated, the network state of the storage node may be continuously and periodically monitored, so that the network attached storage service is restarted after the network is recovered.
S104: and after the storage node network is recovered, restarting the network attached storage service on the storage node.
It is easily understood that, based on step S102, in the embodiment of the present application, after the communication between the client and the storage node is interrupted or delayed due to a network anomaly, the residual process is closed, and the cached data is cleared, so that when the network attached storage service is restarted, the storage node can maintain data consistency with the client.
The service data maintenance method provided by the embodiment of the application is applied to each storage node in a storage cluster, and each storage node runs a network-attached storage service based on the deployed CTDB; the method comprises the following steps: calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node; if the storage node network is abnormal, generating a kill command based on the CTDB to end a residual process of the network-attached storage service operated on the storage node so as to clear the currently cached business data; continuing to call the network card monitoring service to periodically detect the network state of the storage node; and after the storage node network is recovered, restarting the network attached storage service on the storage node.
Therefore, the network monitoring service preset in the CTDB can effectively monitor the network state of the storage node, and further timely clear the residual process of the network-attached storage service after the network is abnormal, so that the problem that the cache data of the residual process causes the data inconsistency of the storage node is effectively avoided, and the data storage reliability of the storage cluster is improved.
As a specific embodiment, in the service data maintenance method provided in the embodiment of the present application, on the basis of the foregoing content, a determination process of a network anomaly of the storage node includes:
and if the periodic network state detection results of the continuous preset number of times are all abnormal, judging that the storage node network is abnormal.
Specifically, to prevent misjudging the network state, the judgment may integrate the detection results of a preset number of consecutive checks. For example, the preset number may be 4: if the detection results in 4 consecutive detection periods all indicate a network anomaly, the storage node's network is judged abnormal.
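The consecutive-failure rule above can be sketched as a pure function over the recent check results; the threshold of 4 mirrors the example in the text.

```python
def network_abnormal(recent_results, threshold=4):
    """Declare a network anomaly only after `threshold` consecutive
    failed periodic checks (True means the check passed)."""
    if len(recent_results) < threshold:
        return False
    # abnormal only if none of the last `threshold` checks passed
    return not any(recent_results[-threshold:])
```

A single failed check (or even three in a row) does not trip the anomaly, which is exactly the misjudgment-prevention intent.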
As a specific embodiment, the method for maintaining service data provided in this embodiment of the present application, based on the foregoing content, after invoking a network card monitoring service preset in the CTDB to periodically detect a network state of the storage node, further includes:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
Specifically, this embodiment further sets a service state monitoring service in the CTDB, used to periodically detect the running state of the storage node's network-attached storage service. As before, if the storage node suffers only a network failure (hardware such as the power supply and network card is normal), the network-attached storage service will not operate normally but will leave residual processes. If, however, hardware such as the power supply or network card fails (e.g., a power outage), the network-attached storage service cannot start at all and stops running. Once the service state monitoring service detects this situation, it may try to restart the network-attached storage service on the local storage node after the failure is resolved.
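The two monitors together yield a simple decision table, sketched below under the assumption (mine, not the application's) that the action is represented as a string label.

```python
def maintenance_action(network_ok, service_running):
    """Map the two monitored states to the maintenance action described
    in the text: network anomaly -> kill residual processes; network fine
    but service stopped -> restart the NAS service; otherwise do nothing."""
    if not network_ok:
        return "kill_residual_processes"
    if not service_running:
        return "restart_nas_service"
    return "none"
```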
As a specific embodiment, the method for maintaining service data provided in this embodiment of the present application, based on the foregoing content, after invoking a network card monitoring service preset in the CTDB to periodically detect a network state of the storage node, further includes:
if the storage node network is normal, sending heartbeat signals to other storage nodes at regular time; after other storage nodes detect the interruption of the heartbeat signal of the storage node, the proxy node with normal network state is elected by triggering the service switching process preset in the CTDB, and the task of the network-attached storage service of the storage node is replaced and executed by the proxy node.
Specifically, the present embodiment further sets a service switching flow based on the CTDB. If the storage node is in network abnormity, other storage nodes can elect proxy nodes based on the service switching process to replace the storage node to run the task of the storage node.
Referring to fig. 2, fig. 2 is a schematic diagram of the service switching process after a storage node's network becomes abnormal, according to an embodiment of the present disclosure. Fig. 2 shows node 1, on which the network failure occurs, and two other storage nodes: master node 2 and node 3. After the service switching process is executed, master node 2 is elected as the proxy node, and the task of node 1 is switched to master node 2 for execution.
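A proxy-node election over the nodes' network states can be sketched as follows. The application does not specify an election policy, so choosing the lowest-numbered healthy node is an assumed tie-break for illustration only; CTDB's actual recovery-master election works differently.

```python
def elect_proxy(nodes, failed_node):
    """Pick a proxy for `failed_node` from nodes whose network state is
    normal.  nodes: {node_id: "normal" | "fault"}.  Returns None when no
    healthy node is available."""
    healthy = [n for n, state in nodes.items()
               if state == "normal" and n != failed_node]
    return min(healthy) if healthy else None  # assumed lowest-id tie-break
```

In the fig. 2 scenario, node 1 fails and node 2 (the master) would be elected from {2, 3}.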
As a specific embodiment, on the basis of the foregoing, after the storage node network is restored and the network-attached storage service is restarted on the storage node, the method for maintaining the service data provided in the embodiment of the present application further includes:
after the storage node network is recovered, continuing to send heartbeat signals to other storage nodes at regular intervals, so that, after detecting that the heartbeat signal of the storage node is recovered, the other storage nodes trigger the service switching process again, and the proxy node switches the tasks of the network-attached storage service that it executed on behalf of the storage node back to the storage node.
Specifically, after the network is restored, the storage node may continue to send the heartbeat signal to notify other storage nodes. Based on cluster load balancing control, other storage nodes can switch and return tasks before the storage node to the storage node through a service switching process.
Referring to fig. 3, fig. 3 is a schematic view illustrating a service switching process after a storage node network is restored according to an embodiment of the present disclosure. After the network fault of the node 1 is solved, the network is recovered to be normal, and after the main node 2 and the node 3 execute the service switching process again, the original task of the node 1 is switched back to the node 1 to be executed.
As a specific embodiment, the method for maintaining service data provided in the embodiment of the present application, based on the foregoing content, further includes, after the network of the storage node is abnormal: modifying the network state identifier of the storage node from a normal state to a fault state;
after the storage node network is recovered, the method further comprises the following steps: and modifying the network state identifier of the storage node from a fault state to a normal state.
Specifically, the embodiment of the present application further sets a network status identifier for each storage node, with two identification states: a normal state and a fault state. After the storage node's network is determined to be abnormal, the identifier can be set to the fault state; similarly, after the network is judged to be recovered, the identifier can be set back to the normal state.
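The two-state identifier above amounts to a trivial state holder; the class and method names below are illustrative, not from the application.

```python
class NodeNetworkStatus:
    """Per-node network state identifier with the two states from the text."""

    def __init__(self):
        self.state = "normal"

    def mark_fault(self):
        self.state = "fault"    # set after a confirmed network anomaly

    def mark_recovered(self):
        self.state = "normal"   # set after the network is restored
```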
Referring to fig. 4, an embodiment of the present application discloses a service data maintenance apparatus, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the device comprises:
a network card monitoring module 201, configured to invoke a network card monitoring service preset in the CTDB, so as to periodically detect a network state of the storage node;
the cache clearing module 202 is configured to generate a kill command based on the CTDB to end a residual process of a network-attached storage service running on the storage node when the network of the storage node is abnormal, so as to clear currently cached service data; the network card monitoring module 201 continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module 203 is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
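The monitor/clean/restart cycle implemented by these three modules can be sketched as follows. This is a loose approximation only: the carrier-file check, the `smbd`/`nfsd` process names, and the `systemctl` unit name are assumptions, and in a real deployment CTDB's own event scripts would perform these actions rather than a standalone loop:

```python
import subprocess
import time

def network_up(interface="eth0"):
    # Assumed NIC check: read the kernel's carrier flag for the interface.
    try:
        with open(f"/sys/class/net/{interface}/carrier") as f:
            return f.read().strip() == "1"
    except OSError:
        return False

def next_action(was_up, up):
    # Pure transition helper: decide what the monitor should do on a state change.
    if was_up and not up:
        return "kill_residual"    # network just became abnormal
    if not was_up and up:
        return "restart_service"  # network just recovered
    return "none"

def kill_residual_nas_processes(names=("smbd", "nfsd")):
    # End residual NAS service processes so their cached service data is discarded.
    for name in names:
        subprocess.run(["pkill", "-9", "-x", name], check=False)

def restart_nas_service(service="smb"):
    # The unit name is an assumption; it varies by distribution.
    subprocess.run(["systemctl", "restart", service], check=False)

def monitor_loop(interface="eth0", interval=5):
    was_up = True
    while True:
        action = next_action(was_up, up := network_up(interface))
        if action == "kill_residual":
            kill_residual_nas_processes()
        elif action == "restart_service":
            restart_nas_service()
        was_up = up
        time.sleep(interval)
```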
It can be seen that the service data maintenance apparatus disclosed in the embodiment of the present application can effectively monitor the network state of the storage node based on the network card monitoring service preset in the CTDB, and promptly clean up the residual processes of the network-attached storage service after a network anomaly, thereby avoiding the data inconsistency that the cached data of residual processes would otherwise cause on the storage node, and improving the data storage reliability of the storage cluster.
For the specific content of the service data maintenance device, reference may be made to the foregoing detailed description of the service data maintenance method, and details thereof are not repeated here.
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the storage node network is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module 203 is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
As a specific embodiment, in the service data maintenance apparatus disclosed in the embodiment of the present application, on the basis of the foregoing content, the network card monitoring module 201 is specifically configured to:
if a preset number of consecutive periodic network state detection results are all abnormal, determine that the storage node network is abnormal.
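The consecutive-failure rule above can be captured in a small counter, so that a single transient probe failure does not trigger cache cleanup. This is a sketch only; the threshold of 3 is an arbitrary stand-in for the "preset number of times":

```python
class NicMonitor:
    """Judge the network abnormal only after N consecutive failed checks."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def record(self, check_ok):
        """Feed one periodic detection result; return True once the network
        should be judged abnormal."""
        if check_ok:
            self.failures = 0  # any successful check resets the streak
        else:
            self.failures += 1
        return self.failures >= self.threshold
```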
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the heartbeat signal module is configured to send heartbeat signals to the other storage nodes at regular intervals if the periodic network state detection determines that the storage node network is normal; after the other storage nodes detect that the heartbeat signal of the storage node has been interrupted, they trigger the service switching process preset in the CTDB to elect a proxy node with a normal network state, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
As a specific embodiment, in the service data maintenance apparatus disclosed in the embodiment of the present application, on the basis of the foregoing content, the heartbeat signal module is further configured to:
after the storage node network is recovered and the network-attached storage service is restarted on the storage node, continue to send heartbeat signals to the other storage nodes at regular intervals; after the other storage nodes detect that the heartbeat signal of the storage node has resumed, they trigger the service switching process again, and the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node.
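The heartbeat bookkeeping that peers perform can be sketched as a last-seen table. The timeout value and node-ID scheme are assumptions, and the actual election and task-takeover logic belongs to CTDB's recovery process, which is not reproduced here:

```python
import time

class PeerTable:
    """Track the last heartbeat heard from each peer and decide liveness."""

    def __init__(self, timeout=15.0):
        # Seconds without a heartbeat before a peer is treated as down (assumed value).
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node_id, now=None):
        """Record a heartbeat from node_id (now is injectable for testing)."""
        self.last_seen[node_id] = time.monotonic() if now is None else now

    def is_alive(self, node_id, now=None):
        """A peer is alive if a heartbeat arrived within the timeout window;
        peers whose heartbeat was interrupted (or resumed) flip this result,
        which is what would trigger failover or failback."""
        now = time.monotonic() if now is None else now
        seen = self.last_seen.get(node_id)
        return seen is not None and now - seen < self.timeout
```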
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the state identifier module is configured to modify the network state identifier of the storage node from the normal state to the fault state after the network of the storage node becomes abnormal, and to modify it from the fault state back to the normal state after the storage node network is recovered.
Further, the present application also discloses a storage cluster, which includes a plurality of storage nodes, and as shown in fig. 5, each of the storage nodes includes:
a memory 301 for storing a computer program;
a processor 302 for executing said computer program to implement the steps of any of the service data maintenance methods described above.
Further, the present application also discloses a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the steps of any service data maintenance method described above when being executed by a processor.
For the specific content of the storage cluster and the computer-readable storage medium, reference may be made to the foregoing detailed description on the service data maintenance method, and details are not described here again.
The embodiments are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts among the embodiments can be referred to each other. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and relevant details can be found in the description of the method.
It should be further noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Furthermore, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The technical solutions provided by the present application are described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that, for those skilled in the art, without departing from the principle of the present application, several improvements and modifications can be made to the present application, and these improvements and modifications also fall into the protection scope of the present application.
Claims (10)
1. A service data maintenance method, applied to each storage node in a storage cluster, wherein each storage node runs a network-attached storage service based on a deployed CTDB; the method comprises the following steps:
calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node;
if the storage node network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data;
continuing to call the network card monitoring service to periodically detect the network state of the storage node;
and after the storage node network is recovered, restarting the network attached storage service on the storage node.
2. The service data maintenance method according to claim 1, wherein the process of determining that the storage node network is abnormal comprises:
if a preset number of consecutive periodic network state detection results are all abnormal, judging that the storage node network is abnormal.
3. The service data maintenance method according to claim 1, wherein after the invoking of the network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further comprises:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
4. The service data maintenance method according to claim 1, wherein after the invoking of the network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further comprises:
if the storage node network is normal, sending heartbeat signals to the other storage nodes at regular intervals; after the other storage nodes detect that the heartbeat signal of the storage node has been interrupted, they trigger the service switching process preset in the CTDB to elect a proxy node with a normal network state, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
5. The service data maintenance method according to claim 4, wherein after the storage node network is recovered and the network-attached storage service is restarted on the storage node, the method further comprises:
continuing to send heartbeat signals to the other storage nodes at regular intervals; after the other storage nodes detect that the heartbeat signal of the storage node has resumed, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
6. The service data maintenance method according to any one of claims 1 to 5, wherein after the storage node network becomes abnormal, the method further comprises:
modifying the network state identifier of the storage node from a normal state to a fault state;
after the storage node network is recovered, the method further includes:
and modifying the network state identifier of the storage node from the fault state to the normal state.
7. A service data maintenance apparatus, applied to each storage node in a storage cluster, wherein each storage node runs a network-attached storage service based on a deployed CTDB; the apparatus comprises:
the network card monitoring module is used for calling network card monitoring service which is preset in the CTDB so as to periodically detect the network state of the storage node;
the cache clearing module is configured to generate a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node when the network of the storage node is abnormal, so as to clear the currently cached service data; the network card monitoring module then continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
8. The service data maintenance apparatus according to claim 7, further comprising:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the network of the storage node is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
9. A storage cluster comprising a plurality of storage nodes, wherein each of said storage nodes comprises:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the service data maintenance method according to any one of claims 1 to 6.
10. A computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the service data maintenance method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911386440.4A CN111212127A (en) | 2019-12-29 | 2019-12-29 | Storage cluster, service data maintenance method, device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111212127A true CN111212127A (en) | 2020-05-29 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111817892A (en) * | 2020-07-10 | 2020-10-23 | 济南浪潮数据技术有限公司 | Network management method, system, electronic equipment and storage medium |
CN112866408A (en) * | 2021-02-09 | 2021-05-28 | 山东英信计算机技术有限公司 | Service switching method, device, equipment and storage medium in cluster |
CN113626238A (en) * | 2021-07-23 | 2021-11-09 | 济南浪潮数据技术有限公司 | ctdb service health state monitoring method, system, device and storage medium |
CN114035905A (en) * | 2021-11-19 | 2022-02-11 | 江苏安超云软件有限公司 | Fault migration method and device based on virtual machine, electronic equipment and storage medium |
CN115102962A (en) * | 2022-06-22 | 2022-09-23 | 青岛中科曙光科技服务有限公司 | Cluster management method and device, computer equipment and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101237400A (en) * | 2008-01-24 | 2008-08-06 | 创新科存储技术(深圳)有限公司 | Migration method for network additive storage service and network additional storage node |
CN106790565A (en) * | 2016-12-27 | 2017-05-31 | 中国电子科技集团公司第五十二研究所 | A kind of network attached storage group system |
CN108847982A (en) * | 2018-06-26 | 2018-11-20 | 郑州云海信息技术有限公司 | A kind of distributed storage cluster and its node failure switching method and apparatus |
CN108958991A (en) * | 2018-07-26 | 2018-12-07 | 郑州云海信息技术有限公司 | Clustered node failure business quick recovery method, device, equipment and storage medium |
CN109218141A (en) * | 2018-11-20 | 2019-01-15 | 郑州云海信息技术有限公司 | A kind of malfunctioning node detection method and relevant apparatus |
CN109474694A (en) * | 2018-12-04 | 2019-03-15 | 郑州云海信息技术有限公司 | A kind of management-control method and device of the NAS cluster based on SAN storage array |
CN109614201A (en) * | 2018-12-04 | 2019-04-12 | 武汉烽火信息集成技术有限公司 | The OpenStack virtual machine high-availability system of anti-fissure |
CN109634716A (en) * | 2018-12-04 | 2019-04-16 | 武汉烽火信息集成技术有限公司 | The OpenStack virtual machine High Availabitity management end device and management method of anti-fissure |
CN109639794A (en) * | 2018-12-10 | 2019-04-16 | 杭州数梦工场科技有限公司 | A kind of stateful cluster recovery method, apparatus, equipment and readable storage medium storing program for executing |
CN110519086A (en) * | 2019-08-08 | 2019-11-29 | 苏州浪潮智能科技有限公司 | A kind of method and apparatus of the fast quick-recovery storage cluster NAS business based on CTDB |
CN110611603A (en) * | 2019-09-09 | 2019-12-24 | 苏州浪潮智能科技有限公司 | Cluster network card monitoring method and device |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |
Application publication date: 20200529 |