CN111212127A - Storage cluster, service data maintenance method, device and storage medium - Google Patents

Storage cluster, service data maintenance method, device and storage medium

Info

Publication number
CN111212127A
Authority
CN
China
Prior art keywords
network
storage
storage node
service
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911386440.4A
Other languages
Chinese (zh)
Inventor
史宗华
何营
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201911386440.4A priority Critical patent/CN111212127A/en
Publication of CN111212127A publication Critical patent/CN111212127A/en
Pending legal-status Critical Current

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 - Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 - Arrangements for monitoring or testing data switching networks
    • H04L43/08 - Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 - Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 - Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 - Arrangements for monitoring or testing data switching networks
    • H04L43/10 - Active monitoring, e.g. heartbeat, ping or trace-route
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L67/104 - Peer-to-peer [P2P] networks
    • H04L67/1044 - Group management mechanisms

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a storage cluster, a service data maintenance method and device, and a computer-readable storage medium. The method is applied to each storage node in the storage cluster, and each storage node runs a network-attached storage service based on a deployed CTDB. The method comprises the following steps: calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node; if the storage node's network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data; continuing to call the network card monitoring service to periodically detect the network state of the storage node; and after the storage node's network is recovered, restarting the network-attached storage service on the storage node. The method and device effectively avoid the problem that cached data of residual processes causes data inconsistency on the storage node, and thereby improve the data storage reliability of the storage cluster.

Description

Storage cluster, service data maintenance method, device and storage medium
Technical Field
The present application relates to the field of cluster storage technologies, and in particular, to a storage cluster, a service data maintenance method and apparatus, and a computer-readable storage medium.
Background
Today, with the rise of cloud computing and big data, the amount of data generated every day grows exponentially. Traditional storage can no longer meet this demand, and distributed mass storage that supports dynamic capacity expansion has emerged.
CTDB (Cluster Trivial Database) is a lightweight clustered database implementation. It is the cluster database component of clustered Samba and is commonly used to handle Samba cross-node messages. CTDB is a TDB database implemented in a distributed manner across cluster nodes and is effective for ensuring high availability of a storage service; in particular, it may be applied to a network-attached storage (NAS) service.
For a network-attached storage service, keeping the storage node's network unobstructed is one of the conditions for normal operation. In the prior art, therefore, the data on a storage node often becomes inconsistent with the client's data because of a network anomaly. In view of this, providing a solution to the above technical problem is an important need for those skilled in the art.
Disclosure of Invention
The aim of the present application is to provide a storage cluster, a service data maintenance method and apparatus, and a computer-readable storage medium, so as to effectively solve the problem of inconsistent service data and improve the data storage reliability of the storage cluster.
In order to solve the above technical problem, in a first aspect, the present application discloses a service data maintenance method, which is applied to each storage node in a storage cluster, where each storage node operates a network-attached storage service based on a deployed CTDB; the method comprises the following steps:
calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node;
if the storage node's network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data;
continuing to call the network card monitoring service to periodically detect the network state of the storage node;
and after the storage node network is recovered, restarting the network attached storage service on the storage node.
Optionally, the process of determining that the storage node's network is abnormal includes:
if the periodic network state detection results of a consecutive preset number of times are all abnormal, determining that the storage node's network is abnormal.
Optionally, after invoking the network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further includes:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
Optionally, after invoking the network card monitoring service preset in the CTDB to periodically detect the network state of the storage node, the method further includes:
if the storage node's network is normal, sending heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node is interrupted, a proxy node with a normal network state is elected by triggering a service switching process preset in the CTDB, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
Optionally, after the storage node's network is recovered and the network-attached storage service is restarted on the storage node, the method further includes:
continuing to send heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node has recovered, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
Optionally, after the storage node's network becomes abnormal, the method further includes:
modifying the network state identifier of the storage node from a normal state to a fault state.
In a second aspect, the present application further discloses a service data maintenance apparatus, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the device comprises:
the network card monitoring module is used for calling network card monitoring service which is preset in the CTDB so as to periodically detect the network state of the storage node;
the cache clearing module is used for generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node when the network of the storage node is abnormal, so as to clear the currently cached service data; the network card monitoring module then continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
Optionally, the apparatus further comprises:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the network of the storage node is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
Optionally, the network card monitoring module is specifically configured to:
if the periodic network state detection results of a consecutive preset number of times are all abnormal, determine that the storage node's network is abnormal.
Optionally, the apparatus further comprises:
the heartbeat signal module is used for sending heartbeat signals to the other storage nodes at regular intervals if the storage node's network is normal after the network state of the storage node is periodically detected, so that, after the other storage nodes detect that the heartbeat signal of the storage node is interrupted, a proxy node with a normal network state is elected by triggering the service switching process preset in the CTDB, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
Optionally, the heartbeat signal module is further configured to:
after the storage node's network is recovered and the network-attached storage service is restarted on the storage node, continue to send the heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node has recovered, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
Optionally, the apparatus further comprises:
the state identification module is used for modifying the network state identifier of the storage node from a normal state to a fault state after the storage node's network becomes abnormal, and for modifying the network state identifier of the storage node from the fault state to the normal state after the storage node's network is recovered.
In a third aspect, the present application further discloses a storage cluster, including a plurality of storage nodes, where each of the storage nodes includes:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of any of the service data maintenance methods described above.
In a fourth aspect, the present application further discloses a computer-readable storage medium, in which a computer program is stored, and the computer program is used to implement the steps of any one of the service data maintenance methods described above when executed by a processor.
The service data maintenance method provided by the present application is applied to each storage node in a storage cluster, and each storage node runs a network-attached storage service based on a deployed CTDB. The method comprises the following steps: calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node; if the storage node's network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data; continuing to call the network card monitoring service to periodically detect the network state of the storage node; and after the storage node's network is recovered, restarting the network-attached storage service on the storage node.
Therefore, the network card monitoring service preset in the CTDB can effectively monitor the network state of the storage node, so that residual processes of the network-attached storage service are cleaned up in time after a network anomaly. This effectively avoids the problem that cached data of residual processes causes data inconsistency on the storage node, and thereby improves the data storage reliability of the storage cluster. The service data maintenance apparatus, the storage cluster, and the computer-readable storage medium provided by the present application have the same beneficial effects.
Drawings
In order to more clearly illustrate the technical solutions in the prior art and in the embodiments of the present application, the drawings needed for describing them are briefly introduced below. Of course, the following drawings relate only to some embodiments of the present application; those skilled in the art can obtain other drawings from the provided drawings without creative effort, and such other drawings also fall within the protection scope of the present application.
Fig. 1 is a flowchart of a service data maintenance method disclosed in an embodiment of the present application;
fig. 2 is a schematic diagram of a service switching process after a storage node network is abnormal according to an embodiment of the present application;
fig. 3 is a schematic diagram of a service switching process after a storage node network is restored according to an embodiment of the present disclosure;
fig. 4 is a block diagram of a service data maintenance apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of a storage node in a storage cluster according to an embodiment of the present disclosure.
Detailed Description
The core of the application is to provide a storage cluster, a service data maintenance method, a service data maintenance device and a computer-readable storage medium, so as to effectively solve the problem of inconsistent service data and improve the data storage reliability of the storage cluster.
In order to more clearly and completely describe the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Currently, with the rise of cloud computing and big data, the amount of data generated every day grows exponentially. Traditional storage can no longer meet this demand, and distributed mass storage that supports dynamic capacity expansion has emerged.
CTDB (Cluster Trivial Database) is a lightweight clustered database implementation. It is the cluster database component of clustered Samba and is commonly used to handle Samba cross-node messages. CTDB is a distributed TDB database implemented across cluster nodes and may be used effectively to ensure high availability of storage services, for example by performing node monitoring, node switching, IP switching, and the like. In particular, it is applicable to the network-attached storage (NAS) service.
For a network-attached storage service, keeping the storage node's network unobstructed is one of the conditions for normal operation. In the prior art, therefore, the data on a storage node often becomes inconsistent with the client's data because of a network anomaly.
Specifically, after a client requests data storage from a storage node, if the storage node suffers only a network failure (hardware devices such as the power supply and the network card are normal), the storage node cannot run the network-attached storage service normally, but may still retain some residual processes of that service together with certain cached data. In this way, after the network of the storage node is restored and the client initiates a data storage request to the storage node again, the request is affected by the previously cached data, and the data on the storage node becomes inconsistent with the data requested by the client. In view of this, the present application provides a service data maintenance scheme that can effectively solve the above problem.
Referring to fig. 1, an embodiment of the present application discloses a service data maintenance method, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the method comprises the following steps:
s101: and calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node.
Specifically, the service data maintenance method provided in the embodiment of the present application may be specifically applied to each storage node deployed with a CTDB. Because the network attached storage service running on the storage node needs the support of the network, and the data inconsistency of the storage node is likely to occur due to network interruption and other abnormalities, the embodiment of the application specifically sets the network card monitoring service in the CTDB in advance so as to detect the network state of the storage node.
In particular, the detection of the network state may be repeated periodically. Based on the network card monitoring service, the network state can be checked once every check period, and the check period can be set reasonably according to the actual situation; for example, it can be set to 2 s.
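By way of illustration only (this sketch is not part of the original disclosure), the periodic network card check described above could be approximated in Python as follows; the gateway address, the use of ping, and the helper name check_network_once are assumptions made for the example.

    import subprocess
    import time

    CHECK_PERIOD_S = 2  # example check period, matching the 2 s suggested above

    def check_network_once(gateway_ip: str) -> bool:
        """Return True if the storage node can reach the gateway, i.e. the network looks normal."""
        # A single ping with a short timeout stands in for the network card check.
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", gateway_ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0

    def monitor_network(gateway_ip: str) -> None:
        """Periodically detect the network state of the local storage node."""
        while True:
            state_ok = check_network_once(gateway_ip)
            print("network normal" if state_ok else "network abnormal")
            time.sleep(CHECK_PERIOD_S)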
S102: if the storage node's network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data.
Specifically, if the network of the storage node is abnormal, for example disconnected or congested, the network-attached storage service cannot operate normally, and problems such as communication interruption and data failing to be written to disk occur. Therefore, once the network of the storage node is found to be abnormal, the embodiment of the present application terminates the residual processes of the network-attached storage service with a kill command, so as to remove the cached data of those residual processes and prevent the cached service data from causing data inconsistency the next time the network-attached storage service is restarted.
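As a minimal sketch of this cleanup step, assuming the network-attached storage service runs as processes whose names are known (smbd and nfsd are used here purely as placeholders), the residual processes could be ended roughly as follows; the patent itself only states that a kill command is generated based on the CTDB.

    import os
    import signal
    import subprocess

    # Placeholder process names; the actual NAS service processes depend on the deployment.
    NAS_PROCESS_NAMES = ["smbd", "nfsd"]

    def kill_residual_nas_processes() -> None:
        """End residual NAS processes so that their cached service data is discarded."""
        for name in NAS_PROCESS_NAMES:
            # pgrep -x lists the PIDs of processes whose name matches exactly.
            found = subprocess.run(["pgrep", "-x", name], capture_output=True, text=True)
            for pid in found.stdout.split():
                try:
                    os.kill(int(pid), signal.SIGKILL)  # equivalent to `kill -9 <pid>`
                except ProcessLookupError:
                    pass  # the process already exited between pgrep and kill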
S103: and continuing to call the network card monitoring service to periodically detect the network state of the storage node.
Specifically, after the residual processes of the network-attached storage service on the storage node are terminated, the network state of the storage node may continue to be monitored periodically, so that the network-attached storage service can be restarted once the network recovers.
S104: and after the storage node network is recovered, restarting the network attached storage service on the storage node.
It is easy to understand that, based on step S102, after the communication between the client and the storage node is interrupted or delayed by a network anomaly, the residual processes are closed and the cached data is cleared, so that when the network-attached storage service is restarted, the storage node can maintain data consistency with the client.
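Continuing the sketch above (again an assumption, not the patent's own code), steps S103 and S104 amount to polling until the network recovers and then restarting the service; the systemd unit name nas-service and the network_ok callback are placeholders.

    import subprocess
    import time

    CHECK_PERIOD_S = 2  # example check period

    def wait_for_network_recovery(network_ok) -> None:
        """Step S103: keep polling until network_ok() reports that the network is reachable again."""
        while not network_ok():
            time.sleep(CHECK_PERIOD_S)

    def restart_nas_service(service_name: str = "nas-service") -> None:
        """Step S104: restart the network-attached storage service on this node.

        The systemd unit name is a placeholder; how the service is actually
        restarted depends on the deployment.
        """
        subprocess.run(["systemctl", "restart", service_name], check=False)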
The service data maintenance method provided by the embodiment of the present application is applied to each storage node in a storage cluster, and each storage node runs a network-attached storage service based on a deployed CTDB. The method comprises the following steps: calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node; if the storage node's network is abnormal, generating a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node, so as to clear the currently cached service data; continuing to call the network card monitoring service to periodically detect the network state of the storage node; and after the storage node's network is recovered, restarting the network-attached storage service on the storage node.
Therefore, the network card monitoring service preset in the CTDB can effectively monitor the network state of the storage node and clean up residual processes of the network-attached storage service in time after a network anomaly, which effectively avoids the problem that cached data of residual processes causes data inconsistency on the storage node, and thereby improves the data storage reliability of the storage cluster.
As a specific embodiment, in the service data maintenance method provided in the embodiment of the present application, on the basis of the foregoing content, a determination process of a network anomaly of the storage node includes:
if the periodic network state detection results of a consecutive preset number of times are all abnormal, determining that the storage node's network is abnormal.
Specifically, in order to prevent misjudgment of the network state, the determination may be made by combining the detection results of a preset number of consecutive periods. For example, the preset number may be 4; that is, if the detection results in 4 consecutive detection periods all indicate a network anomaly, it may be determined that the storage node's network is abnormal.
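A sketch of this debounce rule, under the assumption that the periodic results are kept in a simple list (the threshold of 4 mirrors the example above):

    CONSECUTIVE_FAILURES_REQUIRED = 4  # example preset number from the text

    def is_network_abnormal(recent_results: list) -> bool:
        """Judge the node's network abnormal only if the last N periodic checks all failed.

        recent_results holds the most recent periodic detection results in order,
        with True meaning the check succeeded.
        """
        if len(recent_results) < CONSECUTIVE_FAILURES_REQUIRED:
            return False
        return not any(recent_results[-CONSECUTIVE_FAILURES_REQUIRED:])

    # Example: three failures are not enough, four consecutive failures are.
    print(is_network_abnormal([True, False, False, False]))         # False
    print(is_network_abnormal([True, False, False, False, False]))  # True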
As a specific embodiment, the method for maintaining service data provided in this embodiment of the present application, based on the foregoing content, after invoking a network card monitoring service preset in the CTDB to periodically detect a network state of the storage node, further includes:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
Specifically, in this embodiment, a service state monitoring service is further set in the CTDB and is used to periodically detect the running state of the network-attached storage service of the storage node. As described above, if the storage node suffers only a network failure (hardware devices such as the power supply and the network card are normal), the network-attached storage service will not run normally but will leave residual processes. However, if hardware devices such as the power supply or the network card of the storage node have a hardware fault (e.g., a power failure), the network-attached storage service cannot be started at all and stops running. Once the service state monitoring service detects such a situation, it may try to restart the network-attached storage service on the local storage node after the fault is resolved.
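For illustration, a sketch of such a service-state check, assuming the NAS service is managed as a systemd unit (the unit name nas-service is a placeholder; the patent does not specify how the running state is queried):

    import subprocess

    def nas_service_running(service_name: str = "nas-service") -> bool:
        """Return True if the NAS service reports an active state."""
        result = subprocess.run(["systemctl", "is-active", "--quiet", service_name])
        return result.returncode == 0

    def check_and_restart_service(service_name: str = "nas-service") -> None:
        """Periodic running-state check: restart the NAS service if it has stopped."""
        if not nas_service_running(service_name):
            subprocess.run(["systemctl", "restart", service_name], check=False)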
As a specific embodiment, the method for maintaining service data provided in this embodiment of the present application, based on the foregoing content, after invoking a network card monitoring service preset in the CTDB to periodically detect a network state of the storage node, further includes:
if the storage node's network is normal, sending heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node is interrupted, a proxy node with a normal network state is elected by triggering the service switching process preset in the CTDB, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
Specifically, the present embodiment further provides a service switching process based on the CTDB. If the storage node's network is abnormal, the other storage nodes can elect a proxy node based on the service switching process to run the storage node's tasks in its place.
Referring to fig. 2, fig. 2 is a schematic diagram of the service switching process after a storage node's network becomes abnormal according to an embodiment of the present application. Fig. 2 shows node 1, on which the network failure occurs, and two other storage nodes: master node 2 and node 3. After the service switching process is executed, master node 2 is elected as the proxy node, and the tasks of node 1 are switched to master node 2 for execution.
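By way of example only, the heartbeat-timeout detection and proxy election could be sketched as follows; the 10 s timeout and the policy of picking the healthy node with the lowest ID are assumptions, since the patent leaves the election policy to the CTDB service switching process.

    import time
    from typing import Dict, Optional

    HEARTBEAT_TIMEOUT_S = 10  # example timeout, not specified in the text

    def elect_proxy_node(last_heartbeat: Dict[int, float], failed_node: int) -> Optional[int]:
        """Elect a proxy node among the nodes whose heartbeat is still fresh.

        last_heartbeat maps a node id to the timestamp of its last heartbeat.
        Choosing the healthy node with the lowest id is only one simple policy.
        """
        now = time.time()
        healthy = [
            node for node, ts in last_heartbeat.items()
            if node != failed_node and now - ts < HEARTBEAT_TIMEOUT_S
        ]
        return min(healthy) if healthy else None

    # Example mirroring Fig. 2: node 1's heartbeat stopped, nodes 2 and 3 are alive.
    beats = {1: time.time() - 60.0, 2: time.time(), 3: time.time()}
    print(elect_proxy_node(beats, failed_node=1))  # -> 2, i.e. master node 2 becomes the proxy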
As a specific embodiment, on the basis of the foregoing, after the storage node's network is restored and the network-attached storage service is restarted on the storage node, the service data maintenance method provided in the embodiment of the present application further includes:
continuing to send heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node has recovered, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
Specifically, after the network is restored, the storage node may continue to send heartbeat signals to notify the other storage nodes. Based on cluster load-balancing control, the other storage nodes can then switch the tasks that previously belonged to the storage node back to it through the service switching process.
Referring to fig. 3, fig. 3 is a schematic diagram of the service switching process after a storage node's network is restored according to an embodiment of the present application. After the network fault of node 1 is resolved and the network returns to normal, master node 2 and node 3 execute the service switching process again, and the original tasks of node 1 are switched back to node 1 for execution.
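A small sketch of the failover and failback of tasks shown in Figs. 2 and 3, assuming the task-to-node assignment is represented as a plain dictionary (the share names are invented for the example):

    from typing import Dict

    def fail_over(task_owner: Dict[str, int], failed_node: int, proxy_node: int) -> None:
        """Fig. 2: move the failed node's NAS tasks to the elected proxy node."""
        for task, owner in task_owner.items():
            if owner == failed_node:
                task_owner[task] = proxy_node

    def fail_back(task_owner: Dict[str, int], original_owner: Dict[str, int],
                  recovered_node: int) -> None:
        """Fig. 3: return the tasks originally owned by the recovered node."""
        for task, owner in original_owner.items():
            if owner == recovered_node:
                task_owner[task] = recovered_node

    # Node 1 fails, master node 2 stands in, then node 1 recovers.
    original = {"share-a": 1, "share-b": 2, "share-c": 3}
    current = dict(original)
    fail_over(current, failed_node=1, proxy_node=2)   # share-a now served by node 2
    fail_back(current, original, recovered_node=1)    # share-a returned to node 1
    print(current)  # {'share-a': 1, 'share-b': 2, 'share-c': 3}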
As a specific embodiment, on the basis of the foregoing content, the service data maintenance method provided in the embodiment of the present application further includes, after the storage node's network becomes abnormal: modifying the network state identifier of the storage node from a normal state to a fault state;
and, after the storage node's network is recovered: modifying the network state identifier of the storage node from the fault state to the normal state.
Specifically, the embodiment of the present application further sets a network state identifier for each storage node, and the identifier has two states: a normal state and a fault state. After the storage node's network is determined to be abnormal, the identifier can be set to the fault state; similarly, after the storage node's network is determined to have recovered, the identifier can be set back to the normal state.
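A minimal sketch of the two-valued network state identifier and how it is toggled (the enum and the class are assumptions; the patent only requires that each node keeps a normal/fault flag):

    from enum import Enum

    class NetworkState(Enum):
        NORMAL = "normal"
        FAULT = "fault"

    class StorageNodeStatus:
        """Holds the per-node network state identifier described above."""

        def __init__(self) -> None:
            self.network_state = NetworkState.NORMAL

        def mark_network_abnormal(self) -> None:
            self.network_state = NetworkState.FAULT

        def mark_network_recovered(self) -> None:
            self.network_state = NetworkState.NORMAL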
Referring to fig. 4, an embodiment of the present application discloses a service data maintenance apparatus, which is applied to each storage node in a storage cluster, where each storage node runs a network-attached storage service based on a deployed CTDB; the device comprises:
a network card monitoring module 201, configured to invoke a network card monitoring service preset in the CTDB, so as to periodically detect a network state of the storage node;
the cache clearing module 202 is configured to generate a kill command based on the CTDB to end residual processes of the network-attached storage service running on the storage node when the network of the storage node is abnormal, so as to clear the currently cached service data; the network card monitoring module 201 then continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module 203 is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
It can be seen that the service data maintenance apparatus disclosed in the embodiment of the present application, based on the network card monitoring service preset in the CTDB, can effectively monitor the network state of the storage node and clean up residual processes of the network-attached storage service in time after a network anomaly. This effectively avoids the problem that cached data of residual processes causes data inconsistency on the storage node, and thereby improves the data storage reliability of the storage cluster.
For the specific content of the service data maintenance device, reference may be made to the foregoing detailed description of the service data maintenance method, and details thereof are not repeated here.
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the storage node network is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module 203 is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
As a specific embodiment, in the service data maintenance apparatus disclosed in the embodiment of the present application, on the basis of the foregoing content, the network card monitoring module 201 is specifically configured to:
if the periodic network state detection results of a consecutive preset number of times are all abnormal, determine that the storage node's network is abnormal.
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the heartbeat signal module is used for sending heartbeat signals to the other storage nodes at regular intervals if the storage node's network is normal after the network state of the storage node is periodically detected, so that, after the other storage nodes detect that the heartbeat signal of the storage node is interrupted, a proxy node with a normal network state is elected by triggering the service switching process preset in the CTDB, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
As a specific embodiment, in the service data maintenance apparatus disclosed in the embodiment of the present application, on the basis of the foregoing content, the heartbeat signal module is further configured to:
after the storage node's network is recovered and the network-attached storage service is restarted on the storage node, continue to send the heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node has recovered, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
As a specific embodiment, the service data maintenance apparatus disclosed in the embodiment of the present application further includes, on the basis of the foregoing content:
the state identification module is used for modifying the network state identifier of the storage node from a normal state to a fault state after the storage node's network becomes abnormal, and for modifying the network state identifier of the storage node from the fault state to the normal state after the storage node's network is recovered.
Further, the present application also discloses a storage cluster, which includes a plurality of storage nodes, and as shown in fig. 5, each of the storage nodes includes:
a memory 301 for storing a computer program;
a processor 302 for executing said computer program to implement the steps of any of the service data maintenance methods described above.
Further, the present application also discloses a computer-readable storage medium, in which a computer program is stored, and the computer program is used for implementing the steps of any service data maintenance method described above when being executed by a processor.
For the specific content of the storage cluster and the computer-readable storage medium, reference may be made to the foregoing detailed description on the service data maintenance method, and details are not described here again.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the equipment disclosed by the embodiment, the description is relatively simple because the equipment corresponds to the method disclosed by the embodiment, and the relevant parts can be referred to the method part for description.
It is further noted that, throughout this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The technical solutions provided by the present application are described in detail above. The principles and embodiments of the present application are explained herein using specific examples, which are provided only to help understand the method and the core idea of the present application. It should be noted that, for those skilled in the art, without departing from the principle of the present application, several improvements and modifications can be made to the present application, and these improvements and modifications also fall into the protection scope of the present application.

Claims (10)

1. A service data maintenance method, applied to each storage node in a storage cluster, wherein each storage node runs a network-attached storage service based on a deployed CTDB, the method comprising the following steps:
calling a network card monitoring service preset in the CTDB so as to periodically detect the network state of the storage node;
if the storage node network is abnormal, generating a kill command based on the CTDB to end a residual process of the network-attached storage service running on the storage node, so as to clear the currently cached business data;
continuing to call the network card monitoring service to periodically detect the network state of the storage node;
and after the storage node network is recovered, restarting the network attached storage service on the storage node.
2. The method for maintaining business data according to claim 1, wherein the process of determining that the storage node network is abnormal comprises:
if the periodic network state detection results of a consecutive preset number of times are all abnormal, determining that the storage node network is abnormal.
3. The method for maintaining business data according to claim 1, wherein after the invoking of the network card monitoring service preset in the CTDB for periodically detecting the network status of the storage node, the method further comprises:
if the storage node network is normal, calling a service state monitoring service preset in the CTDB so as to periodically detect the running state of the network-attached storage service of the storage node;
and if the network attached storage service of the storage node stops running, restarting the network attached storage service on the storage node.
4. The method for maintaining business data according to claim 1, wherein after the invoking of the network card monitoring service preset in the CTDB for periodically detecting the network status of the storage node, the method further comprises:
if the storage node network is normal, sending heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node is interrupted, a proxy node with a normal network state is elected by triggering the service switching process preset in the CTDB, and the proxy node takes over and executes the network-attached storage service tasks of the storage node.
5. The method for maintaining business data according to claim 4, wherein after the network of the local storage node is recovered and the network-attached storage service is restarted on the local storage node, the method further comprises:
continuing to send heartbeat signals to the other storage nodes at regular intervals, so that, after the other storage nodes detect that the heartbeat signal of the storage node has recovered, the proxy node switches the network-attached storage service tasks it executed on behalf of the storage node back to the storage node by triggering the service switching process again.
6. The method for maintaining business data according to any one of claims 1 to 5, wherein after the network of local storage nodes is abnormal, the method further comprises:
modifying the network state identifier of the storage node from a normal state to a fault state;
after the storage node network is recovered, the method further includes:
and modifying the network state identifier of the storage node from the fault state to the normal state.
7. A service data maintenance device, applied to each storage node in a storage cluster, wherein each storage node runs a network-attached storage service based on a deployed CTDB, the device comprising:
the network card monitoring module is used for calling network card monitoring service which is preset in the CTDB so as to periodically detect the network state of the storage node;
the cache clearing module is used for generating a kill command based on the CTDB to end a residual process of the network-attached storage service running on the storage node when the network of the storage node is abnormal, so as to clear the currently cached business data; the network card monitoring module then continues to call the network card monitoring service to periodically detect the network state of the storage node;
and the service restarting module is used for restarting the network attached storage service on the storage node after the storage node network is recovered.
8. The apparatus for maintaining business data according to claim 7, further comprising:
the service monitoring module is used for calling service state monitoring service preset in the CTDB when the network of the storage node is normal so as to periodically detect the running state of the network-attached storage service of the storage node;
the service restart module is further configured to: and when the network of the storage node is normal and the network attached storage service stops running, restarting the network attached storage service on the storage node.
9. A storage cluster comprising a plurality of storage nodes, wherein each of said storage nodes comprises:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the method for maintenance of business data according to any one of claims 1 to 6.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method for maintaining business data according to any one of claims 1 to 6.
CN201911386440.4A 2019-12-29 2019-12-29 Storage cluster, service data maintenance method, device and storage medium Pending CN111212127A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911386440.4A CN111212127A (en) 2019-12-29 2019-12-29 Storage cluster, service data maintenance method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911386440.4A CN111212127A (en) 2019-12-29 2019-12-29 Storage cluster, service data maintenance method, device and storage medium

Publications (1)

Publication Number Publication Date
CN111212127A true CN111212127A (en) 2020-05-29

Family

ID=70789434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911386440.4A Pending CN111212127A (en) 2019-12-29 2019-12-29 Storage cluster, service data maintenance method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111212127A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111817892A (en) * 2020-07-10 2020-10-23 济南浪潮数据技术有限公司 Network management method, system, electronic equipment and storage medium
CN112866408A (en) * 2021-02-09 2021-05-28 山东英信计算机技术有限公司 Service switching method, device, equipment and storage medium in cluster
CN113626238A (en) * 2021-07-23 2021-11-09 济南浪潮数据技术有限公司 ctdb service health state monitoring method, system, device and storage medium
CN114035905A (en) * 2021-11-19 2022-02-11 江苏安超云软件有限公司 Fault migration method and device based on virtual machine, electronic equipment and storage medium
CN115102962A (en) * 2022-06-22 2022-09-23 青岛中科曙光科技服务有限公司 Cluster management method and device, computer equipment and storage medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101237400A (en) * 2008-01-24 2008-08-06 创新科存储技术(深圳)有限公司 Migration method for network additive storage service and network additional storage node
CN106790565A (en) * 2016-12-27 2017-05-31 中国电子科技集团公司第五十二研究所 A kind of network attached storage group system
CN108847982A (en) * 2018-06-26 2018-11-20 郑州云海信息技术有限公司 A kind of distributed storage cluster and its node failure switching method and apparatus
CN108958991A (en) * 2018-07-26 2018-12-07 郑州云海信息技术有限公司 Clustered node failure business quick recovery method, device, equipment and storage medium
CN109218141A (en) * 2018-11-20 2019-01-15 郑州云海信息技术有限公司 A kind of malfunctioning node detection method and relevant apparatus
CN109474694A (en) * 2018-12-04 2019-03-15 郑州云海信息技术有限公司 A kind of management-control method and device of the NAS cluster based on SAN storage array
CN109614201A (en) * 2018-12-04 2019-04-12 武汉烽火信息集成技术有限公司 The OpenStack virtual machine high-availability system of anti-fissure
CN109634716A (en) * 2018-12-04 2019-04-16 武汉烽火信息集成技术有限公司 The OpenStack virtual machine High Availabitity management end device and management method of anti-fissure
CN109639794A (en) * 2018-12-10 2019-04-16 杭州数梦工场科技有限公司 A kind of stateful cluster recovery method, apparatus, equipment and readable storage medium storing program for executing
CN110519086A (en) * 2019-08-08 2019-11-29 苏州浪潮智能科技有限公司 A kind of method and apparatus of the fast quick-recovery storage cluster NAS business based on CTDB
CN110611603A (en) * 2019-09-09 2019-12-24 苏州浪潮智能科技有限公司 Cluster network card monitoring method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111817892A (en) * 2020-07-10 2020-10-23 济南浪潮数据技术有限公司 Network management method, system, electronic equipment and storage medium
CN111817892B (en) * 2020-07-10 2023-04-07 济南浪潮数据技术有限公司 Network management method, system, electronic equipment and storage medium
CN112866408A (en) * 2021-02-09 2021-05-28 山东英信计算机技术有限公司 Service switching method, device, equipment and storage medium in cluster
CN112866408B (en) * 2021-02-09 2022-08-09 山东英信计算机技术有限公司 Service switching method, device, equipment and storage medium in cluster
CN113626238A (en) * 2021-07-23 2021-11-09 济南浪潮数据技术有限公司 ctdb service health state monitoring method, system, device and storage medium
CN113626238B (en) * 2021-07-23 2024-02-20 济南浪潮数据技术有限公司 ctdb service health state monitoring method, system, device and storage medium
CN114035905A (en) * 2021-11-19 2022-02-11 江苏安超云软件有限公司 Fault migration method and device based on virtual machine, electronic equipment and storage medium
CN115102962A (en) * 2022-06-22 2022-09-23 青岛中科曙光科技服务有限公司 Cluster management method and device, computer equipment and storage medium
CN115102962B (en) * 2022-06-22 2024-08-23 青岛中科曙光科技服务有限公司 Cluster management method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111212127A (en) Storage cluster, service data maintenance method, device and storage medium
CN107515796B (en) Equipment abnormity monitoring processing method and device
CN110830283B (en) Fault detection method, device, equipment and system
JP6019995B2 (en) Distributed system, server computer, and failure prevention method
CN109274544B (en) Fault detection method and device for distributed storage system
CN111953566B (en) Distributed fault monitoring-based method and virtual machine high-availability system
CN112463448B (en) Distributed cluster database synchronization method, device, equipment and storage medium
CN102360324B (en) Failure recovery method and equipment for failure recovery
CN112612545A (en) Configuration hot loading system, method, equipment and medium of server cluster
CN106506278B (en) Service availability monitoring method and device
CN109600264A (en) CloudStack cloud platform
JP6421516B2 (en) Server device, redundant server system, information takeover program, and information takeover method
US10157110B2 (en) Distributed system, server computer, distributed management server, and failure prevention method
JP5285044B2 (en) Cluster system recovery method, server, and program
CN104394033B (en) Monitoring system, method and device across data center
JP2007280155A (en) Reliability improving method in dispersion system
CN115712521A (en) Cluster node fault processing method, system and medium
CN112269693B (en) Node self-coordination method, device and computer readable storage medium
JP6984119B2 (en) Monitoring equipment, monitoring programs, and monitoring methods
CN115314361A (en) Server cluster management method and related components thereof
CN113157493A (en) Backup method, device and system based on ticket checking system and computer equipment
JP4968568B2 (en) Fault monitoring method, fault monitoring system and program
WO2014040470A1 (en) Alarm message processing method and device
CN112612652A (en) Distributed storage system abnormal node restarting method and system
CN107783855B (en) Fault self-healing control device and method for virtual network element

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200529