CN113783735A - Method, device, equipment and medium for identifying fault node in Redis cluster - Google Patents

Method, device, equipment and medium for identifying fault node in Redis cluster

Info

Publication number
CN113783735A
Authority
CN
China
Prior art keywords
node
message
nodes
threshold time
redis cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111119656.1A
Other languages
Chinese (zh)
Inventor
张迪
毛琦
李清炳
刘军
于洋
郑洋
贺晋如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaohongshu Technology Co ltd
Original Assignee
Xiaohongshu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaohongshu Technology Co ltd filed Critical Xiaohongshu Technology Co ltd
Priority to CN202111119656.1A priority Critical patent/CN113783735A/en
Publication of CN113783735A publication Critical patent/CN113783735A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention relates to the technical field of databases, and in particular to a method, device, equipment, and medium for identifying a fault node in a Redis cluster. The method is performed by a node in a Redis cluster and comprises: actively sending a first message to a first node in the Redis cluster; broadcasting a request message to other nodes in the Redis cluster if no response message from the first node to the first message has been received once a first threshold time has elapsed since the first message was sent, the request message requesting the other nodes to immediately send the first message to the first node; and marking the first node as a suspected fault if no response message from the first node has been received once a second threshold time has elapsed since the first message was sent. The method accelerates convergence of the cluster view of node faults in the Redis cluster.

Description

Method, device, equipment and medium for identifying fault node in Redis cluster
Technical Field
The invention relates to the technical field of databases, in particular to a method, a device, equipment and a medium for identifying a fault node in a Redis cluster.
Background
Redis (Remote Dictionary Server) is a key-value storage system. Redis Cluster is the distributed solution for Redis: it is decentralized, and its nodes rely on the Gossip protocol to communicate with one another, synchronize state, and detect faults. However, with the current Gossip-based fault detection scheme, the cluster view of a node fault can take a long time to converge.
Disclosure of Invention
The invention aims to provide a method, a device, equipment and a medium for identifying a fault node in a Redis cluster, and solve the technical problem of long convergence time of a node fault cluster view in the Redis cluster.
An embodiment of the application discloses a method for identifying a fault node in a Redis cluster, performed by a node in the Redis cluster and comprising the following steps:
actively sending a first message to a first node in the Redis cluster;
broadcasting a request message to other nodes in the Redis cluster if no response message from the first node to the first message has been received once a first threshold time has elapsed since the first message was sent, wherein the request message requests the other nodes to immediately send the first message to the first node; and
marking the first node as a suspected fault if no response message from the first node has been received once a second threshold time has elapsed since the first message was sent.
An embodiment of the application discloses a method for identifying a fault node in a Redis cluster, performed by a node in the Redis cluster and comprising the following steps:
immediately sending a first message to a first node in response to receiving a request message broadcast by another node; and
marking the first node as a suspected fault if no response message from the first node to the first message has been received once a second threshold time has elapsed since the first message was sent.
Optionally, the method further comprises: when the first node is marked as a suspected fault, sending a notification message to a predetermined number of other nodes, the notification message indicating that the first node is marked as a suspected fault.
Optionally, the method further comprises:
marking the first node as failed when notification messages have been received from more than a predetermined number of other nodes.
Optionally, the predetermined number is 5.
Optionally, the first threshold time is less than or equal to a second threshold time, where the first threshold time is 2 seconds and the second threshold time is 3 seconds.
Optionally, the method further comprises:
if no response message from the first node to the first message has been received once a second threshold time has elapsed since the first message was sent, marking the first node as a suspected fault and, within a predetermined time period thereafter, using a third threshold time in place of the second threshold time, wherein the third threshold time is greater than the second threshold time.
Optionally, the predetermined time period is 24 hours, the second threshold time is 3 seconds, and the third threshold time is 15 seconds.
An embodiment of the application discloses a method for identifying a fault node in a Redis cluster, performed in a Redis cluster comprising a plurality of nodes and comprising the following steps:
a second node actively sends a first message to a first node in the Redis cluster;
if the second node has not received a response message from the first node to the first message once a first threshold time has elapsed since the first message was sent, the second node broadcasts a request message to a plurality of other nodes in the Redis cluster, the request message requesting the other nodes to immediately send the first message to the first node;
if the second node has not received a response message from the first node once a second threshold time has elapsed since the first message was sent, the second node marks the first node as a suspected fault;
each of the plurality of other nodes sends a first message to the first node in response to the request message; and
each of the plurality of other nodes marks the first node as a suspected fault if it has not received a response message from the first node once a second threshold time has elapsed since it sent the first message.
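The per-node behaviour in the methods above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Probe` record, the `step` function, and the action strings are hypothetical names introduced here, and the 2 s and 3 s thresholds are the example values given in this text. The `active` flag reflects the variant in which only a node that sent the first message on its own initiative broadcasts the request message.

```python
from dataclasses import dataclass

FIRST_THRESHOLD = 2.0   # seconds; example value from the text
SECOND_THRESHOLD = 3.0  # seconds; example value from the text


@dataclass
class Probe:
    """State a node keeps for one first message it sent (hypothetical type)."""
    target: str
    sent_at: float
    active: bool                 # True if sent on the node's own initiative
    answered: bool = False
    broadcast_done: bool = False
    suspected: bool = False


def step(probe: Probe, now: float) -> list[str]:
    """Return the actions that become due at time `now` for one probe."""
    actions: list[str] = []
    if probe.answered:
        return actions
    elapsed = now - probe.sent_at
    # Assumption: only an actively sent probe triggers the broadcast request.
    if probe.active and not probe.broadcast_done and elapsed >= FIRST_THRESHOLD:
        probe.broadcast_done = True
        actions.append(f"broadcast request: ping {probe.target} now")
    # Both active and passive senders mark PFAIL after the second threshold.
    if not probe.suspected and elapsed >= SECOND_THRESHOLD:
        probe.suspected = True
        actions.append(f"mark {probe.target} as PFAIL")
    return actions
```

For a probe sent actively at 0 s, `step` yields the broadcast at 2 s and the PFAIL mark at 3 s; a probe sent at 2 s in response to a request yields only the PFAIL mark, at 5 s.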
An embodiment of the application discloses a device for identifying a fault node in a Redis cluster, comprising: a sending module, configured to actively send a first message to a first node in the Redis cluster;
a receiving module, wherein, when the receiving module does not receive a response message from the first node to the first message within a first threshold time from the time the sending module sent the first message, a request message is broadcast to other nodes in the Redis cluster, the request message requesting the other nodes to immediately send the first message to the first node,
or, in response to the receiving module receiving a request message broadcast by another node, the sending module immediately sends a first message to the first node; and
a processing module, configured to mark the first node as a suspected fault when the receiving module has not received a response message from the first node once a second threshold time has elapsed since the sending module sent the first message.
The embodiment of the application discloses identification equipment for a fault node in a Redis cluster, which is characterized by comprising a memory and a processor, wherein the memory stores computer executable instructions;
the instructions, when executed by the processor, cause the apparatus to implement a method of identifying a failed node in a Redis cluster according to any of claims 1-9.
An embodiment of the present application discloses a computer-readable medium, characterized in that instructions are stored on the computer storage medium, which instructions, when run on a computer, cause the computer to perform the method of identification of a faulty node in a Redis cluster according to any of claims 1-9.
Compared with the prior art, the implementation mode of the application has the main differences and the effects that:
In this application, if no response message from the first node to the first message has been received once a first threshold time has elapsed since the first message was sent, a request message is broadcast to other nodes in the Redis cluster, requesting them to immediately send the first message to the first node. This shortens the time between a node failing and the normal nodes in the cluster sending ping messages to the failed node.
In the present application, a node immediately sends a first message to the first node in response to receiving a request message broadcast by another node. This speeds up the marking of the failed node as PFAIL by more than half of the nodes, which in turn accelerates convergence of the node fault cluster view.
In this application, when the first node is marked as a suspected fault, a notification message indicating this is sent to a predetermined number of other nodes. Because the first message carries the state information of each node, once the failed node is marked as PFAIL the PFAIL state information spreads rapidly through the cluster and hastens the marking of the failed node as FAIL; that is, convergence of the node fault cluster view is accelerated.
In the present application, the first threshold time is less than or equal to the second threshold time; for example, the first threshold time is 2 seconds and the second threshold time is 3 seconds. As a result, the time from a node failing until the normal nodes in the cluster send ping messages to it can be kept to about 2 s, and the time from a normal node sending a ping message to the failed node until that normal node marks it as PFAIL is reduced to about 3 s, which accelerates convergence of the node fault cluster view.
In the present application, if no response message from the first node to the first message has been received once a second threshold time has elapsed since the first message was sent, the first node is marked as a suspected fault and, within a predetermined time period thereafter, a third threshold time greater than the second threshold time is used in place of the second threshold time. Setting the second threshold time to a small value accelerates convergence of the node fault cluster view but makes misjudgment more likely. Therefore, the first time within a given period that no response message arrives within the second threshold time, the node is treated as a suspected fault; if the same event recurs frequently during the following period, it is probably caused by network jitter and is not treated as a suspected fault, and the larger third threshold time is used instead of the second threshold time to identify suspected faults. This avoids misjudgment and improves the availability of the cluster.
Drawings
Fig. 1 shows an exemplary schematic diagram of a Proxy layer and a Redis cluster of an implementation scenario according to an embodiment of the application.
Fig. 2A shows a flowchart of a method for identifying a failed node in a Redis cluster according to an embodiment of the present application.
Fig. 2B shows a flowchart of a method for identifying a failed node in a Redis cluster according to an embodiment of the present application.
Fig. 3 illustrates an apparatus for identifying a failed node in a Redis cluster according to an embodiment of the present application.
Fig. 4 illustrates an identification device of a failed node in a Redis cluster according to an embodiment of the present application.
Fig. 5 shows a timeline diagram of a method of identifying a failed node in a Redis cluster according to an embodiment of the application.
Detailed Description
The present application is further described with reference to the following detailed description and the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. In addition, for convenience of description, only a part of structures or processes related to the present application, not all of them, is illustrated in the drawings. It should be noted that in this specification, like reference numerals and letters refer to like items in the following drawings.
It will be understood that, although the terms "first", "second", etc. may be used herein to describe various features, these features should not be limited by these terms. These terms are used merely for distinguishing and are not intended to indicate or imply relative importance. For example, a first feature may be termed a second feature, and, similarly, a second feature may be termed a first feature, without departing from the scope of example embodiments.
In the description of the present application, it is also to be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present embodiment can be understood in specific cases by those of ordinary skill in the art.
Illustrative embodiments of the present application include, but are not limited to, methods, apparatus, devices, and media for identification of failed nodes in a Redis cluster.
Various aspects of the illustrative embodiments will be described using terms commonly employed by those skilled in the art to convey the substance of their work to others skilled in the art. It will be apparent, however, to one skilled in the art that some alternative embodiments may be practiced using the features described in part. For purposes of explanation, specific numbers and configurations are set forth in order to provide a more thorough understanding of the illustrative embodiments. It will be apparent, however, to one skilled in the art that alternative embodiments may be practiced without the specific details. In some other instances, well-known features are omitted or simplified in order not to obscure the illustrative embodiments of the present application.
Moreover, various operations will be described as multiple operations separate from one another in a manner that is most helpful in understanding the illustrative embodiments; however, the order of description should not be construed as to imply that these operations are necessarily order dependent, and that many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when the described operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
References in the specification to "one embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature is described in connection with a particular embodiment, the knowledge of one skilled in the art can affect such feature in combination with other embodiments, whether or not such embodiments are explicitly described.
The terms "comprising," "having," and "including" are synonymous, unless the context dictates otherwise. The phrase "A and/or B" means "(A), (B) or (A and B)".
As used herein, the term "module" may refer to, be a part of, or include: memory (shared, dedicated, or group) for executing one or more software or firmware programs, an Application Specific Integrated Circuit (ASIC), an electronic circuit and/or processor (shared, dedicated, or group), a combinational logic circuit, and/or other suitable components that provide the described functionality.
In the drawings, some features of the structures or methods may be shown in a particular arrangement and/or order. However, it should be understood that such specific arrangement and/or ordering is not required. Rather, in some embodiments, these features may be described in a manner and/or order different from that shown in the illustrative figures. Additionally, the inclusion of structural or methodical features in a particular figure does not imply that all embodiments need to include such features, and in some embodiments, may not include such features or may be combined with other features.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Fig. 1 shows an exemplary schematic diagram of a Proxy layer and a Redis cluster of an implementation scenario according to an embodiment of the application.
As shown in Fig. 1, an implementation scenario according to an embodiment of the present application includes a Proxy layer 102 and a Redis cluster 104, where the Redis cluster 104 includes nodes A to F. The Proxy layer 102 and the nodes A to F may each be deployed on different servers, or on different virtual machines of one or more servers. The servers may be independent physical servers, a server cluster or distributed system composed of a plurality of servers, or cloud servers that provide basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms, but are not limited thereto.
The communication mechanism of the Redis cluster 104 is as follows: at intervals (e.g., 100 ms), a node in the cluster 104 sends a ping message to other nodes in the cluster 104, carrying the addresses, slots, status information, last communication time, etc. of the other nodes it knows about. A node that receives the ping message replies with a pong message, which likewise carries information about the other nodes known to that node. By exchanging state information in this way, the nodes detect one another's states, such as: online, suspected offline (or suspected fault) (PFAIL), and offline (or fault) (FAIL).
If any node (e.g., node A) has not received a pong message from another node (e.g., node D) once a certain time (the threshold node_timeout) has elapsed since node A sent a ping message to node D, node A marks node D as PFAIL. When node A then sends ping messages to other nodes, those messages carry information about other nodes in the cluster 104, including the information that node D is marked as PFAIL. When node A learns from a received ping or pong message that node C has marked node D as PFAIL, node A finds the clusterNode structure corresponding to node D in the nodes dictionary of its own clusterState and adds node C's offline report about node D to the fail_reports linked list of that clusterNode structure. When node A has received reports that more than half (or another threshold number) of the nodes in cluster 104 have marked node D as PFAIL, node A marks node D as FAIL. That is, for node A to mark node D as FAIL, two conditions must be satisfied: 1) more than half (or another threshold number) of the nodes have marked node D as PFAIL, and 2) node A itself has marked node D as PFAIL. It can be seen that there is a convergence process between the moment a node fails and the moment the normal nodes in the cluster 104 mark the failed node as FAIL; the duration of this process is the node fault cluster view convergence time.
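The two conditions for promoting PFAIL to FAIL can be expressed as a small Python predicate. This is a sketch under stated assumptions, not Redis source code: `should_mark_fail` and its parameters are hypothetical names, "more than half" is taken as `cluster_size // 2 + 1`, and the local node's own PFAIL mark is assumed to count as one vote alongside the reports collected from other nodes.

```python
def should_mark_fail(local_pfail: bool, reporters: set[str], cluster_size: int) -> bool:
    """Decide whether the local node may promote a peer from PFAIL to FAIL.

    local_pfail  -- the local node has itself marked the peer as PFAIL
    reporters    -- names of other nodes whose PFAIL reports were recorded
                    (the role played by the fail_reports list in the text)
    cluster_size -- total number of nodes in the cluster
    """
    if not local_pfail:
        return False                       # condition 2 is not met
    needed = cluster_size // 2 + 1         # "more than half" of the nodes
    votes = len(reporters) + 1             # assumption: own mark counts as a vote
    return votes >= needed                 # condition 1
```

With six nodes, a node that has marked the peer PFAIL and holds reports from three other nodes reaches the four votes needed; with only two reports, or without its own PFAIL mark, it does not.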
In a scenario where the server uses the Redis cluster 104 as a database, the Proxy layer 102 can be provided so that a client can access the Redis cluster 104 through a proxy of the Proxy layer 102, just like accessing a standalone Redis database, without needing to know information such as the number of nodes in the Redis cluster 104 or node updates. The Proxy layer 102 learns of node faults in the Redis cluster 104 through a fixed node in the cluster (which may be any node in the cluster 104; node E is used as the example in the figure). For example, after node D in the Redis cluster 104 fails, the information that node D has failed spreads through the Redis cluster 104 via cluster view convergence; this includes node E marking node D as FAIL, after which node E reports the FAIL state of node D to the Proxy layer 102. Thus, when a client requests access to the Redis cluster 104 through the Proxy layer 102, the Proxy layer 102 avoids node D. It is to be understood that the nodes of the Redis cluster 104 are not limited to A to F, and their number may be arbitrary.
In the existing Gossip-based fault detection scheme of a Redis cluster, if the cluster 104 is large and has many nodes, the probability that the failed node is sent a ping message becomes small, and at the latest a ping message is sent to the failed node only once more than node_timeout/2 has elapsed. Therefore, when a node fails, a normal node in the cluster 104 does not immediately ping the failed node to verify the fault, even if it receives a message that other nodes have marked that node as a suspected fault. It consequently takes a long time for more than half of the nodes to mark the failed node as PFAIL, which lengthens the time from the node failing to the failed node being marked as FAIL by other nodes. Because the Proxy layer 102 learns of node faults in the cluster 104 through messages reported by the fixed node in the cluster 104, during the node fault cluster view convergence time the Proxy layer 102 cannot tell that the failed node has failed and still sends requests to it as if it were a normal node, causing those requests to fail.
In view of this problem, an embodiment of the present application provides a method for identifying a fault node in a Redis cluster, performed by a node in the Redis cluster. With reference to Fig. 1, an example according to some embodiments of the present application is as follows. The Proxy layer 102 and the nodes of the Redis cluster 104 may each be deployed on multiple devices or multiple virtual machines (which may be in one or more servers). Suppose a node in the cluster, e.g., node D, fails (e.g., the device hosting its virtual machine fails). At time t0 (e.g., taken as 0 s), node A actively sends a first message to node D according to the Gossip protocol. Normally, node D would reply to node A with a response message after receiving the first message; however, node A has not received a response message from node D once a first threshold time Δt1 (e.g., 2 s) has elapsed. Node A then immediately (e.g., within 5 ms, depending on hardware and network conditions) broadcasts a request message to a plurality (e.g., half of the total number of cluster nodes) of other nodes B, C, E, F in the cluster, requesting them to immediately send a first message to node D to verify whether node D has failed. Upon receiving the request, nodes B, C, E, F send a first message to node D at time t0 + Δt1 (0 s + 2 s = 2 s). Note that "immediately" above may mean, for example, within a predetermined period set according to the minimum delay achievable under the hardware and network conditions; this minimum is usually extremely small, so the predetermined period may be regarded as "immediately". "Active" may mean, for example, that each node in the Redis cluster, following the Gossip protocol, autonomously selects one or more other nodes according to a certain rule (e.g., randomly) at intervals (e.g., 1 s, 100 ms, etc.) and sends them a first message.
In contrast, "passive" may mean, for example, that a node in the Redis cluster, in response to receiving a request message broadcast by another node, sends a first message to the specific node named in that request message.
When a second threshold time Δt2 (e.g., 3 s) has elapsed since node A sent the first message to node D at time t0 and node A still has not received a response message from node D, node A marks node D as PFAIL and immediately sends a notification message to a predetermined number of randomly selected nodes in the cluster, for example to nodes E and H. The notification message, which may be the first message, includes the information that the first node is marked as a suspected fault. The time at this point is t0 + Δt2 (0 s + 3 s = 3 s).
Similarly, when the second threshold time Δt2 has elapsed since nodes B, C, E, and F sent the first message at time t0 + Δt1 and they still have not received a response message from node D, nodes B, C, E, and F mark node D as PFAIL and each immediately sends notification messages to a predetermined number of randomly selected nodes in the cluster; for example, node B sends notification messages to nodes A and E, node C to nodes B and E, node E to nodes F and G, and node F to nodes E and H. The time at this point is t0 + Δt1 + Δt2 (0 s + 2 s + 3 s = 5 s). A timeline diagram of the above events according to this embodiment is shown in Fig. 5.
Thus, at time t0 + Δt1 + Δt2 (0 s + 2 s + 3 s = 5 s), consider node E specifically: it has marked node D as PFAIL and has also received notification messages from nodes A, B, C, and F, so it knows that nodes A, B, C, and F have also marked node D as PFAIL; node E therefore marks node D as FAIL. The Proxy layer 102 can then learn through node E that node D has failed.
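The timing of this example can be reproduced with a short Python sketch. The `timeline` function and its event labels are hypothetical, and the sketch assumes, as the example does, that message transit and processing delays are negligible.

```python
def timeline(t0: float, dt1: float, dt2: float) -> list[tuple[str, float]]:
    """List the example's events with their times, ignoring network delay."""
    return [
        ("A actively pings D", t0),
        ("A broadcasts request; B, C, E, F ping D", t0 + dt1),
        ("A marks D PFAIL and sends notifications", t0 + dt2),
        ("B, C, E, F mark D PFAIL and notify; E marks D FAIL", t0 + dt1 + dt2),
    ]


# Example values from the text: t0 = 0 s, first threshold 2 s, second threshold 3 s.
for event, when in timeline(0.0, 2.0, 3.0):
    print(f"{when:4.1f} s  {event}")
```

Under these assumptions the last event falls at t0 + Δt1 + Δt2, i.e. 5 s with the example values.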
In some embodiments, a node that sent the first message to the failed node in response to a request message from another node does not itself broadcast a request message when no response message arrives from the failed node within the first threshold time. This prevents the nodes in the cluster from endlessly sending request messages to one another and first messages to the failed node.
It will be understood by those skilled in the art that, in the present application, the first message may be a ping message or any other message that requests a response; the corresponding response message is a pong message; and the notification message may be a ping message or any other message that carries the information that a node is marked as a suspected fault.
The method and device of the present application shorten the time from a node failing to the normal nodes in the cluster sending ping messages to the failed node, and speed up the marking of the failed node as PFAIL by more than half of the nodes. At the same time, the PFAIL state information spreads rapidly through the cluster, hastening the marking of the failed node as FAIL; that is, convergence of the node fault cluster view is accelerated. The shorter cluster convergence time effectively reduces the number of failed requests at the Proxy layer and improves the stability and availability of the cluster.
According to some embodiments of the present application, when a node is marked as a suspected fault, a notification message is sent to a predetermined number of other nodes; when the cluster is large, the "predetermined number" may be, for example, 5. In general, even if the cluster has many nodes, once a normal node has marked a failed node as PFAIL and actively sent ping messages to 5 nodes, the information that the failed node is marked as PFAIL spreads rapidly thanks to the way information propagates under the Gossip protocol.
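Why a fan-out of 5 spreads information quickly can be seen in a toy push-gossip model, sketched below in Python. This is an illustrative simplification and not the scheme of the application: it assumes every node that already knows about the PFAIL mark forwards it to `fanout` randomly chosen peers in each round, and `rounds_to_inform` and its parameters are hypothetical names.

```python
import random


def rounds_to_inform(n: int, fanout: int = 5, seed: int = 0, max_rounds: int = 1000) -> int:
    """Rounds until all n nodes have heard the PFAIL mark (toy model)."""
    rng = random.Random(seed)           # seeded for repeatable runs
    informed = {0}                      # node 0 is the first to mark PFAIL
    rounds = 0
    while len(informed) < n and rounds < max_rounds:
        newly = set()
        for sender in informed:
            others = [i for i in range(n) if i != sender]
            # Each informed node notifies up to `fanout` distinct random peers.
            newly.update(rng.sample(others, min(fanout, len(others))))
        informed |= newly
        rounds += 1
    return rounds
```

In this model the number of informed nodes can grow by a factor of up to six per round, so the number of rounds grows roughly logarithmically with cluster size; a six-node cluster is fully informed after a single round.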
According to some embodiments of the present application, the second threshold time is greater than or equal to the first threshold time, and preferably greater than it; for example, the first threshold time is 2 seconds and the second threshold time is 3 seconds. In this way, once the first normal node has sent a ping message to the failed node, the time needed for most normal nodes in the cluster to send ping messages to the failed node can be kept to about 2 s. Under the existing Gossip protocol, the time from a normal node sending a ping message to the failed node until that normal node marks the failed node as PFAIL is typically 15 s; the present application reduces it to about 3 s, accelerating convergence of the node fault cluster view.
According to some embodiments of the present application, the request message is broadcast to more than half of the other nodes in the Redis cluster. For a normal node to mark the failed node as FAIL, two conditions must be met: 1) the normal node itself has marked the failed node as PFAIL; and 2) more than half of the nodes in the cluster have marked the failed node as PFAIL. Broadcasting the request in this way quickly prompts more than half of the nodes in the cluster to mark the failed node as PFAIL.
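The two conditions above can be written as a small predicate. This is a minimal sketch under stated assumptions: the name `can_mark_fail` is hypothetical, the local node's own PFAIL marking is assumed to count toward the majority, and `cluster_size` is assumed to be the total number of nodes in the cluster.

```python
from typing import Set


def can_mark_fail(local_node: str, pfail_reporters: Set[str], cluster_size: int) -> bool:
    """Return True when a node may upgrade a PFAIL marking to FAIL.

    pfail_reporters: nodes known to have marked the suspect as PFAIL
    (assumed to include the local node once it has done so).
    """
    # Condition 1: the local node itself has marked the suspect as PFAIL.
    locally_suspected = local_node in pfail_reporters
    # Condition 2: strictly more than half of the cluster reports PFAIL.
    majority_suspects = 2 * len(pfail_reporters) > cluster_size
    return locally_suspected and majority_suspects
```

In an eight-node cluster, five PFAIL reports that include the local node would satisfy both conditions, whereas four reports would not.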
According to some embodiments of the present application, when no response message from the first node to the first message has been received after the second threshold time has elapsed from the time the first message was sent, the first node is marked as a suspected fault and, for a predetermined time period, the second threshold time is replaced with a third threshold time that is longer than the second threshold time. Preferably, the predetermined time period is 24 hours, the second threshold time is 3 seconds, and the third threshold time is 15 seconds.
An example according to this embodiment is as follows. While the first node is marked as a suspected fault, the suspected-fault marking is cancelled as soon as any message is received from the first node, or once a predetermined number (e.g., half of the total number of nodes in the cluster) of normal markings for the first node have been received from other nodes. Each marking and each cancellation of a suspected fault is recorded together with its time.
With reference to Fig. 1, for example, at time t0 (e.g., denoted as 0 s), node A sends a first message to node D. If node A has not received a response message from node D after the second threshold time Δt2 (e.g., 3 s) has elapsed, node A immediately marks node D as PFAIL, immediately sends notification messages to a predetermined number of other nodes, and records the time of the marking, t0 + Δt2 (0 s + 3 s = 3 s). If, within the subsequent predetermined time period Δt3 (e.g., 24 h, i.e., 86,400 s), that is, within the interval [t0 + Δt2, t0 + Δt2 + Δt3] (e.g., [3 s, 86,403 s]), node A receives any message (e.g., a ping message, pong message, request message, or notification message) from node D, node A cancels the suspected-fault marking of node D and records the time of the cancellation. If, after that, node A sends a first message to node D and has not received a response message from node D after the second threshold time Δt2, this is attributed to network jitter and node D is not marked as PFAIL. During this period, while node D is not marked as a suspected fault, node A marks node D as PFAIL and immediately sends notification messages to a predetermined number of other nodes only if it has not received a response message from node D after a third threshold time Δt4 (e.g., 15 s) has elapsed since it sent the first message.
Once time t0 + Δt2 + Δt3 (e.g., 3 s + 86,400 s = 86,403 s) has been reached, if node D is not marked as a suspected fault and node A sends a first message to node D without receiving a response message from node D after the second threshold time Δt2 (e.g., 3 s), node A again immediately marks node D as PFAIL, immediately sends notification messages to a predetermined number of other nodes, and records the time of the marking. The process described above then repeats.
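The threshold switching in this example can be sketched as a small state tracker. It is a sketch under assumptions, not the application's implementation: the class and method names are hypothetical, the applicable threshold is assumed to be chosen from the time the probe was sent, and only the "any message received" cancellation trigger is modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SuspicionTracker:
    second_threshold: float = 3.0      # Δt2, fast detection threshold
    third_threshold: float = 15.0      # Δt4, conservative threshold
    window: float = 86_400.0           # Δt3, period started by a fast marking
    suspected: bool = False
    window_start: Optional[float] = None
    log: List[Tuple[float, str]] = field(default_factory=list)

    def _in_window(self, t: float) -> bool:
        return self.window_start is not None and t < self.window_start + self.window

    def threshold_at(self, sent_at: float) -> float:
        """Threshold that applies to a probe sent at `sent_at` (an assumption)."""
        return self.third_threshold if self._in_window(sent_at) else self.second_threshold

    def on_no_response(self, sent_at: float, now: float) -> bool:
        """Report that a probe sent at `sent_at` is still unanswered at `now`.

        Returns True if the peer is newly marked as a suspected fault (PFAIL).
        """
        if self.suspected or now - sent_at < self.threshold_at(sent_at):
            return False
        if not self._in_window(sent_at):
            self.window_start = now    # a fast marking starts a new period
        self.suspected = True
        self.log.append((now, "mark"))
        return True

    def on_message(self, now: float) -> None:
        """Any message from the peer cancels the suspected-fault marking."""
        if self.suspected:
            self.suspected = False
            self.log.append((now, "clear"))
```

Using the figures from the example, a probe sent at 0 s and unanswered at 3 s produces a marking; after a cancellation, a probe sent at 100 s is only treated as a suspected fault once 15 s have passed; a probe sent after 86,403 s is again judged against the 3 s threshold.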
In the present application, the second threshold time may be set to a small value, which speeds up the convergence of the cluster view after a node failure but makes misjudgment more likely. Therefore, the first time within a period that no response message from the first node to the first message is received after the second threshold time has elapsed from the time the first message was sent, the node is treated as a suspected fault. If the same event recurs frequently later in that period, it may be caused by network jitter, so the node is not treated as a suspected fault on that basis; instead, a third threshold time, larger than the second threshold time, is used in place of the second threshold time to identify suspected faults. This avoids misjudgment and improves the availability of the cluster.
Figs. 2A and 2B show flowcharts of a method according to an embodiment of the present application. With reference to Fig. 1, the method shown in the flowchart of Fig. 2A is performed by a node that actively sends a ping message to a first node in a Redis cluster, and includes the following steps.
In step 202, a normal node in the Redis cluster sends a ping message to a first node in the Redis cluster; for example, at time t0 (e.g., denoted as 0 s), node A sends a ping message to node D according to the Gossip protocol.
Normally, node D should reply to node A with a pong message after receiving the ping message. If node D has failed (for example, the device hosting its virtual machine has failed), node D cannot reply to node A with a pong message. In step 204, the node has not received the pong message returned by the first node after the first threshold time; for example, node A has not received a pong message from node D after the first threshold time Δt1 (e.g., 2 s).
In step 206, the node broadcasts a request message to other nodes in the Redis cluster, the request message requesting the other nodes to send ping messages to the first node; for example, having received no pong message from node D after 2 seconds, node A broadcasts a request message to the other nodes B, C, E, and F in the cluster, requesting them to send ping messages to node D. On receiving the request, nodes B, C, E, and F immediately send ping messages to node D, at time t0 + Δt1 (0 s + 2 s = 2 s).
Optionally, the request message is broadcast to a plurality of nodes in the cluster, which may be all of the nodes or only some of them, for example more than half of the nodes in the cluster.
In step 208, the node has not received the pong message returned by the first node after the second threshold time; for example, the second threshold time Δt2 (e.g., 3 s) has elapsed since node A sent the ping message to node D at time t0, and node A has still not received a pong message from node D.
In step 210, the node then marks the first node as a suspected fault. For example, node A marks node D as PFAIL and then immediately sends ping messages to a predetermined number of randomly selected nodes in the cluster, e.g., to nodes E and H, at time t0 + Δt2 (0 s + 3 s = 3 s). Similarly, when the second threshold time Δt2 has elapsed since nodes B, C, E, and F sent the first message at time t0 + Δt1 and they have not received a pong message from node D, nodes B, C, E, and F mark node D as PFAIL and immediately send ping messages to a predetermined number of randomly selected nodes in the cluster. For example, node B sends ping messages to nodes A and E, node C sends ping messages to nodes B and E, node E sends ping messages to nodes F and G, and node F sends ping messages to nodes E and H; the time at this point is t0 + Δt1 + Δt2 (0 s + 2 s + 3 s = 5 s).
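The timing arithmetic of steps 202-210 can be checked with a short sketch. The function name `detection_timeline` and its parameters are hypothetical; the default values are the example figures from the text (t0 = 0 s, Δt1 = 2 s, Δt2 = 3 s).

```python
from typing import List, Sequence, Tuple


def detection_timeline(
    t0: float = 0.0,
    first_threshold: float = 2.0,    # Δt1
    second_threshold: float = 3.0,   # Δt2
    initiator: str = "A",
    suspect: str = "D",
    helpers: Sequence[str] = ("B", "C", "E", "F"),
) -> List[Tuple[float, str]]:
    """Return (time, event) pairs for the example when the suspect never replies."""
    if second_threshold < first_threshold:
        raise ValueError("second threshold must be >= first threshold")
    names = ", ".join(helpers)
    t_request = t0 + first_threshold           # step 206: request broadcast
    t_initiator_pfail = t0 + second_threshold  # step 210: initiator marks PFAIL
    t_helpers_pfail = t_request + second_threshold
    return [
        (t0, f"{initiator} pings {suspect}"),
        (t_request, f"{initiator} broadcasts request; {names} ping {suspect}"),
        (t_initiator_pfail, f"{initiator} marks {suspect} as PFAIL"),
        (t_helpers_pfail, f"{names} mark {suspect} as PFAIL"),
    ]
```

With the default values the events fall at 0 s, 2 s, 3 s, and 5 s, matching the times given in the example.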
Steps 202-210 above describe the method for identifying a failed node according to the present application, taking as an example node A finding node D to be a suspected failed node. How the other nodes in this example, which send ping messages to the first node in response to the received request message, perform the method for identifying a failed node according to the present application is described below with reference to Fig. 2B.
In step 212, a normal node in the Redis cluster receives a request message broadcast by another node. For example, at time t0 (e.g., denoted as 0 s), node A sends a ping message to node D according to the Gossip protocol. Normally, node D should reply to node A with a pong message after receiving the ping message; if node D has failed (for example, the device hosting its virtual machine has failed), node D cannot reply to node A with a pong message. For example, node A has not received a pong message from node D after the first threshold time Δt1 (e.g., 2 s). Node A immediately broadcasts a request message to a plurality of other nodes in the cluster (e.g., half of the total number of cluster nodes), requesting them to send ping messages to node D, and node B receives the broadcast request message.
In step 214, the node sends a ping message to the first node; for example, node B receives the request message and immediately sends a ping message to node D, at time t0 + Δt1 (0 s + 2 s = 2 s).
In step 216, the node has not received the pong message returned by the first node after the second threshold time. For example, node B has not received a pong message from node D once the second threshold time Δt2 has elapsed after t0 + Δt1.
In step 218, the node then marks the first node as a suspected fault. For example, node B marks node D as PFAIL and then immediately sends ping messages to a predetermined number of randomly selected nodes in the cluster, e.g., to nodes C and E, at time t0 + Δt1 + Δt2 (0 s + 2 s + 3 s = 5 s).
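Steps 212-218 can be sketched from the point of view of a node that receives the broadcast request. This is an illustrative sketch only: the class `Responder`, its methods, and the in-memory `outbox` are hypothetical stand-ins for the cluster bus, and time is passed in explicitly so that the logic can be followed without real timers.

```python
from typing import Dict, List, Set, Tuple


class Responder:
    """A node that pings a suspect when asked to by another node."""

    def __init__(self, name: str, second_threshold: float = 3.0) -> None:
        self.name = name
        self.second_threshold = second_threshold   # Δt2
        self.pending: Dict[str, float] = {}        # suspect -> time ping was sent
        self.pfail: Set[str] = set()               # nodes marked as suspected faults
        self.outbox: List[Tuple[float, str, str]] = []  # (time, destination, type)

    def on_request(self, now: float, suspect: str) -> None:
        # Steps 212 and 214: on a broadcast request, ping the suspect immediately.
        if suspect in self.pending:
            return  # a probe is already outstanding
        self.outbox.append((now, suspect, "ping"))
        self.pending[suspect] = now

    def on_pong(self, now: float, sender: str) -> None:
        # A reply cancels the outstanding probe and any suspected-fault marking.
        self.pending.pop(sender, None)
        self.pfail.discard(sender)

    def tick(self, now: float) -> None:
        # Steps 216 and 218: mark PFAIL once Δt2 has passed without a pong.
        for suspect, sent_at in list(self.pending.items()):
            if now - sent_at >= self.second_threshold:
                del self.pending[suspect]
                self.pfail.add(suspect)
```

With the example figures, node B receives the request at 2 s, pings node D at once, and marks node D as PFAIL at 5 s if no pong has arrived. Sending the follow-up messages to a predetermined number of randomly selected nodes would happen after the marking and is left out of this sketch.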
In the present application, after a node sends a ping message to a first node in the Redis cluster and has not received the pong message returned by the first node after the first threshold time, it broadcasts a request message to other nodes in the Redis cluster, requesting them to send ping messages to the first node. This shortens the time between a node failing and the normal nodes in the cluster sending ping messages to the failed node. A node that receives the request message sends a ping message to the first node, so that more than half of the nodes mark the failed node as PFAIL sooner, which in turn accelerates the convergence of the cluster view after a node failure. A shorter cluster convergence time effectively reduces the number of erroneous requests at the proxy layer and improves the stability and availability of the cluster.
Fig. 3 illustrates an apparatus for identifying a failed node in a Redis cluster according to an embodiment of the present application.
As shown in fig. 3, the apparatus 300 includes:
a sending module 302, which actively sends a first message to a first node in the Redis cluster;
a receiving module 304, wherein, when the receiving module 304 has not received a response message from the first node to the first message after a first threshold time has elapsed from the time the sending module 302 sent the first message, the sending module 302 broadcasts a request message to other nodes in the Redis cluster, the request message being used to request the other nodes to immediately send the first message to the first node,
or, in response to the receiving module 304 receiving a request message broadcast by another node, the sending module 302 immediately sends a first message to the first node; and
a processing module 306, wherein, when the receiving module 304 has not received a response message from the first node after a second threshold time has elapsed from the time the sending module 302 sent the first message, the processing module 306 marks the first node as a suspected fault.
The first embodiment is the method embodiment corresponding to the present embodiment, and the present embodiment can be implemented in cooperation with the first embodiment. The technical details described in the first embodiment remain valid in the present embodiment and are not repeated here. Correspondingly, the technical details described in the present embodiment can also be applied to the first embodiment.
Fig. 4 illustrates an identification device of a failed node in a Redis cluster according to an embodiment of the present application.
As shown in Fig. 4, the apparatus 400 includes:
a memory 402 for storing computer-executable instructions; and
a processor 404 for executing the instructions to implement any one of the possible methods of the first embodiment described above.
The first embodiment is the method embodiment corresponding to the present embodiment, and the present embodiment can be implemented in cooperation with the first embodiment. The technical details described in the first embodiment remain valid in the present embodiment and are not repeated here. Correspondingly, the technical details described in the present embodiment can also be applied to the first embodiment.
Specifically, as shown in Fig. 4, the apparatus 400 may include one or more memories 402 (only one is shown) and a processor 404 (the processor 404 may include, but is not limited to, a processing device such as a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a microcontroller unit (MCU), or a field-programmable gate array (FPGA)). The specific connection medium between the memory 402 and the processor 404 is not limited in the embodiments of the present application. In the embodiment of the present application, the memory 402 and the processor 404 are connected by a bus 406, which is represented by a thick line in Fig. 4; the connections between other components are shown for illustration only and are not limiting. The bus 406 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 4, but this does not mean that there is only one bus or one type of bus. Those skilled in the art will understand that the structure shown in Fig. 4 is only an illustration and does not limit the structure of the electronic device. For example, the apparatus 400 may include more or fewer components than shown in Fig. 4, or have a different configuration from that shown in Fig. 4.
The processor 404 executes various functional applications and data processing by running software programs and modules stored in the memory 402, that is, implements the above-mentioned identification method of a failed node in a Redis cluster.
Memory 402 may be used to store program instructions/modules corresponding to the method for identifying a failed node in a Redis cluster as in some embodiments of the present application that are executed by processor 404. The memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 402 may further include memory located remotely from the processor 404, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to some embodiments of the present application, a computer storage medium is disclosed, having stored thereon instructions that, when executed on a computer, cause the computer to perform any one of the possible methods of the first embodiment described above.
The first embodiment is the method embodiment corresponding to the present embodiment, and the present embodiment can be implemented in cooperation with the first embodiment. The technical details described in the first embodiment remain valid in the present embodiment and are not repeated here. Correspondingly, the technical details described in the present embodiment can also be applied to the first embodiment.
In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented in the form of instructions or programs carried on or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors or the like. When the instructions or program are executed by a machine, the machine may perform the various methods described previously. For example, the instructions may be distributed via a network or other computer readable medium. Thus, a machine-readable medium may include, but is not limited to, any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), such as floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), magneto-optical disks, read-only memories (ROMs), Random Access Memories (RAMs), erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or flash memory or tangible machine-readable memory for transmitting network information via electrical, optical, acoustical or other forms of signals (e.g., carrier waves, infrared signals, digital signals, etc.). Thus, a machine-readable medium includes any form of machine-readable medium suitable for storing or transmitting electronic instructions or machine (e.g., a computer) readable information.
While the embodiments of the present application have been described in detail with reference to the accompanying drawings, the application of the present application is not limited to the various applications mentioned in the embodiments of the present application, and various structures and modifications can be easily implemented with reference to the present application to achieve various advantageous effects mentioned herein. Variations that do not depart from the gist of the disclosure are intended to be within the scope of the disclosure.

Claims (12)

1. A method for identifying a failed node in a Redis cluster, performed by a node in the Redis cluster, comprising: actively sending a first message to a first node in the Redis cluster;
broadcasting a request message to other nodes in the Redis cluster when no response message from the first node to the first message has been received after a first threshold time has elapsed from the time of sending the first message, the request message being used to request the other nodes to immediately send the first message to the first node; and
marking the first node as a suspected fault when no response message from the first node has been received after a second threshold time has elapsed from the time of sending the first message.
2. A method for identifying a failed node in a Redis cluster, performed by a node in the Redis cluster, comprising: immediately sending a first message to a first node in response to receiving a request message broadcast by another node; and
marking the first node as a suspected fault when no response message from the first node to the first message has been received after a second threshold time has elapsed from the time of sending the first message.
3. The method according to claim 1 or 2, further comprising, in case the first node is marked as suspected to be faulty, sending a notification message to a predetermined number of other nodes, the notification message including that the first node is marked as suspected to be faulty.
4. The method of claim 3, further comprising:
marking the first node as failed in the event that more than a predetermined number of the notification messages of the other nodes are received.
5. The method of claim 3, wherein the predetermined number is 5.
6. The method of claim 1, wherein the first threshold time is less than or equal to the second threshold time, wherein the first threshold time is 2 seconds and the second threshold time is 3 seconds.
7. The method of claim 1 or 2, further comprising:
when no response message from the first node to the first message has been received after the second threshold time has elapsed from the time of sending the first message, marking the first node as a suspected fault and, within a predetermined time period, replacing the second threshold time with a third threshold time, wherein the third threshold time is greater than the second threshold time.
8. The method of claim 7, wherein the predetermined time period is 24 hours, the second threshold time is 3 seconds, and the third threshold time is 15 seconds.
9. A method for identifying a failed node in a Redis cluster comprising a plurality of nodes, characterized by comprising:
a second node actively sending a first message to a first node in the Redis cluster;
the second node broadcasting a request message to a plurality of other nodes in the Redis cluster when the second node has not received a response message from the first node to the first message after a first threshold time has elapsed from the time of sending the first message, the request message being used to request the other nodes to immediately send the first message to the first node, and the second node marking the first node as a suspected fault when the second node has not received the response message from the first node after a second threshold time has elapsed from the time of sending the first message;
each of the plurality of other nodes sending the first message to the first node in response to the request message; and
each of the plurality of other nodes marking the first node as a suspected fault when it has not received a response message from the first node after the second threshold time has elapsed from the time it sent the first message.
10. An apparatus for identifying a failed node in a Redis cluster, characterized by comprising:
a sending module, configured to actively send a first message to a first node in the Redis cluster;
a receiving module, wherein, when the receiving module has not received a response message from the first node to the first message after a first threshold time has elapsed from the time the sending module sent the first message, the sending module broadcasts a request message to other nodes in the Redis cluster, the request message being used to request the other nodes to immediately send the first message to the first node,
or, in response to the receiving module receiving a request message broadcast by another node, the sending module immediately sends the first message to the first node; and
a processing module, configured to mark the first node as a suspected fault when the receiving module has not received the response message from the first node after a second threshold time has elapsed from the time the sending module sent the first message.
11. An apparatus for identifying a failed node in a Redis cluster, the apparatus comprising a memory storing computer executable instructions and a processor;
the instructions, when executed by the processor, cause the apparatus to implement a method of identifying a failed node in a Redis cluster according to any of claims 1-9.
12. A computer-readable medium having stored thereon instructions which, when run on a computer, cause the computer to perform a method of identification of a failed node in a Redis cluster according to any of claims 1-9.
CN202111119656.1A 2021-09-24 2021-09-24 Method, device, equipment and medium for identifying fault node in Redis cluster Pending CN113783735A (en)

Publications (1)

Publication Number Publication Date
CN113783735A true CN113783735A (en) 2021-12-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination