CN116886574A - Fault detection method and device, home node, detection system and storage medium - Google Patents


Info

Publication number
CN116886574A
Authority
CN
China
Prior art keywords
alive message
node
path
bfd keep
bfd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310889768.8A
Other languages
Chinese (zh)
Inventor
梁静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Maipu Communication Technology Co Ltd
Original Assignee
Maipu Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Maipu Communication Technology Co Ltd filed Critical Maipu Communication Technology Co Ltd
Priority to CN202310889768.8A priority Critical patent/CN116886574A/en
Publication of CN116886574A publication Critical patent/CN116886574A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/10Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/22Alternate routing
    • H04L45/24Multipath
    • H04L45/247Multipath using M:N active or standby paths
    • H04L45/28Routing or path finding of packets in data switching networks using route fault recovery

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a fault detection method and device, a home node, a detection system, and a storage medium. When the home node determines that the opposite-end packet sending path has a path fault, it suspends periodic sending of the first BFD keep-alive message and cyclically performs fault detection on the other forwarding paths by sending a second BFD keep-alive message to the peer node. If a third BFD keep-alive message is received, a specific fault-free forwarding path is determined; if no third BFD keep-alive message is received within a preset detection duration, a communication fault between the home node and the peer node is determined. The specific forwarding path can thus be found promptly, or the inter-node communication fault confirmed once the preset detection duration has elapsed. This effectively avoids fault misjudgment caused by untimely route convergence, prevents the resulting suspension of communication between the nodes, and improves user experience.

Description

Fault detection method and device, home node, detection system and storage medium
Technical Field
The present invention relates to the field of communications, and in particular, to a fault detection method, a fault detection device, a home node, a detection system, and a storage medium.
Background
BFD (Bidirectional Forwarding Detection) is a high-speed fault detection mechanism based on the RFC 5880 standard that implements end-to-end fault detection. A BFD session is established between two nodes, and BFD messages are sent periodically along a path between them. If one party does not receive a BFD message within a set time, the BFD session state changes to Down and the path is considered faulty.
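As a rough illustration of the timeout rule just described, the following Python sketch marks a session Down when no BFD message arrives within a set time. The class name, the 0.3 s timeout, and the explicit clock values are assumptions made for illustration; RFC 5880 derives the real detection time from negotiated intervals and a detect multiplier, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class BfdSessionSketch:
    """Hypothetical, simplified session: only the receive timeout is modeled."""
    detect_time_s: float      # assumed "set time" after which the path is considered faulty
    last_rx_s: float = 0.0    # time at which the last BFD message was received
    state: str = "Up"

    def on_receive(self, now_s: float) -> None:
        # Any received BFD message refreshes the timer.
        self.last_rx_s = now_s
        self.state = "Up"

    def poll(self, now_s: float) -> str:
        # No message within the set time: the session state changes to Down.
        if now_s - self.last_rx_s > self.detect_time_s:
            self.state = "Down"
        return self.state


session = BfdSessionSketch(detect_time_s=0.3)
session.on_receive(now_s=1.0)
print(session.poll(now_s=1.2))  # still within the set time
print(session.poll(now_s=1.5))  # set time exceeded
```

With a single monitored path, this Down transition is all the session can report, which is the limitation the rest of the description addresses.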
BFD is divided into single-hop BFD detection (the two nodes communicate directly) and multi-hop BFD detection (the two nodes communicate through intermediate nodes). Multi-hop BFD is a network-wide unified detection mechanism used to quickly detect and monitor the forwarding connectivity of IP routes in the network. In a multi-hop BFD detection scenario, the route between the two end nodes may be a load route: the routing table of one end node contains multiple next-hop nodes for the destination address of the other end node, that is, more than one path exists between the two end nodes.
In the prior art, in a multi-hop BFD detection scenario with load routing, a BFD session is created between the two end nodes and a single path is used for fault detection. However, when a fault occurs at an intermediate point of that path, or the outbound interface at either end of the path oscillates (flaps), and the route has not yet converged, the BFD session directly determines that communication between the two end nodes has failed, so communication between them stops. Other usable paths are likely to exist between the two end nodes at that moment, so this is a fault misjudgment by the BFD session.
Moreover, when the route does not converge in time, the fault misjudgment of the BFD session stops communication between the nodes for too long, which affects user experience.
Disclosure of Invention
The invention aims to provide a fault detection method, a fault detection device, a local end node, a fault detection system and a storage medium, so as to solve the problems existing in the prior art.
Embodiments of the invention may be implemented as follows:
in a first aspect, the present invention provides a fault detection method, applied to a home node, where the home node is communicatively connected to a peer node and multiple forwarding paths exist; the method comprises the following steps:
periodically sending a first BFD keep-alive message to the opposite terminal node through a local terminal packet sending path;
when a first BFD keep-alive message sent by the opposite terminal node through an opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending the periodic sending of the first BFD keep-alive message; the local packet sending path and the opposite packet sending path are any forwarding path;
cyclically performing fault detection on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node;
When a third BFD keep-alive message of the opposite terminal node is received within a preset detection time length, a specific forwarding path without faults is obtained;
and when the third BFD keep-alive message is not received within the preset detection time, judging that the communication between the local node and the opposite node is faulty.
Optionally, the third BFD keep-alive message is the message that the peer node, upon receiving the second BFD keep-alive message, returns through the forwarding path on which the second BFD keep-alive message was received, after suspending its periodic sending of the first BFD keep-alive message;
the specific forwarding path is a forwarding path through which the third BFD keep-alive message passes; after the step of obtaining a specific forwarding path without failure, the method further comprises:
recovering the periodic transmission of the first BFD keep-alive message on the specific forwarding path;
and receiving a first BFD keep-alive message periodically sent by the opposite terminal node through the specific forwarding path.
Optionally, the home node includes a home outbound interface of each forwarding path; the step of cyclically performing fault detection on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node includes:
Determining all other output interfaces except the local output interface corresponding to the opposite-end packet sending path from all the local output interfaces;
and within the preset detection duration, sending the second BFD keep-alive message to the opposite terminal node through each other output interface in a circulating way so as to perform fault detection on each other forwarding path.
Optionally, the step of sending the second BFD keep-alive message to the peer node through each other outgoing interface in a cycle within the preset detection duration to perform fault detection on each other forwarding path includes:
taking one other output interface as an interface to be tested;
transmitting the second BFD keep-alive message to the opposite terminal node through the interface to be detected within the preset detection duration so as to perform fault detection on a forwarding path corresponding to the interface to be detected;
judging whether the third BFD keep-alive message is received from the interface to be tested within the preset period; the third BFD keep-alive message is the message that the peer node, upon receiving the second BFD keep-alive message, returns through the forwarding path on which the second BFD keep-alive message was received, after suspending its periodic sending of the first BFD keep-alive message;
If the third BFD keep-alive message is received from the interface to be tested in the preset period, judging that the forwarding path corresponding to the interface to be tested is a specific forwarding path without faults;
if the third BFD keep-alive message is not received from the interface to be detected within the preset period, judging whether the cycle detection time consumption exceeds the preset detection time length;
if the cycle detection time exceeds the preset detection time, judging that the communication between the local end node and the opposite end node is faulty;
and if the cycle detection time consumption does not exceed the preset detection time length, taking the next other output interface as the interface to be detected, and returning to execute the step of sending the second BFD keep-alive message to the opposite terminal node through the interface to be detected so as to perform fault detection on the forwarding path corresponding to the interface to be detected until the forwarding path corresponding to the interface to be detected is judged to be a specific forwarding path without faults or communication faults between the local terminal node and the opposite terminal node are judged.
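The cyclic detection steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name, the injected `send_second` and `wait_third` callables, the interface names, and the wrap-around back to the first interface after the last one are all hypothetical choices made for illustration.

```python
import time
from typing import Callable, Optional, Sequence


def probe_other_interfaces(
    other_ifaces: Sequence[str],
    send_second: Callable[[str], None],
    wait_third: Callable[[str, float], bool],
    period_s: float,
    detect_duration_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the interface whose path answered, or None for a communication fault."""
    if not other_ifaces:
        return None
    start = clock()
    index = 0
    while True:
        iface = other_ifaces[index % len(other_ifaces)]  # interface to be tested
        send_second(iface)                       # send the second BFD keep-alive message
        if wait_third(iface, period_s):          # third message received within one period?
            return iface                         # fault-free specific forwarding path
        if clock() - start > detect_duration_s:  # cyclic detection exceeded the duration
            return None                          # communication fault between the nodes
        index += 1                               # take the next other outbound interface


class FakeClock:
    """Hypothetical manual clock so the example runs without real waiting."""
    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
sent = []


def fake_wait(iface: str, period_s: float) -> bool:
    clock.now += period_s   # simulate waiting one preset period
    return iface == "if3"   # assumed: only the path behind if3 is healthy


found = probe_other_interfaces(["if2", "if3"], sent.append, fake_wait, 0.1, 1.0, clock)
print(found, sent)
```

Passing the send, wait, and clock functions in as parameters keeps the loop independent of any real socket code, so the timeout branch can be exercised with a fake clock.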
Optionally, the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message each include a diagnostic field;
In the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message, the diagnostic field is 0, a preset interface change identifier, and a preset interface change response identifier, respectively;
the diagnostic field in the first BFD keep-alive message indicates to the message receiving end that the forwarding path traversed by the first BFD keep-alive message has no fault;
the diagnosis field in the second BFD keep-alive message is configured to indicate that a path fault exists in an opposite packet sending path of the opposite node, and the opposite packet sending path needs to be adjusted to a forwarding path through which the second BFD keep-alive message passes;
the diagnostic field in the third BFD keep-alive message is configured to inform the home node that the peer node has received the second BFD keep-alive message, and instruct the home node to adjust the home packet sending path to a forwarding path traversed by the third BFD keep-alive message.
In a second aspect, the present invention provides a fault detection device, applied to a home node, where the home node is communicatively connected to a peer node and multiple forwarding paths exist; the device comprises:
a periodic transceiver module for:
periodically sending a first BFD keep-alive message to the opposite terminal node through a local terminal packet sending path;
When a first BFD keep-alive message sent by the opposite terminal node through an opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending the periodic sending of the first BFD keep-alive message; the local packet sending path and the opposite packet sending path are any forwarding path;
the circulation detection module is used for:
cyclically performing fault detection on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node;
when a third BFD keep-alive message of the opposite terminal node is received within a preset detection time length, a specific forwarding path without faults is obtained;
and when the third BFD keep-alive message is not received within the preset detection time, judging that the communication between the local node and the opposite node is faulty.
Optionally, the third BFD keep-alive message is the message that the peer node, upon receiving the second BFD keep-alive message, returns through the forwarding path on which the second BFD keep-alive message was received, after suspending its periodic sending of the first BFD keep-alive message;
the specific forwarding path is a forwarding path through which the third BFD keep-alive message passes; after the loop detection module obtains a specific forwarding path without failure, the periodic transceiver module is further configured to:
Recovering the periodic transmission of the first BFD keep-alive message on the specific forwarding path;
and receiving a first BFD keep-alive message periodically sent by the opposite terminal node through the specific forwarding path.
In a third aspect, the present invention provides a home node comprising a processor and a memory, the memory storing a software program that, when executed by the home node, performs the fault detection method of any of the preceding embodiments.
In a fourth aspect, the present invention provides a detection system, where the detection system includes a peer node and the home node according to the foregoing embodiment, and the home node and the peer node are communicatively connected through multiple forwarding paths.
In a fifth aspect, the present invention provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the fault detection method of any of the preceding embodiments.
Compared with the prior art, the embodiments of the invention provide a fault detection method and device, a home node, a detection system, and a storage medium. When the home node determines that the opposite-end packet sending path has a path fault, it suspends periodic sending of the first BFD keep-alive message and cyclically performs fault detection on the other forwarding paths by sending a second BFD keep-alive message to the peer node. If a third BFD keep-alive message is received, a specific fault-free forwarding path is determined; if no third BFD keep-alive message is received within a preset detection duration, a communication fault between the home node and the peer node is determined. The specific forwarding path can thus be found promptly, or the inter-node communication fault confirmed once the preset detection duration has elapsed. This effectively avoids fault misjudgment caused by untimely route convergence, prevents the resulting suspension of communication between the nodes, and improves user experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic topology diagram of a load routing scenario.
Fig. 2 is a schematic structural diagram of a detection system according to an embodiment of the present invention.
Fig. 3 is a first schematic flowchart of a fault detection method according to an embodiment of the present invention.
Fig. 4 is a second schematic flowchart of a fault detection method according to an embodiment of the present invention.
Fig. 5 is a third schematic flowchart of a fault detection method according to an embodiment of the present invention.
Fig. 6 is a fourth schematic flowchart of a fault detection method according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of a detection scenario between nodes according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a fault detection device according to an embodiment of the present invention.
Fig. 9 is a schematic structural diagram of a home node according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
Here, first, the keywords or key terms related to the present invention will be described:
1. Load routing: the routing table of a node A contains multiple next-hop nodes for routes whose destination address is another node B. The route between node A and node B is then a load route, that is, more than one path exists between node A and node B. For example, FIG. 1 shows six router nodes (node A, node B, and R1 to R4). For node A, the next-hop nodes toward node B are R1, R2, and R3; for node B, the next-hop nodes toward node A are R1, R2, and R4. There are three paths between node A and node B. This is only an example and is not limiting.
2. Route convergence: the process in which, after the network topology changes, the routing tables on the nodes are re-established, advertised, and learned until they are stable, and all relevant nodes in the network are informed of the change.
With continued reference to FIG. 1, three paths exist between node A and node B: S1 (node A-R1-node B), S2 (node A-R2-node B), and S3 (node A-R3-R4-node B). In the scenario of FIG. 1, BFD detection performed between node A and node B is multi-hop BFD detection.
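To make the load-routing example concrete, the FIG. 1 topology can be written down as plain data. The dictionary layout and the helper name below are hypothetical; only the node names and the three paths come from the description.

```python
# Hypothetical encoding of the FIG. 1 topology: each path is the ordered list of nodes.
PATHS = {
    "S1": ["A", "R1", "B"],
    "S2": ["A", "R2", "B"],
    "S3": ["A", "R3", "R4", "B"],
}


def next_hops(paths, src):
    """Next-hop nodes seen from endpoint `src`, one per path."""
    hops = []
    for nodes in paths.values():
        # Read the path in the direction that starts at `src`.
        ordered = nodes if nodes[0] == src else list(reversed(nodes))
        hops.append(ordered[1])
    return hops


print(next_hops(PATHS, "A"))  # ['R1', 'R2', 'R3']
print(next_hops(PATHS, "B"))  # ['R1', 'R2', 'R4']

# Every path has at least one intermediate node, so detection here is multi-hop.
print(all(len(nodes) > 2 for nodes in PATHS.values()))  # True
```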
Referring to FIG. 1, in the prior art, multi-hop BFD detection between node A and node B requires creating a BFD session and then using one of the paths for fault detection. Assume that node A and node B periodically send BFD messages over path S1.
When router node R1 on path S1 fails, interface A1 of node A oscillates, or interface B1 of node B oscillates, path S1 becomes faulty. If the route converges in time, the BFD session can select a new path (for example, S2) for BFD detection. However, if the route does not converge in time, the fault on path S1 prevents node A and node B from receiving each other's BFD messages; the BFD session then directly determines a communication fault between node A and node B, and communication between the nodes stops.
In reality, however, two other paths S2 and S3 exist between node A and node B, and they are likely to be fault-free. When the route does not converge in time and S2 or S3 has no fault, the determination of the BFD session is a fault misjudgment. In addition, when route convergence is not timely, the communication suspension caused by the misjudgment lasts too long and easily affects user experience.
In view of this, an embodiment of the present invention provides a fault detection method, which can use a second BFD keep-alive packet to perform fault detection on other forwarding paths in a cyclic manner, so that a specific forwarding path without a fault is determined when a third BFD keep-alive packet is received, or an inter-node communication fault is determined when the third BFD keep-alive packet is not received within a preset detection duration, so that the specific forwarding path can be found out in time, or the inter-node communication fault is determined after the preset detection duration, and fault misjudgment caused by route convergence not in time can be effectively avoided, further, communication suspension between nodes caused by fault misjudgment is avoided, and user experience is improved. The following detailed description is made by way of example with reference to the accompanying drawings.
Here, an application scenario of the present invention will be described.
Referring to fig. 2, the present invention provides a detection system 100, where the detection system 100 includes a home node 110 and a peer node 120, and the home node 110 and the peer node 120 are communicatively connected and three forwarding paths (forwarding paths 1 to 3) exist.
Wherein there may be no intermediate node or at least one intermediate node in one forwarding path. The home end node 110 and the correspondent end node 120 may be network devices such as switches, routers, and the like. The number of forwarding paths between the home node 110 and the peer node 120 is based on the actual application, and the number of forwarding paths shown in fig. 2 is merely an example and is not limited herein.
The fault detection method provided by the embodiment of the present invention may be applied to the home terminal node 110 and the peer terminal node 120 in the detection system 100, and the fault detection method is described in detail below with the home terminal node 110 as an execution body.
Referring to fig. 3, fig. 3 is a flow chart of a fault detection method according to an embodiment of the present invention, where the fault detection includes the following steps S101 to S105:
S101, periodically sending a first BFD keep-alive message to the peer node through a local packet sending path.
In this embodiment, the home node may send the first BFD keep-alive message to the peer node through a preselected local packet sending path once every preset period. Similarly, the peer node may send the first BFD keep-alive message to the home node through a preselected opposite-end packet sending path once every preset period. The local packet sending path may be any forwarding path, and the opposite-end packet sending path may be any forwarding path except the local packet sending path.
S102, when the first BFD keep-alive message sent by the opposite terminal node through the opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending the periodic sending of the first BFD keep-alive message.
S103, the fault detection is circularly carried out on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node.
In this embodiment, if the local end node does not receive the first BFD keep-alive message sent by the opposite end node through the opposite end packet sending path within a preset period, it may be considered that a path failure exists in the opposite end packet sending path, and at this time, periodic sending of the first BFD keep-alive message to the opposite end node is suspended, and fault detection is performed on other forwarding paths except for the opposite end packet sending path in a cyclic manner by sending the second BFD keep-alive message to the opposite end node.
This is because, when the home node does not receive the first BFD keep-alive message of the peer node within the preset period, it can only conclude that the opposite-end packet sending path is faulty. Whether the remaining forwarding paths have path faults is unknown, so they need to be detected.
And S104, when the third BFD keep-alive message of the opposite terminal node is received within the preset detection time, obtaining a specific forwarding path without faults.
And S105, when the third BFD keep-alive message is not received within the preset detection time, judging the communication fault between the local node and the opposite node.
In this embodiment, for the peer node, when it receives the second BFD keep-alive message, it may suspend periodic sending of the first BFD keep-alive message and return a third BFD keep-alive message through a forwarding path of the second BFD keep-alive message.
For the home node, when it receives the third BFD keep-alive message of the peer node within the preset detection duration, it can determine that the forwarding path traversed by the third BFD keep-alive message is a specific forwarding path without faults. When the home node does not receive the third BFD keep-alive message within the preset detection duration, this indicates that the peer node did not receive the second BFD keep-alive message within that duration, and the home node can determine a communication fault between the home node and the peer node.
According to the fault detection method provided by the embodiment of the invention, when the home node determines that the opposite-end packet sending path has a path fault, it suspends periodic sending of the first BFD keep-alive message and cyclically performs fault detection on the other forwarding paths by sending the second BFD keep-alive message to the peer node. A specific fault-free forwarding path is determined when the third BFD keep-alive message is received, or a communication fault between the home node and the peer node is determined when the third BFD keep-alive message is not received within the preset detection duration. The specific forwarding path can thus be found promptly, or the inter-node communication fault confirmed once the preset detection duration has elapsed, which effectively avoids fault misjudgment caused by untimely route convergence, prevents the resulting suspension of communication between the nodes, and improves user experience.
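One way to picture steps S101 to S105 from the home node's side is as a small state machine. The sketch below is an illustrative reading only: the state names, the class, and the event-method names are hypothetical, and returning to periodic sending on the discovered path reflects the optional behaviour of the method rather than a required step.

```python
from enum import Enum


class State(Enum):
    PERIODIC = "sending the first message periodically"
    PROBING = "probing other paths with the second message"
    COMM_FAULT = "communication fault determined"


class HomeNodeSketch:
    """Hypothetical home-node state machine driven by three events."""

    def __init__(self, send_path: str) -> None:
        self.state = State.PERIODIC
        self.send_path = send_path  # local packet sending path

    def on_first_timeout(self) -> None:
        # S102/S103: the peer's first message was missed for one preset period,
        # so periodic sending is suspended and probing begins.
        if self.state is State.PERIODIC:
            self.state = State.PROBING

    def on_third_received(self, path: str) -> None:
        # S104: the path that carried the third message is fault-free.
        if self.state is State.PROBING:
            self.send_path = path
            self.state = State.PERIODIC

    def on_detection_expired(self) -> None:
        # S105: no third message within the preset detection duration.
        if self.state is State.PROBING:
            self.state = State.COMM_FAULT


node = HomeNodeSketch(send_path="path1")
node.on_first_timeout()
node.on_third_received("path2")
print(node.state.name, node.send_path)  # PERIODIC path2
```

The key difference from a plain BFD session is the intermediate PROBING state: a missed first message no longer leads straight to a communication-fault verdict.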
In an alternative implementation manner, after obtaining the specific forwarding path without failure, both the home node and the peer node may resume periodic sending of the first BFD keep-alive packet on the specific forwarding path. Correspondingly, referring to fig. 4, after the step S104, the method may further include:
S106, recovering the periodic transmission of the first BFD keep-alive message on the specific forwarding path;
S107, receiving the first BFD keep-alive message periodically sent by the peer node through the specific forwarding path.
In this embodiment, after determining that the path fault exists in the opposite-end packet sending path, the home node may detect other forwarding paths except the opposite-end packet sending path by using the second BFD keep-alive packet, and after determining that the specific forwarding path has no fault, both the home node and the opposite-end node may resume periodic sending of the first BFD keep-alive packet on the specific forwarding path. Therefore, the fault misjudgment caused by untimely route convergence can be effectively avoided, and the normal communication between the local end node and the opposite end node is ensured before the route convergence.
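The peer node's side of this exchange can be sketched in the same spirit. Everything here is an assumption layered on the description: the class and method names, the interface names, the Diag values 9 and 10 standing in for the two preset identifiers, and in particular the choice to resume periodic sending immediately after the third message is returned, which the text does not pin down.

```python
DIAG_FIRST = 0    # no diagnostic information
DIAG_SECOND = 9   # assumed value for the preset interface change identifier
DIAG_THIRD = 10   # assumed value for the preset interface change response identifier


class PeerNodeSketch:
    """Hypothetical peer-node handler; `transmit(iface, diag)` is an injected callable."""

    def __init__(self, send_iface, transmit):
        self.send_iface = send_iface   # interface of the opposite-end packet sending path
        self.periodic_enabled = True
        self.transmit = transmit

    def on_message(self, rx_iface, diag):
        if diag != DIAG_SECOND:
            return
        self.periodic_enabled = False        # suspend periodic sending of the first message
        self.transmit(rx_iface, DIAG_THIRD)  # return the third message on the receiving path
        self.send_iface = rx_iface           # adopt that path for future sending
        self.periodic_enabled = True         # assumption: resume right away on the new path

    def tick(self):
        # Called once per preset period.
        if self.periodic_enabled:
            self.transmit(self.send_iface, DIAG_FIRST)


log = []
peer = PeerNodeSketch("b1", lambda iface, diag: log.append((iface, diag)))
peer.tick()
peer.on_message("b2", DIAG_SECOND)
peer.tick()
print(log)  # [('b1', 0), ('b2', 10), ('b2', 0)]
```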
The preset period is the transmission period of the first BFD keep-alive message; for example, it may be set to 100 ms or 200 ms. This is only an example: the preset period is set according to the actual application and is not limited herein.
In an alternative implementation manner, the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message all adopt the unified message format specified in RFC 5880. This format includes a diagnostic field (Diag) that occupies 5 bits and can therefore express field values 0 to 31. As listed in the following table, field values 0 to 8 each have a defined meaning, and field values 9 to 31 are reserved.
Diag field value    Description
0                   No diagnostic information (No Diagnostic)
1                   Control detection timeout (Control Detection Time Expired)
2                   Echo failure (Echo Function Failed)
3                   Neighbor advertisement session down (Neighbor Signaled Session Down)
4                   Forwarding plane restart (Forwarding Plane Reset)
5                   Channel failure (Path Down)
6                   Failure of connecting channel (Concatenated Path Down)
7                   Management down (Administratively Down)
8                   Reverse link down (Reverse Concatenated Path Down)
9-31                Reserved value (Reserved for future use)
In this embodiment, the diagnostic field values of the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message are, respectively, 0, a preset interface change identifier, and a preset interface change response identifier. The preset interface change identifier and the preset interface change response identifier may be any two of the 23 reserved values 9 to 31. For example, they may be set to 9 and 10, respectively; this is only an example and is not limited herein.
Accordingly, the diagnostic field in the first BFD keep-alive message indicates to the message receiving end that the forwarding path traversed by the first BFD keep-alive message has no fault. The diagnostic field in the second BFD keep-alive message indicates to the peer node that a path fault exists on its peer packet sending path, so that the peer node can adjust the peer packet sending path to the forwarding path traversed by the second BFD keep-alive message. The diagnostic field in the third BFD keep-alive message informs the home node that the peer node has received the second BFD keep-alive message, and indicates that the home node can adjust the home packet sending path to the forwarding path traversed by the third BFD keep-alive message.
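The use of the diagnostic field can be illustrated with a short sketch. It assumes the BFD control packet layout of RFC 5880, in which the first byte carries a 3-bit version followed by the 5-bit Diag field, and it assumes the example values 9 and 10 for the two preset identifiers. The names `Diag`, `pack_vers_diag`, and `classify` are hypothetical and are not part of the embodiment.

```python
from enum import IntEnum


class Diag(IntEnum):
    NO_DIAGNOSTIC = 0
    CONTROL_DETECTION_TIME_EXPIRED = 1
    ECHO_FUNCTION_FAILED = 2
    NEIGHBOR_SIGNALED_SESSION_DOWN = 3
    FORWARDING_PLANE_RESET = 4
    PATH_DOWN = 5
    CONCATENATED_PATH_DOWN = 6
    ADMINISTRATIVELY_DOWN = 7
    REVERSE_CONCATENATED_PATH_DOWN = 8
    # Assumed choices from the reserved range 9-31 (the example values in the text).
    INTERFACE_CHANGE = 9
    INTERFACE_CHANGE_RESPONSE = 10


BFD_VERSION = 1  # protocol version defined by RFC 5880


def pack_vers_diag(diag: int, version: int = BFD_VERSION) -> int:
    """Build the first byte of a BFD control packet: 3-bit version, 5-bit Diag."""
    if not 0 <= diag <= 31:
        raise ValueError("Diag must fit in 5 bits")
    if not 0 <= version <= 7:
        raise ValueError("version must fit in 3 bits")
    return (version << 5) | diag


def unpack_diag(first_byte: int) -> int:
    """Extract the 5-bit Diag field from the first byte."""
    return first_byte & 0x1F


def classify(first_byte: int) -> str:
    """Map a received packet to the message roles used in this embodiment."""
    diag = unpack_diag(first_byte)
    if diag == Diag.NO_DIAGNOSTIC:
        return "first"
    if diag == Diag.INTERFACE_CHANGE:
        return "second"
    if diag == Diag.INTERFACE_CHANGE_RESPONSE:
        return "third"
    return "other"
```

Because only the Diag value differs between the three messages, a receiver can tell them apart without any change to the packet format.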
In an optional implementation, the home node may include a home outgoing interface corresponding to each forwarding path, and the peer node may include a peer outgoing interface corresponding to each forwarding path. The path fault on the peer packet sending path may be caused by, for example, the failure of an intermediate node on the peer packet sending path, or flapping of the home outgoing interface or the peer outgoing interface at either end of the peer packet sending path.
Therefore, when the first BFD keep-alive message of the peer node is not received within the preset period, the home node needs to send the second BFD keep-alive message cyclically through all of the home outgoing interfaces other than the most recently recorded packet receiving interface, so as to detect the forwarding paths other than the peer packet sending path.
The detection process is described in detail below.
Optionally, referring to fig. 5, the substep of step S103 may include S1031 to S1032.
S1031, determining all other outgoing interfaces except the local outgoing interface corresponding to the opposite-end packet sending path from all the local outgoing interfaces.
Optionally, the home node may include a route management module and a BFD module, where the route management module may be responsible for route-related management work, and the BFD module may be responsible for fault detection work between the home node and the peer node.
When the BFD module creates the BFD session with the peer node, it may subscribe to routes from the route management module, so that the route management module synchronizes to the BFD module the home outgoing interfaces of all routes whose destination address is the peer node. In this way, the BFD module knows the home outgoing interface of each forwarding path between the home node and the peer node.
Optionally, each time the home node or the peer node receives the first BFD keep-alive message from the other end, it records the packet receiving interface of that message. For the home node, this packet receiving interface is the home outgoing interface corresponding to the peer packet sending path. In this embodiment, the BFD module may therefore determine, from all the home outgoing interfaces, all other outgoing interfaces except the packet receiving interface.
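The interaction between the two modules and the selection in S1031 can be sketched as follows. This is an illustration only: `RouteManager`, `BfdSession`, and their methods are hypothetical names, and the callback-based subscription is one assumed way of synchronizing the outgoing interfaces.

```python
from typing import Callable, Dict, List, Optional


class RouteManager:
    """Hypothetical route management module with a per-peer subscription."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[str]] = {}
        self._subs: Dict[str, List[Callable[[List[str]], None]]] = {}

    def subscribe(self, peer: str, callback: Callable[[List[str]], None]) -> None:
        self._subs.setdefault(peer, []).append(callback)
        # Synchronize the currently known outgoing interfaces immediately.
        callback(list(self._routes.get(peer, [])))

    def set_routes(self, peer: str, out_ifaces: List[str]) -> None:
        self._routes[peer] = list(out_ifaces)
        for callback in self._subs.get(peer, []):
            callback(list(out_ifaces))


class BfdSession:
    """Hypothetical BFD module state for one peer node."""

    def __init__(self, peer: str, route_manager: RouteManager) -> None:
        self.peer = peer
        self.out_ifaces: List[str] = []
        self.last_rx_iface: Optional[str] = None
        route_manager.subscribe(peer, self._on_routes)

    def _on_routes(self, out_ifaces: List[str]) -> None:
        self.out_ifaces = out_ifaces

    def record_rx(self, iface: str) -> None:
        # Called whenever a first BFD keep-alive message arrives on iface.
        self.last_rx_iface = iface

    def other_ifaces(self) -> List[str]:
        # S1031: every home outgoing interface except the packet receiving interface.
        return [i for i in self.out_ifaces if i != self.last_rx_iface]
```

With outgoing interfaces A1 to A4 and A4 recorded as the packet receiving interface, `other_ifaces()` yields A1, A2, and A3, which are the candidates probed in S1032.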
S1032, in the preset detection time period, the second BFD keep-alive message is sent to the opposite terminal node through each other output interface in a circulating way so as to perform fault detection on each other forwarding path.
In this embodiment, the BFD module may send the second BFD keep-alive packet to the peer node through each other outgoing interface in a cyclic manner, so as to implement fault cyclic detection on each other forwarding path.
Alternatively, referring to fig. 6 on the basis of fig. 5, the substeps of step S1032 may include S01 to S07.
S01, taking one other output interface as an interface to be tested.
S02, sending a second BFD keep-alive message to the opposite terminal node through the interface to be tested within a preset detection duration so as to perform fault detection on the forwarding path corresponding to the interface to be tested.
S03, judging whether a third BFD keep-alive message is received from the interface to be tested in a preset period.
In this embodiment, the preset period is timed from the sending of the second BFD keep-alive message. If a third BFD keep-alive message is received from the interface to be tested within this preset period, the following step S04 is executed; if no third BFD keep-alive message is received from the interface to be tested within this preset period, the following step S05 is executed.
S04, judging that the forwarding path corresponding to the interface to be tested is a specific forwarding path without faults.
S05, judging whether the cycle detection time exceeds the preset detection time.
In this embodiment, the cycle detection time may be counted from the first time the home node sends the second BFD keep-alive message. If the cycle detection time exceeds the preset detection time, executing the following step S06; if the cycle detection time does not exceed the preset detection time, executing the following step S07, and then returning to execute the step S02 until the forwarding path corresponding to the interface to be detected is determined to be a specific forwarding path without a fault or the communication fault between the home node and the opposite node is determined.
S06, judging a communication fault between the local end node and the opposite end node.
S07, taking the next other output interface as an interface to be tested.
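Steps S01 to S07 can be sketched as a single loop. This is a minimal sketch under stated assumptions: the function name `cycle_detect` is hypothetical, and sending, waiting, and the clock are passed in as callables so that the loop itself stays independent of any real socket or timer API.

```python
import itertools
from typing import Callable, Optional, Sequence


def cycle_detect(
    other_ifaces: Sequence[str],
    send_second: Callable[[str], None],
    wait_third: Callable[[str, float], bool],
    clock: Callable[[], float],
    detect_duration_s: float,
    period_s: float,
) -> Optional[str]:
    """Return the interface of a fault-free path, or None on communication fault."""
    if not other_ifaces:
        return None  # nothing to probe; treated here as a communication fault
    start = None
    # S01 / S07: take the other outgoing interfaces one after another, cyclically.
    for iface in itertools.cycle(other_ifaces):
        send_second(iface)                       # S02: probe this path
        if start is None:
            start = clock()                      # timing starts at the first send
        if wait_third(iface, period_s):          # S03: third message within the period?
            return iface                         # S04: fault-free specific path found
        if clock() - start > detect_duration_s:  # S05: detection duration exceeded?
            return None                          # S06: communication fault
    return None  # not reached; itertools.cycle never ends on a non-empty sequence
```

A caller would resume periodic sending of the first BFD keep-alive message on the returned interface, or declare the session down when `None` is returned.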
For ease of understanding, an example of cyclic fault detection is given below.
With reference to fig. 7, it is assumed that 4 forwarding paths (S1 to S4) exist between a home node a (hereinafter referred to as node a) and a peer node B (hereinafter referred to as node B), the home outgoing interfaces of the node a are A1 to A4, respectively, and the peer outgoing interfaces of the node B are B1 to B4, respectively.
Assume that node A periodically sends the first BFD keep-alive message to node B through home outgoing interface A1, so the home packet sending path is S1, and the packet receiving interface that node B records each time it receives the first BFD keep-alive message is B1. Assume that node B periodically sends the first BFD keep-alive message to node A through peer outgoing interface B4, so the peer packet sending path is S4, and the packet receiving interface that node A records each time it receives the first BFD keep-alive message is A4.
For node A, if the first BFD keep-alive message of node B is not received on A4 within the preset period, timed from the last reception of the first BFD keep-alive message, this indicates that a path fault exists on S4 (A4 may be flapping, B4 may be flapping, or an intermediate node on S4 may be down). Whether S1 to S3 are faulty is unknown at this point, so node A suspends sending the first BFD keep-alive message from A1 and cyclically performs fault detection on S1 to S3 within the preset detection duration T1:
(1) Sending a second BFD keep-alive message to the opposite terminal node through A1;
(2) Judging whether a third BFD keep-alive message is received from A1 in a preset period T2;
(3) If a third BFD keep-alive message is received from A1 within T2, the peer node has received the second BFD keep-alive message on B1, has suspended sending the first BFD keep-alive message, and has returned the third BFD keep-alive message through B1. This indicates that S1 has no fault, and node A and node B may continue to send the first BFD keep-alive message periodically through A1 and B1, respectively.
(4) If the third BFD keep-alive message is not received from A1 in T2, then node A sends the second BFD keep-alive message to the opposite terminal node through A2, and judges whether the third BFD keep-alive message is received from A2 in T2;
(5) If a third BFD keep-alive message is received from A2 within T2, then by the same principle S2 has no fault, and node A and node B may continue to send the first BFD keep-alive message periodically through A2 and B2, respectively;
(6) If the third BFD keep-alive message is not received from A2 in T2, then the node A sends the second BFD keep-alive message to the opposite terminal node through A3, and judges whether the third BFD keep-alive message is received from A3 in T2;
(7) If a third BFD keep-alive message is received from A3 within T2, then by the same principle S3 has no fault, and node A and node B may continue to send the first BFD keep-alive message periodically through A3 and B3, respectively;
(8) If the third BFD keep-alive message is not received from A3 within T2, return to step (1). The cycle ends either when the cycle detection time exceeds T1, in which case a communication fault between node A and node B is determined, or when the third BFD keep-alive message is received from one of the home outgoing interfaces A1 to A3, in which case the forwarding path corresponding to that interface is taken as the fault-free specific forwarding path.
It should be noted that the preset detection duration T1 is greater than the preset period T2, and the value of T1 depends on the actual situation. The above examples are merely examples and are not intended to be limiting.
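The behaviour of node B in the example above can be sketched as follows. The class `PeerResponder` and the Diag values 9 and 10 are assumptions made for illustration, and the point at which the peer resumes periodic sending (immediately after replying) is also an assumption, since the text only states that both nodes continue on the verified path.

```python
from typing import Callable

DIAG_NONE = 0                        # first BFD keep-alive message
DIAG_INTERFACE_CHANGE = 9            # second message (assumed reserved value)
DIAG_INTERFACE_CHANGE_RESPONSE = 10  # third message (assumed reserved value)


class PeerResponder:
    """Hypothetical peer-side handling of the second BFD keep-alive message."""

    def __init__(self, tx_iface: str, send: Callable[[str, int], None]) -> None:
        self.tx_iface = tx_iface      # interface used for periodic sending
        self.periodic_enabled = True  # whether the first message is being sent
        self._send = send             # send(iface, diag), supplied by the caller

    def on_packet(self, rx_iface: str, diag: int) -> None:
        if diag != DIAG_INTERFACE_CHANGE:
            return  # ordinary keep-alives need no special handling in this sketch
        # Suspend periodic sending of the first message on the faulty path.
        self.periodic_enabled = False
        # Return the third message over the path the second message arrived on.
        self._send(rx_iface, DIAG_INTERFACE_CHANGE_RESPONSE)
        # Adjust the packet sending path to the verified one and resume (assumed timing).
        self.tx_iface = rx_iface
        self.periodic_enabled = True

    def on_tick(self) -> None:
        # Called once per preset period to send the first message.
        if self.periodic_enabled:
            self._send(self.tx_iface, DIAG_NONE)
```

In the Figure 7 scenario, node B starts with B4 as its sending interface; after a second message arrives on B2 it answers on B2 and its later periodic messages also leave through B2.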
It should be noted that, in the above method embodiment, the execution sequence of each step is not limited by the drawing, and the execution sequence of each step is based on the actual application situation.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the invention uses any two reserved values of the diagnosis field as the preset interface change identification and the preset interface change response identification, so that when the opposite end packet transmitting path has a path fault, the local end node can utilize the second BFD keep-alive message to perform fault detection on other forwarding paths, and when the local end node receives the third BFD keep-alive message of the opposite end node within the preset detection time, the specific forwarding path without faults is obtained.
In the prior art, when routes do not converge in time, a fault between the nodes is misjudged directly and communication between the nodes is suspended. In contrast, when routes do not converge in time, the present invention cyclically performs fault detection on the other forwarding paths within the preset detection duration to find a fault-free specific forwarding path. This effectively avoids the suspension of inter-node communication caused by misjudgment of the BFD session, and improves user experience.
In order to perform the corresponding steps in the above method embodiments and in each possible implementation, an implementation of the fault detection device is given below.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a fault detection device according to an embodiment of the present invention. The fault detection device 200 is applied to a home node; the home node is communicatively connected to a peer node, and a plurality of forwarding paths exist between them. The fault detection device 200 includes: a periodic transceiver module 210 and a cycle detection module 220.
A period transceiver module 210 for: periodically sending a first BFD keep-alive message to a peer node through a local packet sending path; when the first BFD keep-alive message sent by the opposite terminal node through the opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending the periodic sending of the first BFD keep-alive message; the local packet sending path and the opposite packet sending path are any forwarding paths;
a cycle detection module 220 for: cyclically performing fault detection on other forwarding paths except for the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node; when a third BFD keep-alive message of the opposite terminal node is received within a preset detection time length, a specific forwarding path without faults is obtained; and when the third BFD keep-alive message is not received within the preset detection time, judging the communication fault between the local end node and the opposite end node.
Alternatively, the fault detection device 200 may be a BFD module of the home node, and the period transceiving module 210 and the cycle detection module 220 may be sub-modules of the BFD module.
Optionally, the third BFD keep-alive message is the message that the peer node, upon receiving the second BFD keep-alive message, returns over the forwarding path on which the second BFD keep-alive message was received, after suspending its periodic sending of the first BFD keep-alive message. The specific forwarding path is the forwarding path traversed by the third BFD keep-alive message. After the cycle detection module 220 obtains the fault-free specific forwarding path, the periodic transceiver module 210 is further configured to: resume periodic sending of the first BFD keep-alive message on the specific forwarding path; and receive the first BFD keep-alive message periodically sent by the peer node over the specific forwarding path.
Optionally, the home node includes a home outbound interface of each forwarding path; the loop detection module 220 is configured to send the second BFD keep-alive message to the peer node to perform fault detection on other forwarding paths except the peer packet sending path in a loop, and may specifically be configured to: all other output interfaces except the local output interface corresponding to the opposite-end packet sending path are determined from all the local output interfaces; and within a preset detection duration, sending a second BFD keep-alive message to the opposite terminal node through each other output interface in a circulating way so as to perform fault detection on each other forwarding path.
Optionally, when sending the second BFD keep-alive message to the peer node cyclically through each other outgoing interface within the preset detection duration so as to perform fault detection on each other forwarding path, the cycle detection module 220 may specifically be configured to: take one other outgoing interface as the interface to be tested; send the second BFD keep-alive message to the peer node through the interface to be tested within the preset detection duration, so as to perform fault detection on the forwarding path corresponding to the interface to be tested; determine whether the third BFD keep-alive message is received from the interface to be tested within the preset period, where the third BFD keep-alive message is returned by the peer node, which suspends periodic sending of the first BFD keep-alive message upon receiving the second BFD keep-alive message, over the forwarding path on which the second BFD keep-alive message was received; if the third BFD keep-alive message is received from the interface to be tested within the preset period, determine that the forwarding path corresponding to the interface to be tested is a fault-free specific forwarding path; if the third BFD keep-alive message is not received from the interface to be tested within the preset period, determine whether the cycle detection time exceeds the preset detection duration; if the cycle detection time exceeds the preset detection duration, determine that communication between the home node and the peer node has failed; if the cycle detection time does not exceed the preset detection duration, take the next other outgoing interface as the interface to be tested and return to the step of sending the second BFD keep-alive message to the peer node through the interface to be tested, until the forwarding path corresponding to the interface to be tested is determined to be a fault-free specific forwarding path or a communication fault between the home node and the peer node is determined.
Optionally, the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message each include a diagnostic field. In the first BFD keep-alive message, the second BFD keep-alive message and the third BFD keep-alive message, the diagnosis fields are respectively 0, a preset interface change identifier and a preset interface change response identifier. The diagnosis field in the first BFD keep-alive message is used for indicating that a forwarding path through which the first BFD keep-alive message passes by the message receiving end has no fault; the diagnosis field in the second BFD keep-alive message is used to indicate that a path fault exists in an opposite packet sending path of the opposite node, and the opposite packet sending path needs to be adjusted to a forwarding path through which the second BFD keep-alive message passes; the diagnostic field in the third BFD keep-alive message is configured to inform the home node that the peer node has received the second BFD keep-alive message, and instruct the home node to adjust the home packet sending path to a forwarding path traversed by the third BFD keep-alive message.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the fault detection apparatus 200 described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a home node according to an embodiment of the present invention. The home node 300 comprises a processor 310, a memory 320 and a bus 330, the processor 310 being connected to the memory 320 by the bus 330.
The memory 320 may be used to store software programs, for example, corresponding to the fault detection apparatus 200 described above. The processor 310 performs various functional applications and data processing by running software programs stored in the memory 320 to implement the fault detection method as provided by the embodiments of the present invention.
The memory 320 may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), flash memory (Flash), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), etc.
The processor 310 may be an integrated circuit chip with signal processing capabilities. The processor 310 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It is to be understood that the structure shown in fig. 9 is illustrative only, and that the home end node 300 may also include more or fewer components than shown in fig. 9, or have a different configuration than shown in fig. 9. The components shown in fig. 9 may be implemented in hardware, software, or a combination thereof.
The embodiment of the invention also provides a detection system which comprises the local end node and the opposite end node, wherein the local end node and the opposite end node are in communication connection and a plurality of forwarding paths exist.
The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the fault detection method disclosed in the above embodiments. The computer readable storage medium may be, but is not limited to, any medium capable of storing program code, such as a USB flash drive, a removable hard disk, ROM, RAM, PROM, EPROM, EEPROM, Flash, a magnetic disk, or an optical disk.
In summary, the embodiment of the invention provides a fault detection method, a device, a home node, a detection system, and a storage medium. When a path fault exists on the peer packet sending path, the home node suspends periodic sending of the first BFD keep-alive message and cyclically performs fault detection on the other forwarding paths by sending the second BFD keep-alive message to the peer node. A fault-free specific forwarding path is determined when the third BFD keep-alive message is received, and a communication fault between the home node and the peer node is determined when the third BFD keep-alive message is not received within the preset detection duration. The specific forwarding path can therefore be found in time, or the inter-node communication fault can be confirmed once the preset detection duration has elapsed. This effectively avoids fault misjudgment caused by untimely route convergence, avoids the suspension of inter-node communication that such misjudgment would cause, and improves user experience.
The present invention is not limited to the above embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. A fault detection method, characterized in that it is applied to a home node, the home node being communicatively connected to a peer node with a plurality of forwarding paths existing between them; the method comprises the following steps:
periodically sending a first BFD keep-alive message to the opposite terminal node through a local terminal packet sending path;
when a first BFD keep-alive message sent by the opposite terminal node through an opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending the periodic sending of the first BFD keep-alive message; the local packet sending path and the opposite packet sending path are each any one of the forwarding paths;
cyclically performing fault detection on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node;
When a third BFD keep-alive message of the opposite terminal node is received within a preset detection time length, a specific forwarding path without faults is obtained;
and when the third BFD keep-alive message is not received within the preset detection time, judging that the communication between the local node and the opposite node is faulty.
2. The method according to claim 1, wherein the third BFD keep-alive message is returned by the peer node, which suspends periodic sending of the first BFD keep-alive message upon receiving the second BFD keep-alive message, over the forwarding path on which the second BFD keep-alive message was received;
the specific forwarding path is a forwarding path through which the third BFD keep-alive message passes; after the step of deriving a failure-free specific forwarding path, the method further comprises:
recovering the periodic transmission of the first BFD keep-alive message on the specific forwarding path;
and receiving a first BFD keep-alive message periodically sent by the opposite terminal node through the specific forwarding path.
3. The method of claim 1, wherein the home end node comprises a home end out interface for each of the forwarding paths; the step of cyclically performing fault detection on other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node includes:
Determining all other output interfaces except the local output interface corresponding to the opposite-end packet sending path from all the local output interfaces;
and within the preset detection duration, sending the second BFD keep-alive message to the opposite terminal node through each other output interface in a circulating way so as to perform fault detection on each other forwarding path.
4. A method according to claim 3, wherein the step of cyclically sending the second BFD keep-alive messages to the correspondent node over each of the other egress interfaces for fault detection of each of the other forwarding paths within the preset detection duration comprises:
taking one other output interface as an interface to be tested;
transmitting the second BFD keep-alive message to the opposite terminal node through the interface to be detected within the preset detection duration so as to perform fault detection on a forwarding path corresponding to the interface to be detected;
judging whether the third BFD keep-alive message is received from the interface to be tested in the preset period; the third BFD keep-alive message is that when the opposite terminal node receives the second BFD keep-alive message, the periodic transmission of the first BFD keep-alive message is suspended and returned from a forwarding path for receiving the second BFD keep-alive message;
If the third BFD keep-alive message is received from the interface to be tested in the preset period, judging that the forwarding path corresponding to the interface to be tested is a specific forwarding path without faults;
if the third BFD keep-alive message is not received from the interface to be detected within the preset period, judging whether the cycle detection time consumption exceeds the preset detection time length;
if the cycle detection time exceeds the preset detection time, judging that the communication between the local end node and the opposite end node is faulty;
and if the cycle detection time consumption does not exceed the preset detection time length, taking the next other output interface as the interface to be detected, and returning to execute the step of sending the second BFD keep-alive message to the opposite terminal node through the interface to be detected so as to perform fault detection on the forwarding path corresponding to the interface to be detected until the forwarding path corresponding to the interface to be detected is judged to be a specific forwarding path without faults or communication faults between the local terminal node and the opposite terminal node are judged.
5. The method according to any one of claims 1-4, wherein the diagnostic fields of the first BFD keep-alive message, the second BFD keep-alive message, and the third BFD keep-alive message are respectively 0, a preset interface change identifier, and a preset interface change response identifier;
The diagnosis field in the first BFD keep-alive message is used for indicating that a forwarding path through which the first BFD keep-alive message passes has no fault;
the diagnosis field in the second BFD keep-alive message is configured to indicate that a path fault exists in an opposite packet sending path of the opposite node, and the opposite packet sending path needs to be adjusted to a forwarding path through which the second BFD keep-alive message passes;
the diagnostic field in the third BFD keep-alive message is configured to inform the home node that the peer node has received the second BFD keep-alive message, and instruct the home node to adjust the home packet sending path to a forwarding path traversed by the third BFD keep-alive message.
6. A fault detection device, characterized in that it is applied to a home node, the home node being communicatively connected to a peer node with a plurality of forwarding paths existing between them; the device comprises:
the periodic transceiver module is used for periodically transmitting a first BFD keep-alive message to the opposite terminal node through a local terminal packet transmission path; when a first BFD keep-alive message sent by the opposite terminal node through an opposite terminal packet sending path is not received in a preset period, judging that the opposite terminal packet sending path has a path fault and suspending periodic sending of the first BFD keep-alive message; the local packet sending path and the opposite packet sending path are each any one of the forwarding paths;
The circulation detection module is used for circularly detecting faults of other forwarding paths except the opposite-end packet sending path by sending a second BFD keep-alive message to the opposite-end node;
when a third BFD keep-alive message of the opposite terminal node is received within a preset detection time length, a specific forwarding path without faults is obtained;
and when the third BFD keep-alive message is not received within the preset detection time, judging that the communication between the local node and the opposite node is faulty.
7. The apparatus of claim 6, wherein the third BFD keep-alive message is returned by the peer node, which suspends periodic sending of the first BFD keep-alive message upon receiving the second BFD keep-alive message, over the forwarding path on which the second BFD keep-alive message was received;
the specific forwarding path is a forwarding path through which the third BFD keep-alive message passes; after the loop detection module obtains a specific forwarding path without failure, the periodic transceiver module is further configured to:
recovering the periodic transmission of the first BFD keep-alive message on the specific forwarding path;
and receiving a first BFD keep-alive message periodically sent by the opposite terminal node through the specific forwarding path.
8. A home node, comprising: a memory and a processor, the memory storing a software program which, when run by the processor, implements the fault detection method of any of claims 1-5.
9. A detection system comprising the home node as claimed in claim 8 and a peer node, the home node being communicatively connected to the peer node with a plurality of forwarding paths existing between them.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the fault detection method of any of claims 1-5.
CN202310889768.8A 2023-07-18 2023-07-18 Fault detection method and device, home node, detection system and storage medium Pending CN116886574A (en)

Publications (1)

Publication Number Publication Date
CN116886574A (en) 2023-10-13

Family

ID=88271204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310889768.8A Pending CN116886574A (en) 2023-07-18 2023-07-18 Fault detection method and device, home node, detection system and storage medium

Country Status (1)

Country Link
CN (1) CN116886574A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination