CN101938365B - Fault handling method and device for Ethernet - Google Patents


Publication number
CN101938365B
Authority
CN
China
Prior art keywords
communication node
node
information
communication
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200910088156A
Other languages
Chinese (zh)
Other versions
CN101938365A (en)
Inventor
王如亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN200910088156A
Publication of CN101938365A
Application granted
Publication of CN101938365B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Small-Scale Networks (AREA)

Abstract

The invention provides a fault handling method and device for Ethernet. The method mainly comprises: a management node receives fault information reported by the communication nodes at both ends of an Ethernet link; based on that fault information and on the heartbeat detection information between the management node and the communication nodes at both ends of the link, the management node obtains fault information for the Ethernet link and for the communication nodes at both ends. By providing a dedicated management node that monitors communication node status over a dedicated management communication channel, embodiments of the invention can effectively obtain and distinguish communication link faults and communication node faults in Ethernet, which facilitates fast fault recovery.

Figure 200910088156

Description

Fault handling method and device for Ethernet

Technical field

The invention relates to the field of Ethernet communication technology, and in particular to a fault handling method and device for Ethernet.

Background

With the continuing growth of Ethernet in metropolitan area networks (MANs) and wide area networks (WANs), operators are paying increasing attention to the OAM (Operations, Administration and Maintenance) of Ethernet equipment and links.

Existing OAM schemes focus on the management and maintenance of point-to-point Ethernet links and provide a complete Ethernet OAM solution. The OAM sublayer is an optional sublayer of the MAC (Media Access Control) layer. When enabled, it provides an effective mechanism for monitoring link operating status, and it can be applied to any full-duplex point-to-point Ethernet link or emulated point-to-point Ethernet link.

In existing OAM schemes, link faults are indicated and located through two kinds of notification: fatal event notifications and ordinary event notifications.

Fatal event notifications are carried in flag bits of OAM PDUs (Protocol Data Units) and mainly include:

Link Fault: indicates that the Rx (receiver) of the local PHY (Physical Layer Device) has failed (the signal from the peer link is lost).

Dying Gasp: an unrecoverable local error has occurred.

Critical Event: an unspecified critical fault has occurred.
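The three flags above can be illustrated with a short decoding sketch. The text here does not give bit positions, so the positions below are an assumption (bits 0 to 2 of the flags field, as commonly described for IEEE 802.3 Clause 57 OAM); the names `FatalFlags` and `decode_fatal_flags` are hypothetical.

```python
from enum import IntFlag


class FatalFlags(IntFlag):
    # Assumed bit positions; the patent text does not specify them.
    LINK_FAULT = 0x0001      # local PHY receiver has lost the peer signal
    DYING_GASP = 0x0002      # unrecoverable local error
    CRITICAL_EVENT = 0x0004  # unspecified critical fault


def decode_fatal_flags(flags_field: int) -> list[str]:
    """Return the names of the fatal-event flags set in an OAM PDU flags field."""
    return [flag.name for flag in FatalFlags if flags_field & flag]
```

For example, a flags field of `0x0005` would decode to Link Fault plus Critical Event under these assumed positions.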

Ordinary event notifications are carried in TLV (Type-Length-Value) fields of OAM PDUs and mainly include:

Errored symbol period event: the number of errored symbols detected within a given time window exceeds a defined threshold.

Errored frame event: the number of errored frames detected within a given time window exceeds a defined threshold.

Errored frame period event: the number of errored frames detected within a given number of received frames exceeds a defined threshold.

Errored frame seconds event: the number of errored frame seconds detected within a given number of seconds exceeds a defined threshold. An errored frame second is a one-second interval in which one or more errored frames are detected among the frames received.
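The errored frame seconds definition is the least obvious of the four, so a minimal sketch may help. It assumes the receiver already keeps one errored-frame count per one-second interval; the function names and the input layout are hypothetical.

```python
def errored_frame_seconds(errors_per_second: list[int]) -> int:
    """Count the one-second intervals that contained at least one errored frame."""
    return sum(1 for count in errors_per_second if count > 0)


def errored_frame_seconds_event(errors_per_second: list[int], threshold: int) -> bool:
    """True when the errored-second count over the window exceeds the threshold."""
    return errored_frame_seconds(errors_per_second) > threshold
```

Note that a second with five errored frames counts the same as a second with one: the event measures how many seconds were affected, not how many frames.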

The fatal and ordinary event notifications above mainly monitor error events in the receive direction. In addition, if the local receiving component or device suffers an unrecoverable error, it makes a best-effort attempt to notify the peer.

In the course of implementing the invention, the inventor found that the prior-art scheme of indicating and locating link faults through fatal and ordinary event notifications has at least the following problems:

Although the scheme can detect receiver faults on a communication node and communication link faults, it does not specify how a detected fault is to be reported to the user. It also cannot identify a fault of the communication node itself. A fault of the communication node itself means that core software or hardware inside the node has failed, so that the whole node cannot operate normally. By contrast, when only a receiver, transmitter, receive link or transmit link of the node fails, the remaining parts of the node that have not failed can still work normally.

Summary of the invention

Embodiments of the invention provide a fault handling method and device for Ethernet.

A fault handling method for Ethernet provided by an embodiment of the invention comprises:

a management node receives fault information reported by a communication node at an endpoint of an Ethernet link;

the management node obtains the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node.

A fault handling device for Ethernet provided by an embodiment of the invention comprises:

a fault information receiving module, configured to receive fault information reported by a communication node at an endpoint of an Ethernet link;

a fault handling module, configured to obtain the fault information of the communication node according to the fault information received by the fault information receiving module and the heartbeat detection information between the fault handling device and the communication node.

As the technical solutions above show, by providing a dedicated management node that monitors communication node status over a dedicated management communication channel, embodiments of the invention can effectively obtain and distinguish communication link faults and communication node faults in Ethernet, which facilitates fast fault recovery.

Brief description of the drawings

To explain the technical solutions of the embodiments more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the principle of configuring a management node for an Ethernet full-duplex link, provided by an embodiment of the invention;

Fig. 2 is a schematic diagram, provided by one embodiment of the invention, of a communication node reporting to the management node that its receiver has failed;

Fig. 3 is a schematic diagram, provided by one embodiment of the invention, of a communication node reporting a receive link quality fault to the management node;

Fig. 4 is a schematic diagram, provided by another embodiment of the invention, of a communication node reporting a receive link interruption fault to the management node;

Fig. 5 is a schematic diagram, provided by an embodiment of the invention, of the management node detecting that a communication node itself has failed;

Fig. 6 is a structural diagram of a specific implementation of a fault handling device for Ethernet, provided by an embodiment of the invention.

Detailed description

In embodiments of the invention, a management node is configured for an Ethernet link, and the management node receives fault information reported by the communication nodes at both ends of the link. Based on that fault information and on the heartbeat detection information between the management node and the communication nodes at both ends of the link, the management node obtains fault information for the Ethernet link and for the communication nodes at both ends.

To aid understanding, several specific embodiments are explained below with reference to the drawings. None of these embodiments limits the embodiments of the invention.

Embodiment 1

Fig. 1 shows the principle of configuring a management node for an Ethernet communication system in this embodiment. In Fig. 1, the first communication node and the second communication node are two communication nodes that have established a service communication relationship. The Ethernet full-duplex link between them can be regarded as two simplex channels, one in each direction. Each channel consists of one RX, one TX (Transmitter) and one unidirectional link.

A management node is configured for the first and second communication nodes. The management node communicates with the first communication node and the second communication node over separate dedicated management communication channels. These channels are not affected by RX/TX faults on the first or second communication node: even when a TX or RX on either node has failed, the management node can still communicate with both nodes over the dedicated management channels, and this communication includes heartbeat detection. The management node periodically performs heartbeat detection with the first communication node and with the second communication node. Based on the information the two nodes report over the dedicated management channels and on the results of its heartbeat detection with each node, the management node monitors the operating status of the two nodes and the status of the communication link between them, and, once a fault is detected, determines its specific cause.

In addition, the first and second communication nodes may also periodically perform heartbeat detection with each other, in order to detect whether the link between them has failed.
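The periodic heartbeat detection described above can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: the patent does not say how heartbeats are timed or judged, so the timeout rule, the class name `HeartbeatMonitor` and its methods are all hypothetical. Timestamps are passed in explicitly to keep the sketch deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node over its management channel."""

    def __init__(self, timeout: float):
        # Assumed rule: a heartbeat older than `timeout` seconds counts as lost.
        self.timeout = timeout
        self.last_seen: dict[str, float] = {}

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def is_normal(self, node: str, now: float) -> bool:
        """Heartbeat detection is normal if a recent enough heartbeat was seen."""
        last = self.last_seen.get(node)
        return last is not None and (now - last) <= self.timeout
```

The same class could be used by a communication node to track heartbeats from its peer over the service link, which is the second, optional heartbeat relationship mentioned above.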

In this embodiment, the first communication node detects that its own RX has failed. Fig. 2 shows the first communication node reporting the RX fault to the management node. In this case the first communication node cannot receive any data sent by the second communication node, so heartbeat detection between the first and second communication nodes fails. The first communication node therefore sends a fatal event notification, for example a Critical Event OAM PDU (critical event operations, administration and maintenance protocol data unit), to the second communication node to inform it of the RX fault.

The first communication node reports its RX fault to the management node over the dedicated management communication channel. On receiving this report, the management node checks over the dedicated management channel whether its heartbeat detection with the first communication node is normal. If it is, the management node confirms the RX fault of the first communication node, concludes that the first communication node itself has not failed, and notifies the user of the RX fault of the first communication node.

Embodiment 2

In this embodiment, the first communication node detects errored frames and errored symbols in its receive direction, while its heartbeat detection with the second communication node is normal. Fig. 3 shows the first communication node reporting the receive link quality fault to the management node. In this case the first communication node can determine that neither its own RX nor the TX of the second communication node has failed, and that the link quality in its receive direction is degraded. The first communication node therefore sends an ordinary event notification to the second communication node to inform it of the receive link quality fault.

The first communication node reports its receive link quality fault to the management node. On receiving this report, the management node checks over the dedicated management channel whether its heartbeat detection with the first communication node is normal. If it is, the management node confirms the receive link quality fault of the first communication node, concludes that the first communication node itself has not failed, and notifies the user of the receive link quality fault of the first communication node.

Embodiment 3

In this embodiment, the first communication node detects no fault in its own RX but finds that its heartbeat detection with the second communication node has failed. Fig. 4 shows the first communication node reporting the receive link interruption fault to the management node. In this case the first communication node can infer that the link in its receive direction is interrupted. Once its receive link is interrupted, the first communication node receives no notifications or data from the second communication node; by contrast, when a receive link quality fault occurs, the first communication node detects errored frames and errored symbols in its receive direction. The first communication node therefore sends a fatal event notification to the second communication node to inform it of the receive link interruption fault.

The first communication node reports its receive link interruption fault to the management node. On receiving this report, the management node checks over the dedicated management channels whether its heartbeat detection with the communication nodes is normal. If its heartbeat detection with both the first and the second communication node is normal, the management node confirms the receive link interruption fault of the first communication node, concludes that the first communication node itself has not failed, and notifies the user of the receive link interruption fault of the first communication node.

Embodiment 4

In this embodiment, the first communication node itself fails and cannot perform any normal processing. Fig. 5 shows the management node detecting that the first communication node itself has failed. In this case the second communication node finds that its heartbeat detection with the first communication node has failed (heartbeats are lost) and that it receives no notifications or data from the first communication node, while detecting no fault in its own RX. The second communication node therefore sends a fatal event notification to the first communication node, but the first communication node cannot process it.

The second communication node reports to the management node that its heartbeat detection with the first communication node has failed and that it receives no notifications or data from the first communication node. On receiving this report, the management node checks over the dedicated management channel whether its own heartbeat detection with the first communication node is normal. Because the first communication node cannot perform any normal processing at this point, the management node finds that this heartbeat detection is abnormal. The management node therefore concludes that the first communication node itself has failed and notifies the user accordingly.
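The decision logic of the four embodiments can be summarized in one sketch. It is an illustration, not the claimed implementation: the `Report` enum, the `diagnose` function and the dictionary of heartbeat results are hypothetical names, and returning `None` for combinations the embodiments do not cover is an assumption made here.

```python
from enum import Enum
from typing import Optional


class Report(Enum):
    RX_FAULT = "receiver fault"                      # Embodiment 1
    RX_LINK_QUALITY = "receive link quality fault"   # Embodiment 2
    RX_LINK_DOWN = "receive link interruption"       # Embodiment 3
    PEER_SILENT = "peer heartbeat lost"              # Embodiment 4


def diagnose(report: Report, reporter: str, peer: str,
             mgmt_heartbeat_ok: dict[str, bool]) -> Optional[tuple[str, str]]:
    """Return (faulty entity, fault description), or None if undetermined.

    mgmt_heartbeat_ok holds the management node's own heartbeat result for
    each communication node, obtained over the dedicated management channel.
    """
    reporter_ok = mgmt_heartbeat_ok.get(reporter, False)
    peer_ok = mgmt_heartbeat_ok.get(peer, False)

    if report in (Report.RX_FAULT, Report.RX_LINK_QUALITY):
        # Embodiments 1 and 2: the reporter is alive, so its report is confirmed.
        if reporter_ok:
            return (reporter, report.value)
    elif report is Report.RX_LINK_DOWN:
        # Embodiment 3: both nodes are alive, so the link itself is interrupted.
        if reporter_ok and peer_ok:
            return (reporter, report.value)
    elif report is Report.PEER_SILENT:
        # Embodiment 4: the peer is also silent on the management channel.
        if not peer_ok:
            return (peer, "node failure")
    return None  # combination not covered by the embodiments (assumption)
```

The key design point the sketch makes visible is that the management channel is independent of the service link, so a normal management heartbeat rules out a node failure and lets the management node attribute the problem to a receiver or link instead.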

An embodiment of the invention further provides a fault handling device for Ethernet. The device may be a management node in an Ethernet communication system, and it communicates with the different communication nodes in the Ethernet over separate dedicated management communication channels. These channels are not affected by RX/TX faults on a communication node: even when a TX or RX on a communication node has failed, the device can still communicate with that node over the dedicated management channel, and this communication includes heartbeat detection.

Its specific implementation structure is shown in Fig. 6 and may comprise:

a fault information receiving module 61, configured to receive fault information reported by a communication node at an endpoint of an Ethernet link;

a fault handling module 62, configured to obtain the fault information of the communication node according to the fault information received by the fault information receiving module and the heartbeat detection information between the fault handling device and the communication node at the endpoint of the Ethernet link.

The fault handling module 62 specifically comprises at least one of a first processing module, a second processing module, a third processing module and a fourth processing module, wherein:

the first processing module 621 is configured to receive receiver fault information that the communication node at one end of the Ethernet link reports about itself and, if the dedicated management communication channel shows that heartbeat detection between the device and that node is normal, to confirm the receiver fault of that node and notify the user of it;

the second processing module 622 is configured to receive receive link quality fault information that the communication node at one end of the Ethernet link reports about itself and, if the dedicated management communication channel shows that heartbeat detection between the device and that node is normal, to confirm the receive link quality fault of that node and notify the user of it;

the third processing module 623 is configured to receive, from the communication node at one end of the Ethernet link, that node's receive link interruption fault information together with information that heartbeat detection between that node and the node at the other end has failed and, if the dedicated management communication channels show that heartbeat detection between the device and the nodes at both ends is normal, to confirm the receive link interruption fault of the reporting node and notify the user of it;

the fourth processing module 624 is configured to receive, from the communication node at one end of the Ethernet link, information that heartbeat detection between that node and the node at the other end has failed and that it has received no notifications or data from the node at the other end and, if the dedicated management communication channel shows that heartbeat detection between the device and the node at the other end has also failed, to determine that the node at the other end has itself failed and notify the user of that node failure.

A person of ordinary skill in the art will understand that all or part of the processes in the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

In summary, by providing a dedicated management node that monitors communication node status over a dedicated management communication channel, embodiments of the invention can effectively obtain and distinguish communication link faults and communication node faults in Ethernet, which facilitates fast fault recovery.

Embodiments of the invention can also notify the user, through the management node, of communication link faults and communication node faults in the Ethernet.

The above are only preferred specific embodiments of the invention, and the scope of protection of the invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention falls within the scope of protection of the invention. The scope of protection of the invention is therefore determined by the scope of protection of the claims.

Claims (5)

1. A fault handling method for Ethernet, characterized in that the method comprises:

a management node receives fault information reported by a communication node at an endpoint of an Ethernet link, wherein the management node communicates with the communication node over a dedicated management channel and periodically performs heartbeat detection with the communication node;

the management node obtains the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node at the endpoint of the Ethernet link, which specifically comprises:

the management node receives receiver fault information that the communication node at one end of the Ethernet link reports about itself, and determines that heartbeat detection between the management node and the communication node at that end is normal;

the management node confirms the receiver fault of the communication node at that end and notifies the user of it;

or,

the management node receives, from the communication node at one end of the Ethernet link, information that heartbeat detection between that node and the communication node at the other end has failed and that the node at that end has received no notifications or data from the node at the other end, and determines that heartbeat detection between the management node and the communication node at the other end has failed;

the management node determines that the communication node at the other end has itself failed and notifies the user of that node failure.

2. The method according to claim 1, characterized in that the management node obtaining the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node specifically comprises:

the management node receives receive link quality fault information that the communication node at one end of the Ethernet link reports about itself, and determines that heartbeat detection between the management node and the communication node at that end is normal;

the management node confirms the receive link quality fault of the communication node at that end and notifies the user of it.

3. The method according to claim 1, characterized in that the management node obtaining the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node specifically comprises:

the management node receives, from the communication node at one end of the Ethernet link, that node's receive link interruption fault information together with information that heartbeat detection between that node and the communication node at the other end has failed, and determines that heartbeat detection between the management node and the communication nodes at both ends is normal;

the management node confirms the receive link interruption fault of the communication node at that end and notifies the user of it.

4. A fault handling device for Ethernet, characterized in that it specifically comprises:

a fault information receiving module, configured to receive fault information reported by a communication node at an endpoint of an Ethernet link;

a fault handling module, configured to obtain the fault information of the communication node according to the fault information received by the fault information receiving module and the heartbeat detection information between the fault handling device and the communication node;

the fault handling device comprises a management node in an Ethernet communication system, wherein the management node communicates with the communication node over a dedicated management channel and periodically performs heartbeat detection with the communication node;

the fault handling module specifically comprises at least one of the following modules:
least one of the following modules: 第一处理模块,用于接收所述以太网链路的一端的通信节点上报的该通信节点接收器故障信息,并且确定所述故障处理装置和所述一端的通信节点 之间的心跳检测正常,则确定所述一端的通信节点的接收器故障信息,并向用户通告所述一端的通信节点的接收器故障信息; The first processing module is used to receive the communication node receiver failure information reported by the communication node at one end of the Ethernet link, and determine that the heartbeat detection between the failure processing device and the communication node at the one end is normal, Then determine the receiver failure information of the communication node at the one end, and notify the user of the receiver failure information of the communication node at the one end; 第四处理模块,用于接收所述以太网链路的一端的通信节点上报的该一端的通信节点和另一端的通信节点之间的心跳检测故障信息,以及该一端的通信节点没有收到另一端的通信节点的通告或数据信息,并且确定所述故障处理装置和所述另一端的通信节点之间的心跳检测故障信息,则确定所述另一端的通信节点本身故障信息,并向用户通告所述另一端的通信节点本身故障信息。 The fourth processing module is used to receive the heartbeat detection failure information between the communication node at one end and the communication node at the other end reported by the communication node at one end of the Ethernet link, and the communication node at the one end does not receive the other notification or data information of the communication node at one end, and determine the heartbeat detection fault information between the fault processing device and the communication node at the other end, then determine the fault information of the communication node itself at the other end, and notify the user The failure information of the communication node at the other end itself. 5.根据权利要求4所述的故障处理装置,其特征在于,所述故障处理模块还包括下列模块中的至少一项: 5. 
The fault processing device according to claim 4, wherein the fault processing module further comprises at least one of the following modules: 第二处理模块,用于接收所述以太网链路的一端的通信节点上报的该一端的通信节点的接收链路质量故障信息,以及所述一端的通信节点和另一端的通信节点之间的心跳检测正常信息,并且确定所述故障处理装置和所述一端的通信节点之间的心跳检测正常,则确定所述一端的通信节点的接收链路质量故障信息,并向用户通告所述一端的通信节点的接收链路质量故障信息; The second processing module is configured to receive the received link quality fault information of the communication node at one end of the Ethernet link reported by the communication node at one end of the Ethernet link, and the communication node between the communication node at the one end and the communication node at the other end The heartbeat detection is normal information, and it is determined that the heartbeat detection between the fault processing device and the communication node at the one end is normal, then determine the receiving link quality failure information of the communication node at the one end, and notify the user of the fault at the one end Receive link quality failure information of the communication node; 第三处理模块,用于接收所述以太网链路的一端的通信节点上报的该一端的通信节点的接收链路中断故障信息,以及所述一端的通信节点和另一端的通信节点之间的心跳检测故障信息,并且确定所述故障处理装置和所述一端、另一端的通信节点之间的心跳检测正常,则确定所述一端的通信节点的接收链路中断故障信息,并向用户通告所述一端的通信节点的接收链路中断故障信息。  The third processing module is configured to receive the received link interruption fault information of the communication node at one end of the Ethernet link reported by the communication node at one end of the Ethernet link, and the communication node between the communication node at the one end and the communication node at the other end Heartbeat detection fault information, and determine that the heartbeat detection between the fault processing device and the communication node at the one end and the other end is normal, then determine the receiving link interruption fault information of the communication node at the one end, and notify the user of the failure information The receiving link interruption fault information of the communication node at one 
end. the
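
The decision logic recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Report`, `Diagnosis`, `Observation` and `classify` are hypothetical, the order in which the rules are tried is an assumption, and the `UNDETERMINED` fallback is added only so the function is total (the claims do not describe what happens when no rule matches).

```python
from dataclasses import dataclass
from enum import Enum, auto


class Report(Enum):
    """Fault type a communication node reports about its own receive side (hypothetical names)."""
    NONE = auto()
    RECEIVER_FAULT = auto()
    RX_LINK_QUALITY = auto()
    RX_LINK_DOWN = auto()


class Diagnosis(Enum):
    """Result the management node would announce to the user (hypothetical names)."""
    RECEIVER_FAULT = auto()
    RX_LINK_QUALITY_FAULT = auto()
    RX_LINK_INTERRUPTED = auto()
    PEER_NODE_FAULT = auto()
    UNDETERMINED = auto()  # assumption: not part of the claims


@dataclass(frozen=True)
class Observation:
    report: Report                    # what the reporting node ("one end") says
    peer_heartbeat_ok: bool           # heartbeat between the two ends, as seen by the reporter
    peer_traffic_seen: bool           # reporter received a notification or data from the other end
    mgmt_heartbeat_reporter_ok: bool  # management node <-> reporting node, dedicated channel
    mgmt_heartbeat_peer_ok: bool      # management node <-> other end, dedicated channel


def classify(obs: Observation) -> Diagnosis:
    # Claim 1, second branch: the other end is silent on the link AND
    # unreachable over the management channel, so the node itself has failed.
    if (not obs.peer_heartbeat_ok
            and not obs.peer_traffic_seen
            and not obs.mgmt_heartbeat_peer_ok):
        return Diagnosis.PEER_NODE_FAULT

    # Every remaining rule requires a healthy management heartbeat
    # with the reporting node.
    if not obs.mgmt_heartbeat_reporter_ok:
        return Diagnosis.UNDETERMINED

    # Claim 1, first branch: receiver fault reported by a reachable node.
    if obs.report is Report.RECEIVER_FAULT:
        return Diagnosis.RECEIVER_FAULT

    # Claim 2: receive-link quality fault reported by a reachable node.
    if obs.report is Report.RX_LINK_QUALITY:
        return Diagnosis.RX_LINK_QUALITY_FAULT

    # Claim 3: link interruption reported, heartbeat between the ends fails,
    # but both ends still answer the management node, so the link is at fault.
    if (obs.report is Report.RX_LINK_DOWN
            and not obs.peer_heartbeat_ok
            and obs.mgmt_heartbeat_peer_ok):
        return Diagnosis.RX_LINK_INTERRUPTED

    return Diagnosis.UNDETERMINED
```

The sketch shows why the dedicated management channel matters: when the heartbeat between the two ends fails, the result depends only on whether the management node can still reach the other end (`mgmt_heartbeat_peer_ok`). If it can, the link is blamed; if it cannot, the node is.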
CN200910088156A 2009-07-03 2009-07-03 Fault handling method and device for Ethernet Expired - Fee Related CN101938365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910088156A CN101938365B (en) 2009-07-03 2009-07-03 Fault handling method and device for Ethernet

Publications (2)

Publication Number Publication Date
CN101938365A CN101938365A (en) 2011-01-05
CN101938365B true CN101938365B (en) 2012-10-03

Family

ID=43391514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910088156A Expired - Fee Related CN101938365B (en) 2009-07-03 2009-07-03 Fault handling method and device for Ethernet

Country Status (1)

Country Link
CN (1) CN101938365B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5426782B2 (en) * 2011-02-08 2014-02-26 三菱電機株式会社 Communication system, communication line switching method, and master station apparatus
CN103684929B (en) * 2013-12-27 2017-01-25 乐视云计算有限公司 System and method for monitoring server status
WO2015100611A1 (en) 2013-12-31 2015-07-09 华为技术有限公司 Network function virtualisation nfv fault management apparatus, device, and method
CN106452957B (en) * 2016-09-30 2019-09-10 邦彦技术股份有限公司 Heartbeat detection method and node system
CN107347019A (en) * 2017-04-20 2017-11-14 武汉迈力特通信有限公司 The apparatus and method of MSTP system ethernet link failure fast transfers
CN112118145A (en) * 2019-06-19 2020-12-22 北京沃东天骏信息技术有限公司 Node state monitoring method, control device and monitoring device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1653686A1 (en) * 2004-11-01 2006-05-03 Lucent Technologies Inc. Softrouter feature server
CN101141327A (en) * 2007-10-11 2008-03-12 中兴通讯股份有限公司 Method for detecting network node abnormality
CN101252500A (en) * 2008-04-16 2008-08-27 杭州华三通信技术有限公司 Intersect looped network, node and realizing method of random topology intersect looped network

Also Published As

Publication number Publication date
CN101938365A (en) 2011-01-05

Similar Documents

Publication Publication Date Title
CN103684835B (en) Link fault reporting method and processing method, and transmission node and primary node
CN101938365B (en) Fault handling method and device for Ethernet
US9237092B2 (en) Method, apparatus, and system for updating ring network topology information
US9009523B2 (en) Method and apparatus for isolating a fault in a controller area network
US8804485B2 (en) Method and apparatus for coordinating fault recovery techniques among domains
CN101631011A (en) Hotspare method and system suitable for device for processing and forwarding IP media stream in real time
WO2017215456A1 (en) Alarming method, apparatus, network node and computer storage medium
CN101094121A (en) Method, system and device for detecting Ethernet links among not direct connected devices
CN101184013B (en) Method for preventing generation of loop, host node and system
CN108011758B (en) A remote wireless channel monitoring method for electric power remote centralized copying system
US20080002569A1 (en) Method and apparatus for identifying a fault in a communications link
CN112751720B (en) Train backbone network system, fault detection method and storage medium
CN101163059B (en) Network node detection method and apparatus
CN115865742A (en) One-way link fault detection method and system of a white box switch
CN100401826C (en) Fault Detection Method of Transmission Link
JP5722167B2 (en) Fault monitoring determination apparatus, fault monitoring determination method, and program
JP2005268889A (en) Transmission path switching system and operating method of the transmission path switching system
WO2012009914A1 (en) Protection switching method and system
JP2012222700A (en) Node device for monitoring fault, and method for restoration from fault
CN102386972B (en) Method, device and system for detecting misconnection of optical fibers
CN105426118A (en) Method for heartbeat channel backup by using serial port in double control system
JP5475706B2 (en) Monitoring device, communication device, and network monitoring method
CN103095570A (en) Method for judging failure of cross module
JP5743967B2 (en) COMMUNICATION DEVICE, COMMUNICATION METHOD, AND PROGRAM
KR20130035664A (en) Network restoration method and apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121003