CN101938365B

CN101938365B - Fault handling method and device for Ethernet

Info

Publication number: CN101938365B
Application number: CN200910088156A
Authority: CN
Inventors: 王如亮
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2009-07-03
Filing date: 2009-07-03
Publication date: 2012-10-03
Anticipated expiration: 2029-07-03
Also published as: CN101938365A

Abstract

The invention provides a fault handling method and device for Ethernet. In the method, a management node receives fault information reported by the communication nodes at the two ends of an Ethernet link. Based on that fault information and on the heartbeat detection information between the management node and those communication nodes, the management node obtains the fault information of the Ethernet link and of the communication nodes at both ends. By setting up a dedicated management node that monitors the state of the communication nodes over a dedicated management communication channel, embodiments of the invention can effectively obtain and distinguish communication link faults and communication node faults in the Ethernet, which facilitates fast fault recovery.

Description

Fault handling method and device for Ethernet

Technical field

The invention relates to the technical field of Ethernet communication, and in particular to a fault handling method and device for Ethernet.

Background

With the continuing development of Ethernet in metropolitan area networks (MANs) and wide area networks (WANs), operators are paying more and more attention to the OAM (Operations, Administration and Maintenance) of Ethernet equipment and links.

Existing OAM solutions focus on the management and maintenance of point-to-point Ethernet links and provide a complete Ethernet OAM solution. The OAM sublayer is an optional sublayer of the MAC (Media Access Control) layer. When enabled, it provides an effective mechanism for monitoring the operating state of a link, and it can be applied to any full-duplex point-to-point Ethernet link or emulated point-to-point Ethernet link.

In the existing OAM solution, link faults are indicated and located through two kinds of notification: fatal event notifications and common event notifications.

Fatal event notifications are carried by flag bits in OAM PDUs (Protocol Data Units) and mainly include:

Link Fault: indicates a fault at the Rx (receiver) of the local PHY (Physical Layer Device), that is, the link signal from the peer is lost.

Dying Gasp: an unrecoverable local error has occurred.

Critical Event: an unknown critical fault has occurred.
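As a rough illustration of how these three flag bits might be read, the sketch below decodes them from the 16-bit Flags field of an OAM PDU. The bit positions (bit 0 Link Fault, bit 1 Dying Gasp, bit 2 Critical Event) are an assumption based on IEEE 802.3 Clause 57 and are not stated in this document; the names `FatalFlag` and `decode_fatal_flags` are hypothetical.

```python
from enum import IntFlag


class FatalFlag(IntFlag):
    # Bit positions are an assumption taken from IEEE 802.3 Clause 57;
    # this document only names the three events.
    LINK_FAULT = 0x0001      # fault at the local PHY receiver (peer signal lost)
    DYING_GASP = 0x0002      # unrecoverable local error
    CRITICAL_EVENT = 0x0004  # unknown critical fault


def decode_fatal_flags(flags_field: int) -> list[str]:
    """Return the names of the fatal events set in a 16-bit Flags value."""
    return [flag.name for flag in FatalFlag if flags_field & flag]


# Hypothetical Flags value with Link Fault and Critical Event set.
print(decode_fatal_flags(0x0005))
```

Any other bits of the Flags field are simply ignored here, since the text above only discusses the three fatal events.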

Common event notifications are carried in TLV (Type-Length-Value) fields of OAM PDUs and mainly include:

Errored symbol period event: the number of errored symbols detected within a given time window exceeds a defined threshold.

Errored frame event: the number of errored frames detected within a given time window exceeds a defined threshold.

Errored frame period event: the number of errored frames detected within a given number of received frames exceeds a defined threshold.

Errored frame seconds event: the number of errored frame seconds detected within a given number of seconds exceeds a defined threshold. A one-second interval is called an errored frame second when one or more errored frames are detected among the frames received in that interval.
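The errored frame seconds event is the least obvious of the four, because it counts seconds rather than frames. The following minimal sketch shows the counting rule under the assumption that the node keeps one errored-frame counter per second; the function names, the window contents and the threshold are illustrative only.

```python
def errored_frame_seconds(errors_per_second: list[int]) -> int:
    """Count the seconds in which at least one errored frame was seen."""
    return sum(1 for errors in errors_per_second if errors > 0)


def efs_event_raised(errors_per_second: list[int], threshold: int) -> bool:
    """True when the errored-frame-second count exceeds the threshold."""
    return errored_frame_seconds(errors_per_second) > threshold


# Hypothetical 10-second window: errored frames counted in each second.
window = [0, 3, 0, 0, 1, 0, 0, 7, 0, 0]
print(errored_frame_seconds(window))          # 3 errored frame seconds
print(efs_event_raised(window, threshold=2))  # True, because 3 > 2
```

The same window contains 11 errored frames in total, so an errored frame event with a threshold of, say, 20 would not fire even though the errored frame seconds event does.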

The fatal and common event notifications above mainly monitor error events in the receive direction. In addition, if the local receiving component or device suffers an unrecoverable error, the local end makes a best-effort attempt to notify the peer.

In the course of implementing the present invention, the inventor found that the prior-art scheme of indicating and locating link faults through fatal and common event notifications has at least the following problems:

Although the scheme can detect a receiver failure on a communication node and a communication link fault, it does not specify how a detected fault is to be reported to the user. Nor can it identify a failure of the communication node itself. A failure of the communication node itself means that the core software or hardware inside the node has failed, so that the whole node cannot operate normally. By contrast, when a receiver, transmitter, receive link or transmit link of a communication node fails, the other, unaffected parts of the node can still work normally.

Summary of the invention

Embodiments of the present invention provide a fault handling method and device for Ethernet.

A fault handling method for Ethernet provided by an embodiment of the present invention comprises:

a management node receives fault information reported by a communication node at an endpoint of an Ethernet link;

the management node obtains the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node.

A fault handling device for Ethernet provided by an embodiment of the present invention comprises:

a fault information receiving module, configured to receive fault information reported by a communication node at an endpoint of an Ethernet link;

a fault processing module, configured to obtain the fault information of the communication node according to the fault information received by the fault information receiving module and the heartbeat detection information between the fault handling device and the communication node.

As the technical solutions above show, by setting up a dedicated management node that monitors the state of the communication nodes over a dedicated management communication channel, embodiments of the present invention can effectively obtain and distinguish communication link faults and communication node faults in the Ethernet, which facilitates fast fault recovery.

Description of drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the principle of setting up a management node for an Ethernet full-duplex link, provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram, provided by one embodiment of the present invention, of a communication node reporting to the management node that its receiver has failed;

Fig. 3 is a schematic diagram, provided by one embodiment of the present invention, of a communication node reporting a receive link quality fault to the management node;

Fig. 4 is a schematic diagram, provided by another embodiment of the present invention, of a communication node reporting a receive link interruption fault to the management node;

Fig. 5 is a schematic diagram, provided by an embodiment of the present invention, of the management node detecting that a communication node itself has failed;

Fig. 6 is a structural diagram of a specific implementation of a fault handling device for Ethernet provided by an embodiment of the present invention.

Detailed description

In the embodiments of the present invention, a management node is configured for an Ethernet link and receives fault information reported by the communication nodes at the two ends of the link. Based on that fault information and on the heartbeat detection information between the management node and the communication nodes at the two ends of the link, the management node obtains the fault information of the Ethernet link and of the communication nodes at both ends.

To aid understanding of the embodiments of the present invention, several specific embodiments are explained below with reference to the drawings. None of these embodiments limits the embodiments of the present invention.

Embodiment one

Fig. 1 is a schematic diagram of the principle of setting up a management node for an Ethernet communication system in this embodiment. In Fig. 1, the first communication node and the second communication node are two communication nodes that have established a service communication relationship. The Ethernet full-duplex link between them can be regarded as two simplex channels, one in each direction. Each channel consists of an RX (receiver), a TX (transmitter) and a unidirectional link.

A management node is configured for the first and second communication nodes. The management node communicates with the first communication node and the second communication node over separate dedicated management communication channels. These channels are not affected by RX/TX faults on the first or second communication node: even when a TX or RX on either node has failed, the management node can still communicate with both nodes over the dedicated management communication channels, and this communication includes heartbeat detection. The management node periodically performs heartbeat detection with each of the two communication nodes. Based on the information that the nodes report over the dedicated management communication channels and on the results of its heartbeat detection with each node, the management node monitors the operating state of the two nodes and the state of the communication link between them and, once a fault is detected, determines its specific cause. In addition, the first and second communication nodes may also periodically perform heartbeat detection with each other, so as to detect whether the link between them has failed.
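The periodic heartbeat detection over the dedicated management channels can be pictured as a simple liveness tracker. The sketch below is only one possible reading: the heartbeat period, the number of missed beats tolerated, the node names and the class `HeartbeatMonitor` are all assumptions, since the source does not specify how "normal" is decided.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat replies received over dedicated management channels."""
    interval_s: float = 1.0   # assumed heartbeat period, not given in the source
    max_missed: int = 3       # assumed number of missed beats tolerated
    last_seen: dict[str, float] = field(default_factory=dict)

    def record_reply(self, node: str, now: float) -> None:
        """Note that `node` answered a heartbeat at time `now` (seconds)."""
        self.last_seen[node] = now

    def is_normal(self, node: str, now: float) -> bool:
        """Heartbeat is normal if a reply arrived within the allowed gap."""
        seen = self.last_seen.get(node)
        if seen is None:
            return False
        return now - seen <= self.interval_s * self.max_missed


monitor = HeartbeatMonitor()
monitor.record_reply("node1", now=10.0)
monitor.record_reply("node2", now=10.0)
monitor.record_reply("node2", now=14.0)      # node1 has gone silent
print(monitor.is_normal("node1", now=14.5))  # False: 4.5 s since last reply
print(monitor.is_normal("node2", now=14.5))  # True: 0.5 s since last reply
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic; a real management node would presumably feed it from its own timer.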

In this embodiment, the first communication node detects that its own RX has failed; Fig. 2 shows the node reporting this to the management node. In this case, the first communication node cannot receive any data sent by the second communication node, and the heartbeat detection between the first and second communication nodes fails. The first communication node therefore sends a fatal event notification, for example a Critical Event OAM PDU, to the second communication node to inform it of the RX failure.

The first communication node reports its RX failure to the management node over the dedicated management communication channel. On receiving this report, the management node checks over the dedicated management communication channel whether its heartbeat detection with the first communication node is normal. If it is, the management node confirms the RX failure of the first communication node, concludes that the node itself has not failed, and notifies the user of the RX failure of the first communication node.

Embodiment two

In this embodiment, the first communication node detects errored frames and errored symbols in its receive direction, while its heartbeat detection with the second communication node is normal; Fig. 3 shows the node reporting a receive link quality fault to the management node. In this case, the first communication node can conclude that neither its own RX nor the TX of the second communication node has failed, and that the problem lies in the quality of the link in its receive direction. The first communication node therefore sends a common event notification to the second communication node to inform it of the receive link quality fault.

The first communication node reports its receive link quality fault to the management node. On receiving this report, the management node checks over the dedicated management communication channel whether its heartbeat detection with the first communication node is normal. If it is, the management node confirms the receive link quality fault of the first communication node, concludes that the node itself has not failed, and notifies the user of the receive link quality fault of the first communication node.

Embodiment three

In this embodiment, the first communication node has not detected an RX failure of its own but detects that its heartbeat detection with the second communication node has failed; Fig. 4 shows the node reporting a receive link interruption fault to the management node. In this case, the first communication node can infer that the link in its receive direction is interrupted. Once its receive link is interrupted, the first communication node receives no notifications or data from the second communication node at all, whereas when a receive link quality fault occurs, the first communication node detects errored frames and errored symbols in its receive direction. The first communication node therefore sends a fatal event notification to the second communication node to inform it of the receive link interruption fault.

The first communication node reports its receive link interruption fault to the management node. On receiving this report, the management node checks over the dedicated management communication channel whether its heartbeat detection with the first communication node is normal. After finding that its heartbeat detection with both the first and the second communication node is normal, the management node confirms the receive link interruption fault of the first communication node, concludes that the first communication node itself has not failed, and notifies the user of the receive link interruption fault of the first communication node.
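The node-side reasoning in Embodiments one to three can be sketched as a small decision function that turns a node's local observations into the fault it reports and the notification it sends to its peer. The order of the checks, the enum `LocalFault` and both function names are assumptions made for illustration; the source describes the three cases separately and does not prescribe an algorithm.

```python
from enum import Enum
from typing import Optional


class LocalFault(Enum):
    NONE = "no fault"
    RX_FAILURE = "receiver failure"
    LINK_QUALITY = "receive link quality fault"
    LINK_INTERRUPTION = "receive link interruption fault"


def classify_local_fault(rx_failed: bool, rx_errors_over_threshold: bool,
                         peer_heartbeat_ok: bool) -> LocalFault:
    """Classify what a node observes in its own receive direction."""
    if rx_failed:
        return LocalFault.RX_FAILURE         # Embodiment one
    if not peer_heartbeat_ok:
        return LocalFault.LINK_INTERRUPTION  # Embodiment three
    if rx_errors_over_threshold:
        return LocalFault.LINK_QUALITY       # Embodiment two
    return LocalFault.NONE


def peer_notification(fault: LocalFault) -> Optional[str]:
    """Which OAM notification the node sends to its peer, if any."""
    if fault in (LocalFault.RX_FAILURE, LocalFault.LINK_INTERRUPTION):
        return "fatal event notification"
    if fault is LocalFault.LINK_QUALITY:
        return "common event notification"
    return None


fault = classify_local_fault(rx_failed=False,
                             rx_errors_over_threshold=True,
                             peer_heartbeat_ok=True)
print(fault.value, "->", peer_notification(fault))
```

In every case the node also reports the classified fault to the management node over the dedicated management channel, which then cross-checks it against its own heartbeat detection.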

Embodiment four

In this embodiment, the first communication node itself fails and can no longer perform any normal processing; Fig. 5 shows the management node detecting this failure. In this case, the second communication node detects that its heartbeat detection with the first communication node has failed (heartbeats are lost), receives no notifications or data from the first communication node, and has not detected any failure of its own RX. The second communication node therefore sends a fatal event notification to the first communication node, but the first communication node cannot process it.

The second communication node reports to the management node that its heartbeat detection with the first communication node has failed and that it receives no notifications or data from the first communication node. On receiving this report, the management node checks over the dedicated management communication channel whether its own heartbeat detection with the first communication node is normal. Because the first communication node cannot perform any normal processing at this point, the management node finds that this heartbeat detection is abnormal. The management node therefore determines that the first communication node itself has failed and notifies the user of this failure.

An embodiment of the present invention also provides a fault handling device for Ethernet. The device may be the management node in an Ethernet communication system, and it communicates with the different communication nodes in the Ethernet over separate dedicated management communication channels. These channels are not affected by RX/TX faults on the communication nodes: even when a TX or RX on a communication node has failed, the device can still communicate with that node over the dedicated management communication channel, and this communication includes heartbeat detection.

Fig. 6 shows a specific implementation structure of the device, which may include:

a fault information receiving module 61, configured to receive fault information reported by a communication node at an endpoint of an Ethernet link;

a fault processing module 62, configured to obtain the fault information of the communication node according to the fault information received by the fault information receiving module and the heartbeat detection information between the fault handling device and the communication node at the endpoint of the Ethernet link.

The fault processing module 62 includes at least one of a first processing module, a second processing module, a third processing module and a fourth processing module, where:

the first processing module 621 is configured to receive, from the communication node at one end of the Ethernet link, receiver failure information for that node and, if it determines over the dedicated management communication channel that the heartbeat detection between the device and that node is normal, to confirm the receiver failure of that node and notify the user of it;

the second processing module 622 is configured to receive, from the communication node at one end of the Ethernet link, receive link quality fault information for that node and, if it determines over the dedicated management communication channel that the heartbeat detection between the device and that node is normal, to confirm the receive link quality fault of that node and notify the user of it;

the third processing module 623 is configured to receive, from the communication node at one end of the Ethernet link, receive link interruption fault information for that node together with information that the heartbeat detection between that node and the communication node at the other end has failed and, if it determines over the dedicated management communication channels that the heartbeat detection between the device and the nodes at both ends is normal, to confirm the receive link interruption fault of the node at the one end and notify the user of it;

the fourth processing module 624 is configured to receive, from the communication node at one end of the Ethernet link, information that the heartbeat detection between that node and the communication node at the other end has failed and that the node has received no notifications or data from the node at the other end and, if it determines over the dedicated management communication channel that the heartbeat detection between the device and the node at the other end has failed, to determine that the node at the other end itself has failed and notify the user of that failure.
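The four processing modules amount to a lookup from "what was reported" plus "what the device's own heartbeat detection shows" to a conclusion for the user. The sketch below models that mapping under stated assumptions: the `Report` type, the string codes for report kinds, the class `FaultProcessor` and the injected `heartbeat_ok` callback are all hypothetical, and combinations that the source does not describe are deliberately left undecided.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Report:
    reporter: str  # node that sent the report over the management channel
    peer: str      # node at the other end of the Ethernet link
    kind: str      # "rx_failure", "link_quality", "link_interruption", "peer_silent"


class FaultProcessor:
    """Sketch of fault processing module 62 with sub-modules 621 to 624."""

    def __init__(self, heartbeat_ok: Callable[[str], bool]):
        # heartbeat_ok(node) stands for the device's own heartbeat detection
        # with `node` over the dedicated management channel.
        self.heartbeat_ok = heartbeat_ok

    def process(self, report: Report) -> Optional[str]:
        """Return the conclusion to announce to the user, or None."""
        a, b = report.reporter, report.peer
        if report.kind == "rx_failure" and self.heartbeat_ok(a):       # 621
            return f"{a}: receiver failure"
        if report.kind == "link_quality" and self.heartbeat_ok(a):     # 622
            return f"{a}: receive link quality fault"
        if (report.kind == "link_interruption"
                and self.heartbeat_ok(a) and self.heartbeat_ok(b)):    # 623
            return f"{a}: receive link interruption fault"
        if report.kind == "peer_silent" and not self.heartbeat_ok(b):  # 624
            return f"{b}: node failure"
        return None  # combinations the source does not describe


# Hypothetical state: the device still hears node2 but not node1.
alive = {"node1": False, "node2": True}
processor = FaultProcessor(heartbeat_ok=lambda node: alive[node])
print(processor.process(Report("node2", "node1", "peer_silent")))
```

Passing the heartbeat check in as a callback keeps the decision logic separate from however the device actually tracks heartbeats on its management channels.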

A person of ordinary skill in the art will understand that all or part of the processes of the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

In summary, by setting up a dedicated management node that monitors the state of the communication nodes over a dedicated management communication channel, embodiments of the present invention can effectively obtain and distinguish communication link faults and communication node faults in the Ethernet, which facilitates fast fault recovery.

Embodiments of the present invention can also notify the user, through the management node, of communication link faults and communication node faults in the Ethernet.

The above is only a preferred specific implementation of the present invention, but the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.

Claims

1. A fault handling method for Ethernet, characterized in that the method comprises:

a management node receives fault information reported by a communication node at an endpoint of an Ethernet link, wherein the management node communicates with the communication node over a dedicated management channel and periodically performs heartbeat detection with the communication node;

the management node obtains the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node at the endpoint of the Ethernet link, which specifically comprises:

the management node receives, from the communication node at one end of the Ethernet link, receiver failure information for the communication node at the one end, and determines that the heartbeat detection between the management node and the communication node at the one end is normal;

the management node confirms the receiver failure information of the communication node at the one end and notifies the user of the receiver failure information of the communication node at the one end;

or,

the management node receives, from the communication node at one end of the Ethernet link, information that the heartbeat detection between the communication node at the one end and the communication node at the other end has failed and that the communication node at the one end has received no notification or data from the communication node at the other end, and determines that the heartbeat detection between the management node and the communication node at the other end has failed;

the management node determines that the communication node at the other end itself has failed and notifies the user of the failure of the communication node at the other end itself.

2. The method according to claim 1, wherein the management node obtaining the fault information of the communication node according to the reported fault information and the heartbeat detection information between the management node and the communication node comprises:

the management node receives, from the communication node at one end of the Ethernet link, receive link quality fault information for the communication node at the one end, and determines that the heartbeat detection between the management node and the communication node at the one end is normal;

the management node confirms the receive link quality fault information of the communication node at the one end and notifies the user of the receive link quality fault information of the communication node at the one end.

3. The method according to claim 1, wherein the management node obtains the communication node information according to the failure information and the heartbeat detection information between the management node and the communication node. fault information, including:

The management node receives the receiving link interruption fault information of the communication node at one end of the Ethernet link reported by the communication node at one end of the Ethernet link, and the heartbeat detection failure between the communication node at the one end and the communication node at the other end information, and determine that the heartbeat detection between the management node and the communication nodes at the one end and the other end is normal;

The management node determines the receive link interruption fault information of the communication node at the one end, and notifies the user of the receive link interruption fault information of the communication node at the one end.
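
As an informal illustration that forms no part of the claims, the decision logic recited in claims 1 to 3 above might be sketched in Python as follows. The names `Fault`, `FaultReport` and `classify` are hypothetical, as is the representation of the management node's heartbeat state as a dictionary; only the three cases recited above are covered.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Mapping, Optional, Tuple


class Fault(Enum):
    RX_LINK_QUALITY = auto()      # receive link quality fault of the reporting node
    RX_LINK_INTERRUPTED = auto()  # receive link interruption fault of the reporting node
    NODE_FAILURE = auto()         # the peer node itself has failed


@dataclass(frozen=True)
class FaultReport:
    """What the node at one end reports to the management node (assumed layout)."""
    reporter: str                 # node at "the one end"
    peer: str                     # node at "the other end"
    fault: Optional[Fault]        # link fault the reporter detected, if any
    peer_heartbeat_ok: bool       # heartbeat between reporter and peer
    peer_traffic_seen: bool       # any notification or data received from the peer


def classify(report: FaultReport,
             mgmt_heartbeat_ok: Mapping[str, bool]) -> Optional[Tuple[Fault, str]]:
    """Return (fault, faulty node), or None when no recited case applies."""
    reporter_ok = mgmt_heartbeat_ok.get(report.reporter, False)
    peer_ok = mgmt_heartbeat_ok.get(report.peer, False)

    # Claim 2: link quality fault reported, management heartbeat to reporter normal.
    if report.fault is Fault.RX_LINK_QUALITY and reporter_ok:
        return Fault.RX_LINK_QUALITY, report.reporter

    # Claim 3: link interruption reported, node-to-node heartbeat failed,
    # but the management node still reaches both ends.
    if (report.fault is Fault.RX_LINK_INTERRUPTED
            and not report.peer_heartbeat_ok and reporter_ok and peer_ok):
        return Fault.RX_LINK_INTERRUPTED, report.reporter

    # Claim 1: node-to-node heartbeat failed, nothing received from the peer,
    # and the management node cannot reach the peer either.
    if (not report.peer_heartbeat_ok and not report.peer_traffic_seen
            and not peer_ok):
        return Fault.NODE_FAILURE, report.peer

    return None
```

The point of the sketch is that the management node's own heartbeat result is the deciding input: the same report of a failed node-to-node heartbeat is attributed to the link when the management node still reaches both ends, and to the peer node when it does not.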

4. A fault processing device in an Ethernet, characterized in that it specifically comprises:

A fault information receiving module, configured to receive the fault information reported by a communication node at an endpoint of an Ethernet link;

A fault processing module, configured to acquire fault information of the communication node according to the fault information received by the fault information receiving module and heartbeat detection information between the fault processing device and the communication node;

The fault processing device includes a management node in the Ethernet communication system, the management node communicates with the communication node through a dedicated management channel, and regularly performs heartbeat detection with the communication node;

The fault processing module specifically includes at least one of the following modules:

A first processing module, configured to receive receiver fault information of the communication node at one end of the Ethernet link, reported by the communication node at the one end, determine that heartbeat detection between the fault processing device and the communication node at the one end is normal, then determine the receiver fault information of the communication node at the one end, and notify the user of the receiver fault information of the communication node at the one end;

A fourth processing module, configured to receive, from the communication node at one end of the Ethernet link, information that heartbeat detection between the communication node at the one end and the communication node at the other end has failed and that the communication node at the one end has received no notification or data information from the communication node at the other end, determine that heartbeat detection between the fault processing device and the communication node at the other end has failed, then determine the fault information of the communication node at the other end itself, and notify the user of the fault information of the communication node at the other end itself.

5. The fault processing device according to claim 4, wherein the fault processing module further comprises at least one of the following modules:

A second processing module, configured to receive, from the communication node at one end of the Ethernet link, receive link quality fault information of the communication node at the one end together with information that heartbeat detection between the communication node at the one end and the communication node at the other end is normal, determine that heartbeat detection between the fault processing device and the communication node at the one end is normal, then determine the receive link quality fault information of the communication node at the one end, and notify the user of the receive link quality fault information of the communication node at the one end;

A third processing module, configured to receive, from the communication node at one end of the Ethernet link, receive link interruption fault information of the communication node at the one end together with information that heartbeat detection between the communication node at the one end and the communication node at the other end has failed, determine that heartbeat detection between the fault processing device and the communication nodes at both the one end and the other end is normal, then determine the receive link interruption fault information of the communication node at the one end, and notify the user of the receive link interruption fault information of the communication node at the one end.
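
As an informal illustration that forms no part of the claims, the periodic heartbeat detection that the fault processing device performs with each communication node over the dedicated management channel might be tracked as sketched below. The class name `HeartbeatMonitor`, its methods, and the 3-second timeout are assumptions; the claims do not specify a heartbeat interval or a timeout.

```python
from typing import Dict


class HeartbeatMonitor:
    """Tracks the last heartbeat reply seen from each communication node."""

    def __init__(self, timeout_s: float = 3.0) -> None:
        # Assumed value: a node is considered unreachable after this long
        # without a heartbeat reply on the management channel.
        self.timeout_s = timeout_s
        self._last_reply: Dict[str, float] = {}

    def record_reply(self, node: str, now: float) -> None:
        """Note that `node` answered a heartbeat at time `now` (seconds)."""
        self._last_reply[node] = now

    def is_normal(self, node: str, now: float) -> bool:
        """Heartbeat is normal if a reply arrived within the timeout window."""
        last = self._last_reply.get(node)
        return last is not None and (now - last) <= self.timeout_s


# Usage with explicit timestamps so the behaviour is deterministic.
monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record_reply("node-A", now=10.0)
monitor.record_reply("node-B", now=10.0)
monitor.record_reply("node-A", now=12.0)   # node-B stops answering

a_ok = monitor.is_normal("node-A", now=14.0)
b_ok = monitor.is_normal("node-B", now=14.0)
```

Passing the time in explicitly, rather than reading a clock inside the class, keeps the sketch testable; a real device would presumably drive `record_reply` from the management channel's receive path.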