CN112118129A - Fault positioning method and device based on service flow - Google Patents

Info

Publication number
CN112118129A
Authority
CN
China
Prior art keywords
fault information
message
fault
control plane
forwarding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010857668.3A
Other languages
Chinese (zh)
Other versions
CN112118129B (en
Inventor
苏平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fiberhome Telecommunication Technologies Co Ltd
Original Assignee
Fiberhome Telecommunication Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fiberhome Telecommunication Technologies Co Ltd filed Critical Fiberhome Telecommunication Technologies Co Ltd
Priority to CN202010857668.3A priority Critical patent/CN112118129B/en
Publication of CN112118129A publication Critical patent/CN112118129A/en
Application granted granted Critical
Publication of CN112118129B publication Critical patent/CN112118129B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/215 Flow control; Congestion control using token-bucket
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/22 Traffic shaping
    • H04L47/225 Determination of shaping rate, e.g. using a moving window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/32 Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a fault location method and device based on service flow. The method comprises the following steps: the forwarding plane obtains fault information by parsing discarded messages, and encapsulates the fault information in a preset format to form a fault information message; the forwarding plane filters and speed-limits the fault information message and then sends it to the control plane; and the control plane obtains the fault information by parsing the fault information message, and then performs fault location and processing according to the fault information. Because the method parses the discarded service messages directly as its source, the control plane can obtain the fault information without additional equipment or overhead. Because the forwarding plane filters and speed-limits the messages before sending them up, the rate mismatch between the two planes is resolved, so that the control plane can locate faults and process fault information quickly and effectively.

Description

Fault positioning method and device based on service flow
Technical Field
The present invention belongs to the technical field of data communication and, more particularly, relates to a method and an apparatus for fault location based on service flow.
Background
In a network of communication devices, communication failures occur from time to time and must be located and handled quickly, so that a prolonged failure does not spread into a wider outage. Locating a network fault in a communication device generally involves three steps: step one, the forwarding plane processes the information of the failed service; step two, the forwarding plane sends the fault information to the control plane; and step three, the control plane processes the fault information. There are two main problems in this process:
In step one, the forwarding plane processes the failed service information by parsing messages. How the messages to be parsed are generated, where they come from, and how to parse them without the overhead of additional equipment together form the first technical problem to be solved. Conventional schemes usually locate faults with auxiliary equipment: a dedicated troubleshooting device is connected to the network topology to simulate user messages, and if a service fails in the topology, the device site information, the failed service information and the like are carried back to the dedicated device through a private protocol for analysis and processing. The additional equipment and overhead are therefore considerable and costly.
In step two, the forwarding plane sends the fault information to the control plane for subsequent processing. In a communication device, however, the forwarding plane is a dedicated processing chip whose message processing speed can reach hundreds of Gbps or more, whereas the control plane is a general-purpose processing chip whose message processing speed is within a few Gbps. The large speed difference between the two means that sending messages up is in effect a flow from a large pipe into a small pipe, as shown in fig. 1. Exchanging fault information messages between the high-rate forwarding plane and the low-rate control plane therefore requires adapting the two rates, which is the second problem. If the message processing performance of the forwarding plane and that of the control plane cannot be adapted, the control plane comes under heavy packet-receiving and processing pressure and may be paralyzed by the large volume of messages sent up by the forwarding plane, which directly degrades the overall fault location and processing performance.
Disclosure of Invention
The invention provides a method and a device for fault location based on service flow. The aim is to parse fault information from packets lost in the service flow, quickly locate the fault point in a communication equipment network and give specific fault information, thereby solving the problems of high extra equipment overhead and mismatched message processing performance between the forwarding plane and the control plane in conventional fault location.
To achieve the above object, according to an aspect of the present invention, there is provided a method for locating a fault based on a service flow, including:
the forwarding plane obtains fault information by analyzing the discarded messages, and encapsulates the fault information according to a preset format to form a fault information message;
after filtering and speed limiting processing are carried out on the fault information message by the forwarding plane, the fault information message is sent to the control plane;
and the control plane obtains the fault information by analyzing the fault information message, and then carries out fault positioning and processing according to the fault information.
Preferably, the communication device initializes a filter table after being powered on, the filter table being used to record the fault information messages that have been sent to the control plane; the filtering of the fault information message by the forwarding plane specifically includes:
searching the filter table with the fault information as the key, and determining whether a corresponding fault information message exists in the filter table;
if no corresponding fault information message exists in the filter table, continuing with the speed limit processing of the fault information message;
and if a corresponding fault information message exists in the filter table, discarding the fault information message.
Preferably, the speed limit processing of the fault information message by the forwarding plane specifically includes:
configuring a METER speed limit in the forwarding plane according to the message processing speed of the control plane;
if the fault information message acquires a METER speed-limiting token, sending the fault information message to the control plane and writing it into the filter table;
and if the fault information message does not acquire a METER speed-limiting token, discarding the fault information message.
Preferably, configuring the METER speed limit in the forwarding plane according to the message processing speed of the control plane specifically includes:
setting the fault information message to a fixed length L for sending up; when the message is shorter than L, it is padded to L, and when it is longer than L, it is truncated, counting from the start of the message, to length L;
configuring the METER speed limit value in the forwarding plane to be C × L × 8, in bps, where C is the number of messages the control plane processes per second.
Preferably, configuring the METER speed limit in the forwarding plane according to the message processing speed of the control plane specifically includes:
sending the fault information message up with the original message information and the encapsulation information retained, the forwarding plane uniformly using a fixed length L as the input to the METER speed limit;
configuring the METER speed limit value in the forwarding plane to be C × L × 8, in bps, where C is the number of messages the control plane processes per second.
Preferably, after the forwarding plane sends the failure information message to the control plane, the method further includes:
and when the preset period T is reached, the forwarding plane clears the fault information messages in the filter table one by one and obtains newly discarded messages for parsing, thereby starting the fault location and processing of the next period.
Preferably, the filter table has an entry specification of N and is stored in the filtering module; the clearing of the fault information messages in the filter table is implemented by a hardware packet sending module and the filtering module, and specifically includes:
the hardware packet sending module sends N reference messages to the filtering module in each period T, each reference message carrying an index value, the index values being 1, 2, ..., N respectively;
and after receiving a reference message, the filtering module reads the index value M of the reference message and clears the fault information message in the entry at index M of the filter table.
Preferably, the control plane obtaining the fault information by parsing the fault information message and then performing fault location and processing according to the fault information specifically includes:
the control plane obtains the fault information by parsing the fault information message, and queries, according to the fault information, whether the corresponding configuration exists on the control plane and on the forwarding plane;
if the configuration exists on the control plane but not on the forwarding plane, the fault is judged to be that the service configuration delivery did not take effect, and the service configuration is delivered again;
if the configuration exists on the forwarding plane but not on the control plane, the fault is judged to be a service configuration residue, and the residue is cleaned up;
if the configuration exists on neither the control plane nor the forwarding plane, the fault is judged to be that the service has not been configured, and the service configuration is performed again.
Preferably, the failure information includes one or more of a device port corresponding to the failure service, a source-destination MAC address, a source-destination IP address + protocol number, and a source-destination port number.
According to another aspect of the present invention, there is provided a fault location apparatus based on service flow, including at least one processor and a memory connected through a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the processor, performing the fault location method based on service flow according to the first aspect.
In general, compared with the prior art, the technical scheme of the invention has the following beneficial effects. On the one hand, the forwarding plane parses the discarded service messages directly as its source, and then encapsulates the fault information and sends it up to the control plane, so that the control plane can obtain the fault information without additional equipment or overhead. On the other hand, the forwarding plane filters and speed-limits the messages before sending them up, adapting high-speed forwarding plane message processing to low-speed control plane processing; the rate mismatch is thereby resolved, and the control plane can locate faults and process fault information quickly and effectively.
Drawings
Fig. 1 is a schematic diagram of the forwarding plane sending messages up to the control plane in a communication device;
Fig. 2 is a flowchart of a method for locating a fault based on a service flow according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the service flow when a fault is located based on a service flow according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the encapsulation format of a fault information message according to an embodiment of the present invention;
Fig. 5 is a detailed flowchart of fault location based on a service flow according to an embodiment of the present invention;
Fig. 6 is another detailed flowchart of fault location based on a service flow according to an embodiment of the present invention;
Fig. 7 is an architecture diagram of a fault location device based on service flow according to an embodiment of the present invention;
Fig. 8 is an architecture diagram of another fault location device based on service flow according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
In order to solve the technical problems of high additional equipment overhead and mismatched message processing performance between the forwarding plane and the control plane in conventional fault location, the embodiment of the invention provides a fault location method based on service flow, which can quickly locate a fault point in a communication equipment network and give specific fault information.
With reference to fig. 2 and fig. 3, the fault location method provided by this embodiment mainly includes the following three steps:
Step 10: the forwarding plane obtains fault information by parsing the discarded messages, and encapsulates the fault information in a preset format to form a fault information message.
The forwarding plane of a communication device works as follows: it looks up the forwarding configuration using key information in the service message, finds the egress of the service message, and forwards the service. When the communication device fails, a user's service message cannot be forwarded because the configuration is absent or incorrect, and packet loss results; that is, no forwarding configuration can be found for the information in the service message, and the service message is discarded. The invention takes the discarded service message (hereinafter, the discarded message) as its source and parses it to obtain the key fault information, namely the core information of the discarded message, which must reflect the main characteristics of the message. The fault information from the discarded message is then encapsulated and sent to the control plane, so that the control plane obtains the fault information without additional equipment or overhead.
The fault information includes one or more of four layers of information corresponding to the failed service: L1 information (device link port), L2 information (source and destination MAC addresses), L3 information (source and destination IP addresses + protocol number), and L4 information (source and destination port numbers, i.e., layer-4 port numbers). The forwarding plane of a communication device forwards mainly according to the L1, L2, L3 and L4 information, and these fields are all present in a user message entering the device. Specifically, the L1 information indicates the physical port through which a message enters the communication device, the L2 information is the basis for layer-2 switch forwarding, the L3 information is the basis for layer-3 router forwarding, and the L4 information is used to distinguish transmission messages from protocol messages. Taking fault information of L1+L2+L3+L4 as an example, the parsing and encapsulation of the fault information proceeds as follows:
(1) the forwarding configuration is looked up according to the key information (L1+L2+L3+L4) in the service message; when the configuration is faulty and no corresponding forwarding configuration can be found, the service message is discarded, that is, service flow packet loss occurs;
(2) the service flow packet loss triggers the forwarding chip to parse the discarded message and thereby obtain the fault information L1+L2+L3+L4;
(3) the forwarding chip encapsulates the parsed L1+L2+L3+L4 information in a preset format, for example an Ethernet format, to form a fault information message, as shown in fig. 4.
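As an illustration of steps (1) to (3) above, the following Python sketch extracts the L1+L2+L3+L4 fields from a discarded frame and packs them into a payload. It assumes an untagged Ethernet II frame carrying IPv4; the names FaultInfo, parse_dropped_frame and encapsulate, and the 27-byte payload layout, are hypothetical and do not reproduce the format of fig. 4. In an actual device this work is done by the forwarding chip, not by software.

```python
import struct
from typing import NamedTuple, Optional


class FaultInfo(NamedTuple):
    """L1+L2+L3+L4 key of a discarded message (field names are illustrative)."""
    in_port: int      # L1: ingress port of the device
    src_mac: bytes    # L2: source MAC address
    dst_mac: bytes    # L2: destination MAC address
    src_ip: bytes     # L3: source IPv4 address
    dst_ip: bytes     # L3: destination IPv4 address
    protocol: int     # L3: IP protocol number
    src_port: int     # L4: source port (0 if not TCP/UDP)
    dst_port: int     # L4: destination port (0 if not TCP/UDP)


def parse_dropped_frame(in_port: int, frame: bytes) -> Optional[FaultInfo]:
    """Parse an untagged Ethernet II / IPv4 frame; return None if unsupported."""
    if len(frame) < 14 + 20:
        return None
    dst_mac, src_mac = frame[0:6], frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:              # only IPv4 is handled in this sketch
        return None
    ihl = (frame[14] & 0x0F) * 4         # IPv4 header length in bytes
    protocol = frame[14 + 9]
    src_ip, dst_ip = frame[26:30], frame[30:34]
    src_port = dst_port = 0
    l4 = 14 + ihl                        # offset of the layer-4 header
    if protocol in (6, 17) and len(frame) >= l4 + 4:   # TCP or UDP
        src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
    return FaultInfo(in_port, src_mac, dst_mac, src_ip, dst_ip,
                     protocol, src_port, dst_port)


def encapsulate(info: FaultInfo) -> bytes:
    """Pack the fault information into a fixed payload (assumed layout)."""
    return struct.pack("!H6s6s4s4sBHH", *info)
```

The tuple returned by parse_dropped_frame can serve directly as the lookup key for the filtering stage, since it carries exactly the L1+L2+L3+L4 fields.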
Step 20: the forwarding plane filters and speed-limits the fault information message, and then sends it to the control plane.
As noted above, if the forwarding plane sent messages directly to the control plane, the message processing performance of the two would not match. The embodiment of the invention addresses this with two measures: first, filtering, which removes messages that the forwarding plane would otherwise send up repeatedly; second, speed limiting, which matches the rate at which the forwarding plane sends messages up to what the control plane can handle. Both are described in detail below.
The purpose of filtering is to reduce the duplicate messages sent up to the control plane and so remove unnecessary processing pressure from it. To this end, after the communication device is powered on, a filter table, denoted filter_table, is initialized and stored in the forwarding chip. It records the fault information messages that the forwarding plane has already sent to the control plane, with the fault information of the discarded message as the lookup key. Its entry specification is N; that is, at most N filter entries can be recorded. Referring to fig. 5, the filtering process is as follows:
(1) before the forwarding plane encapsulates the fault information and sends it to the control plane, it searches the filter table filter_table with the fault information L1+L2+L3+L4 as the key, and determines whether a corresponding fault information message entry exists;
(2) if no corresponding entry exists in filter_table, no message with the same fault information has yet been sent to the control plane, and the subsequent speed limit processing is performed on the fault information message;
(3) if a corresponding entry exists in filter_table, a message with the same fault information has already been sent to the control plane for processing; to avoid sending it again, the fault information message is discarded.
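The lookup in steps (1) to (3) can be sketched in Python as follows. This is a minimal software model, not the chip implementation: the class FilterTable, its methods and the function should_continue are hypothetical names, and the linear scan stands in for whatever lookup mechanism the forwarding chip provides.

```python
from typing import Hashable, List, Optional


class FilterTable:
    """Fixed-size table of fault keys already sent to the control plane."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity                        # entry specification N
        self.entries: List[Optional[Hashable]] = [None] * capacity

    def contains(self, key: Hashable) -> bool:
        """Return True if a message with this fault key was already sent up."""
        return any(entry == key for entry in self.entries if entry is not None)

    def record(self, key: Hashable) -> bool:
        """Store the key in the first free entry; return False if the table is full."""
        for i, entry in enumerate(self.entries):
            if entry is None:
                self.entries[i] = key
                return True
        return False


def should_continue(table: FilterTable, fault_key: Hashable) -> bool:
    """Filtering decision: True means go on to speed limiting, False means discard."""
    return not table.contains(fault_key)
```

Note that the lookup itself does not write to the table; in the flow described here an entry is written only after the message has passed speed limiting and been sent up.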
The purpose of speed limiting is to adapt the forwarding plane's message processing rate to the control plane's, reducing the control plane's packet-receiving pressure and preventing it from being paralyzed by a flood of messages sent up. Speed limiting also smooths the flow, so that the control plane does not receive a burst of messages that it cannot process and must drop. The forwarding plane introduces a METER (using algorithms such as those of RFC 2697 and RFC 2698) to perform the speed limiting, ensuring that messages are sent up evenly and without bursts. Referring to fig. 5, the speed limit processing is as follows:
(1) a METER speed limit is configured in the forwarding plane according to the message processing speed of the control plane; the fault information message is then passed through the METER token bucket to determine whether it can acquire a METER speed-limiting token;
(2) if the fault information message acquires a METER speed-limiting token, the rate at which the forwarding plane is sending up can currently be matched by the control plane, and the fault information message is sent up to the control plane and written into the filter table;
(3) if the fault information message does not acquire a METER speed-limiting token, the rate at which the forwarding plane is sending up cannot currently be matched by the control plane, and the fault information message is discarded; processing resumes when a similar discarded message is received later.
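The token decision in steps (1) to (3) can be illustrated with a plain single-rate token bucket. This sketch is deliberately simpler than the three-color markers of RFC 2697 and RFC 2698 and is not a model of any particular chip; the names TokenBucketMeter and rate_limit_step are hypothetical, time is passed in explicitly to keep the example deterministic, and a Python set stands in for the filter table.

```python
from typing import Hashable, Set


class TokenBucketMeter:
    """Single-rate token bucket: tokens are bits, refilled at rate_bps."""

    def __init__(self, rate_bps: float, burst_bits: float) -> None:
        self.rate_bps = rate_bps
        self.burst_bits = burst_bits      # bucket depth
        self.tokens = burst_bits          # start with a full bucket
        self.last = 0.0                   # time of the previous update, in seconds

    def try_acquire(self, size_bits: int, now: float) -> bool:
        """Refill for the elapsed time, then take size_bits tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now
        if self.tokens >= size_bits:
            self.tokens -= size_bits
            return True
        return False


def rate_limit_step(meter: TokenBucketMeter, sent_keys: Set[Hashable],
                    fault_key: Hashable, size_bits: int, now: float) -> bool:
    """Return True if the message is sent up (and recorded), False if discarded."""
    if meter.try_acquire(size_bits, now):
        sent_keys.add(fault_key)          # write to the filter table only on success
        return True
    return False
```

Keeping the bucket depth close to one message length, as in this sketch's intended use, is one way to approximate the even, non-bursty sending that the text calls for; the appropriate depth in a real device is a tuning decision not specified here.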
The METER speed limit is a traffic rate limit in bps, that is, the number of bits transmitted per second, whereas the message processing speed of the control plane of a communication device is usually measured in packets per second (PPS). When configuring the METER speed limit, the control plane's packet rate (PPS) therefore has to be converted into an equivalent METER traffic rate (bps). This can be done in either of two ways:
Method one: the fault information message is set to a fixed length L for sending up; a message shorter than L is padded to L, and a message longer than L is truncated, counting from the start of the message, to length L. The METER speed limit value in the forwarding plane is then configured as C × L × 8 bps, where C is the number of messages the control plane processes per second and the factor 8 converts L from bytes to bits. The disadvantage of this method is that messages may be truncated, so part of the message information can be lost.
Method two: the fault information message is sent up with the original message information and the encapsulation information retained, and the forwarding plane uniformly uses a fixed length L as the input to the METER speed limit. The METER speed limit value in the forwarding plane is then configured as C × L × 8 bps, where C is the number of messages the control plane processes per second. This method keeps the original length of the message being sent up, because the forwarding chip acquires METER tokens according to the fixed length rather than the actual message length, so no message information is lost; it does, however, require support from the forwarding chip.
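The conversion and the two methods can be worked through in a short Python sketch. The function names are hypothetical, and the figures C = 1000 messages per second and L = 128 bytes are chosen purely for illustration; they are not values given in this document.

```python
def meter_rate_bps(c_pps: int, length_bytes: int) -> int:
    """METER speed limit value: C x L x 8, in bits per second."""
    return c_pps * length_bytes * 8


def fit_to_length(message: bytes, length_bytes: int, pad: bytes = b"\x00") -> bytes:
    """Method one: pad a short message to L, or keep only the first L bytes."""
    if len(message) >= length_bytes:
        return message[:length_bytes]
    return message + pad * (length_bytes - len(message))


def metered_size_bits(length_bytes: int) -> int:
    """Method two: the meter is charged a fixed L per message, whatever its real length."""
    return length_bytes * 8


# Illustrative figures only: a control plane handling 1000 messages per second
# and a fixed length of 128 bytes give a METER limit of 1,024,000 bps.
EXAMPLE_RATE = meter_rate_bps(1000, 128)
```

With either method every message costs exactly L × 8 bits of tokens, so a limit of C × L × 8 bps admits at most C messages per second on average, which is how a bit-rate meter ends up enforcing a packet rate.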
Step 30: the control plane obtains the fault information by parsing the fault information message, and then performs fault location and processing according to the fault information.
After receiving the message sent up by the forwarding plane, the control plane obtains the fault information by parsing the fault information message and, according to the fault information, queries whether the corresponding configuration exists on the control plane and on the forwarding plane, so as to complete fault location and processing. The forwarding plane configuration is generated and delivered by the control plane (by static configuration or by a dynamic protocol). When the communication device operates normally, the configuration generated by the control plane is correct, and the configuration received by the forwarding plane is consistent with what the control plane delivered; when the communication device fails, the configuration of the forwarding plane and that of the control plane may be inconsistent. As shown in fig. 3, faults fall mainly into the following three cases:
In the first case, the configuration exists on the control plane but not on the forwarding plane, that is, the two planes are inconsistent. The fault is judged to be that the service configuration delivery did not take effect, and the service configuration is immediately delivered again to restore the service quickly. At the same time, an alarm is reported so that logic errors, message transmission channel errors and the like in the configuration delivery process can be investigated.
In the second case, the configuration exists on the forwarding plane but not on the control plane. The fault is judged to be a service configuration residue that needs to be cleaned up; that is, an alarm is reported and the residual configuration is investigated.
In the third case, the configuration exists on neither the control plane nor the forwarding plane. The fault is preliminarily judged to be that the service has not been configured; an alarm is reported to check whether this is so, and if it is, the service is configured again.
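The three cases can be expressed as a small decision function. The sketch below is illustrative only: the names Diagnosis and diagnose are hypothetical, and the UNCLASSIFIED result for a configuration present on both planes is an added assumption, since that combination is not covered by the three cases above.

```python
from enum import Enum


class Diagnosis(Enum):
    CONFIG_NOT_EFFECTIVE = "deliver the service configuration again and report an alarm"
    CONFIG_RESIDUE = "report an alarm and clean up the residual configuration"
    NOT_CONFIGURED = "report an alarm and, if confirmed, configure the service"
    UNCLASSIFIED = "not covered by the three cases (assumption of this sketch)"


def diagnose(on_control_plane: bool, on_forwarding_plane: bool) -> Diagnosis:
    """Classify a fault from where the matching configuration is found."""
    if on_control_plane and not on_forwarding_plane:
        return Diagnosis.CONFIG_NOT_EFFECTIVE      # first case
    if on_forwarding_plane and not on_control_plane:
        return Diagnosis.CONFIG_RESIDUE            # second case
    if not on_control_plane and not on_forwarding_plane:
        return Diagnosis.NOT_CONFIGURED            # third case
    return Diagnosis.UNCLASSIFIED
```

The two boolean inputs would come from looking up the parsed fault information in each plane's configuration; how that lookup is performed is outside the scope of this sketch.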
The control plane can also use the number of lost packets to prioritize the troubleshooting of problem sites in the topology: the more packets a site has lost, the higher its troubleshooting priority, which helps locate the fault point quickly.
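A minimal sketch of this prioritization, assuming the control plane already holds a per-site count of lost packets (how those counts are collected is not described here, and the site names are invented for the example):

```python
from typing import Dict, List, Tuple


def troubleshooting_order(loss_counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Order sites by packet loss count, highest first; ties keep insertion order."""
    return sorted(loss_counts.items(), key=lambda item: item[1], reverse=True)
```

For example, counts of 12, 340 and 5 for three hypothetical sites would place the site with 340 lost packets at the head of the troubleshooting queue.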
In this fault location method, the forwarding plane exploits the fact that service messages are discarded when the device fails: useful fault information parsed from the discarded messages is encapsulated and sent up to the control plane for processing. Because the process is triggered by the discarded service flow itself, the control plane obtains the fault information without extra overhead, the cost of the control plane actively reading fault information from the forwarding plane is reduced, and no external dedicated equipment has to be introduced to handle device faults.
In addition, the forwarding plane filters out repeated messages before sending messages up, removing unnecessary processing pressure from the control plane, and the METER speed limit matches the rate at which the forwarding plane sends messages up to the rate at which the control plane processes them, reducing the control plane's packet-receiving and processing pressure. The hardware resources of the forwarding chip are fully used, and the combination of filtering and speed limiting adapts high-speed forwarding plane message processing to low-speed control plane processing, so that the control plane can locate faults and process fault information quickly and effectively.
Further, the fault location method helps restore services quickly and greatly shortens troubleshooting time. High-risk fault points in the topology and high-risk failed services in the device are located first according to the number of lost packets, which minimizes risk. Where the forwarding plane configuration did not take effect, the service can be restored quickly with less manual investigation. For services that do need manual troubleshooting, the fault point of the problem device can be located quickly from the key characteristics of the message, so that troubleshooting time is greatly shortened and the service is restored quickly.
Example 2
In the fault location method provided in embodiment 1, once a fault information message sent from the forwarding plane has been received and processed by the control plane, the corresponding entry in the filter table no longer serves any purpose. If such entries are not cleaned up in time, the filter table fills up and can no longer be used for filtering.
To solve this problem, the embodiment of the present invention introduces an aging process after the forwarding plane sends the message to the control plane. Aging clears from the filter table (filter_table) the fault information messages that have already been sent to and processed by the control plane, so that the filter_table does not lose its filtering function by becoming full. The general aging process is shown in Fig. 6:
Each time the preset period T is reached, the fault information messages in the filter table are cleared one by one; that is, the filter_table entries are cleared periodically, by indexing directly into the table and clearing entry by entry. There are two ways to perform aging cleanup. In the first, the control plane clears the corresponding entry in the forwarding plane's filter_table after it has processed a fault information message. In the second, the control plane takes no action after processing, and the forwarding plane performs the clearing. The first way increases the processing load on the control plane, so the embodiment of the present invention prefers the second way, namely aging cleanup performed by the forwarding plane.
When aging cleanup is performed by the forwarding plane, the fault information messages in the filter table can be cleared by a hardware packet sending module (Packet Generator, abbreviated PG) and a filtering module in the forwarding chip. The hardware packet sending module can generate messages and send them at a set period T; for convenience of description, a message generated and sent by the hardware packet sending module is called a reference message. The filtering module stores the filter table. The specific aging cleanup process is as follows:
(1) The hardware packet sending module is set to send N reference messages to the filtering module in each period T, where N equals the entry specification of the filter table filter_table. Each reference message carries an index value, and the index values are 1, 2, ..., N respectively;
(2) After receiving a reference message, the filtering module reads its index value M and then clears the fault information message held in entry M of the filter table. Entry M may be cleared in either of the following two ways:
First way: read the content of entry M in the filter_table and judge whether any content exists; if so, clear the entry, that is, clear the fault information message held in entry M of the filter table; if not, do nothing. This way follows standard processing logic and avoids redundant clear operations.
Second way: without reading the content of entry M or judging whether it exists, directly force-clear the fault information message held in entry M of the filter table. This way reduces the resources required for hardware reads.
The interval between two successive bursts of N reference messages from the hardware packet sending module is set to T. Within every period T, the filtering module in the forwarding chip therefore receives N reference messages from the hardware packet sending module and clears the corresponding index entry on receipt of each one.
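The aging cleanup described above can be illustrated with a minimal Python sketch. It is a software model only, not the forwarding-chip implementation; the names `FilterTable`, `clear_checked`, `clear_forced`, `reference_messages`, and `age_once` are hypothetical and do not appear in the patent.

```python
from typing import List, Optional


class FilterTable:
    """Hypothetical software model of filter_table with entries indexed 1..N."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.entries: List[Optional[str]] = [None] * size

    def clear_checked(self, index: int) -> bool:
        """First way: read entry M and clear it only if it holds content."""
        if self.entries[index - 1] is None:
            return False  # nothing stored, so no clear operation is issued
        self.entries[index - 1] = None
        return True

    def clear_forced(self, index: int) -> None:
        """Second way: clear entry M without reading it first."""
        self.entries[index - 1] = None


def reference_messages(n: int) -> range:
    """Models the packet generator: index values 1..N sent in one period T."""
    return range(1, n + 1)


def age_once(table: FilterTable, forced: bool = True) -> int:
    """Run one aging pass and return the number of reference messages handled."""
    handled = 0
    for m in reference_messages(table.size):
        if forced:
            table.clear_forced(m)
        else:
            table.clear_checked(m)
        handled += 1
    return handled
```

Calling `age_once` once per period T leaves every entry empty, whichever clearing way is selected; the two ways differ only in whether a read precedes the clear.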
Further, the following problem may also arise during this process: the forwarding plane sends a fault information message up to the control plane, but the control plane fails to process it, for example because of a burst of other tasks. The forwarding plane has nevertheless already added the message to the filter_table, so the same fault information message is filtered out from then on, can no longer be sent up and processed, and the corresponding fault is never handled.
In order to solve the above problem and ensure that the packet loss information can be uploaded to the control plane again for processing, so as to process the corresponding fault in time, in the embodiment of the present invention, a retransmission process is also introduced after the forwarding plane sends the message to the control plane. The general procedure for retransmission can be seen in fig. 6:
Each time the preset period T is reached, the forwarding plane clears the fault information messages in the filter table one by one and, once clearing is complete, obtains new discarded messages for analysis, thereby starting the fault location and processing of the next period; that is, the whole process from step 10 to step 30 is executed again on the discarded messages of the next period. The aging cleanup operation is thus part of the retransmission process: each time the preset period T is reached, the entries in the filter table are first aged and cleaned, and the new discarded messages are then analyzed for the next period's fault location processing.
A fault information message that was sent up in period T_n but not processed by the control plane leaves the corresponding fault in place. In the next period T_{n+1}, the corresponding service message still cannot find its forwarding configuration information and is discarded, producing a discarded message whose fault information is identical to that of the message sent up but not processed in period T_n. Because the filter table has been cleaned by then, the forwarding plane can analyze and encapsulate the discarded message again in period T_{n+1} and send it to the control plane for processing. In other words, fault information not processed by the control plane in period T_n continues to be processed in period T_{n+1} or a later period, which ensures that no fault information is missed.
In addition, a fault information message that fails to obtain a METER speed-limit token during the speed limit processing in step 20 is discarded directly. It is not sent up in that period, so the corresponding fault remains. A corresponding discarded message is likewise generated in the next period and, with the retransmission process in place, can be sent up and processed in the next period or a later one, so that nothing slips through.
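The effect of clearing the filter table at each period boundary can be sketched with a small Python simulation. This is an illustrative model under simplifying assumptions: the function name `run_periods` is hypothetical, the filter table is modeled as a set, each unresolved fault is assumed to produce three discarded messages per period, and a control plane that is busy for a whole period processes nothing in that period.

```python
from typing import Dict, Iterable, List


def run_periods(active_faults: Iterable[str], control_busy: List[bool]) -> Dict[str, int]:
    """Return, for each fault key, the period index in which it was processed.

    control_busy[i] is True when the control plane handles nothing in period i.
    """
    filter_table = set()
    fixed_in: Dict[str, int] = {}
    faults = set(active_faults)
    for period, busy in enumerate(control_busy):
        # Aging: the forwarding plane clears the filter table each period.
        filter_table.clear()
        for key in sorted(faults):
            # An unresolved fault keeps producing discarded messages
            # (three per period is an arbitrary illustrative count).
            for _ in range(3):
                if key in filter_table:
                    continue  # duplicate within the period is filtered out
                filter_table.add(key)  # sent up and recorded
                if not busy:
                    fixed_in[key] = period
        # Faults processed by the control plane stop producing drops.
        faults -= set(fixed_in)
    return fixed_in
```

For example, a fault sent up while the control plane is busy in period 0 is sent up again and processed in period 1, because the entry that would have filtered it has been cleared. A message dropped by the speed limit behaves the same way, since it was never recorded or processed.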
After aging and retransmission are introduced, the fault location method provided by the embodiment of the present invention broadly comprises the following stages: discarded message analysis, duplicate message filtering, METER speed limiting to bridge the speed difference between the forwarding plane and the control plane, periodic cleanup, and repeated uploading. For the specific processes of discarded message analysis, duplicate message filtering, and METER speed limiting, refer to embodiment 1; they are not described again here.
With the aging and retransmission introduced in the embodiment of the present invention, fault information that was discarded during uploading on the forwarding plane, or that was uploaded but not processed, continues to be uploaded and processed in subsequent periods, so no fault information is missed. At the same time, the filter table no longer needs an excessively large entry specification, because entries that no longer need processing are cleaned up periodically. In addition, aging performed by the forwarding plane is fast and cheap, and does not increase the load on the control plane processor.
Example 3
On the basis of the service flow-based fault location methods provided in the foregoing embodiments 1 and 2, the present invention also provides a service flow-based fault location device that can be used to implement those methods. As shown in Fig. 7, the fault location device mainly includes a fault information analyzing and encapsulating module, a filtering module, a speed limiting module, and a hardware packet sending module. All of these modules are located on the forwarding plane and can be implemented by a forwarding chip.
The fault information analyzing and encapsulating module, which may also be called the fault information processing module, is mainly used to process fault information. Specifically, it analyzes a user's discarded message to obtain the fault information, encapsulates the fault information in a preset format to form a fault information message, and then passes the encapsulated fault information message to the filtering module; that is, it executes the method described in step 10 of embodiment 1, which is not described again here.
The filtering module and the speed limiting module are collectively called the fault information uploading processing module, which uploads the processed fault information message to the control plane. Specifically:
After receiving a fault information message, the filtering module searches the filter table filter_table using the fault information as the key and determines whether a corresponding fault information message entry exists. If no such entry exists, the fault information message is passed to the speed limiting module; if one exists, the fault information message is discarded. That is, the module executes the filtering process described in step 20 of embodiment 1, which is not described again here.
After receiving a fault information message, the speed limiting module submits it to the METER token bucket and judges whether it can obtain a METER speed-limit token. If a token is obtained, the fault information message is sent to the control plane and written into the filter table in the filtering module; if not, the fault information message is discarded. That is, the module executes the speed limiting process described in step 20 of embodiment 1, which is not described again here.
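The cooperation of the filtering module and the speed limiting module can be sketched in Python as follows. This is a simplified software model, not the chip implementation. The speed limit value C × L × 8 bps is taken from the claims; everything else is an assumption made for illustration: the class names `Meter` and `Uploader`, the use of a set in place of the fixed-size filter table, L being measured in bytes (suggested by the factor 8), a bucket depth of one second of traffic, and timestamps passed in explicitly so the behavior is deterministic.

```python
class Meter:
    """Token bucket counted in bits, with rate C * L * 8 bps."""

    def __init__(self, msgs_per_sec: int, msg_len_bytes: int) -> None:
        self.rate_bps = msgs_per_sec * msg_len_bytes * 8
        self.cost_bits = msg_len_bytes * 8  # every message counts as length L
        self.burst_bits = self.rate_bps     # assumed depth: one second of traffic
        self.tokens = float(self.burst_bits)
        self.last = 0.0

    def take(self, now: float) -> bool:
        """Refill for the elapsed time, then try to pay for one message."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        if self.tokens >= self.cost_bits:
            self.tokens -= self.cost_bits
            return True
        return False


class Uploader:
    """Filter first, then speed limit, then send up and record in the filter table."""

    def __init__(self, meter: Meter) -> None:
        self.meter = meter
        self.filter_table = set()  # stands in for filter_table
        self.sent = []             # messages delivered to the control plane

    def handle(self, fault_key: str, now: float) -> str:
        if fault_key in self.filter_table:
            return "filtered"      # already sent up, so discard
        if not self.meter.take(now):
            return "rate_limited"  # no token, so discard without recording
        self.sent.append(fault_key)
        self.filter_table.add(fault_key)
        return "sent"
```

Note that a message is written into the filter table only after it obtains a token, so a message dropped by the speed limit is not filtered when the same fault recurs.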
Further, in a preferred embodiment, the fault location device further includes a hardware packet sending module, which is also disposed on the forwarding plane, and may be implemented by a forwarding chip.
The hardware packet sending module can generate reference messages and send them at a set period T. The hardware packet sending module is set to send N reference messages to the filtering module in each period T, each carrying an index value, the index values being 1, 2, ..., N respectively. After receiving a reference message, the filtering module reads its index value M and then clears the fault information message held in entry M of the filter table.
The fault location device provided by the embodiment of the present invention has the following advantages. On the one hand, the forwarding plane can directly use the discarded service messages as its source, analyze them to obtain the fault information, encapsulate that information, and upload it to the control plane, so that the control plane acquires the fault information without additional equipment or overhead. On the other hand, the forwarding plane filters and speed-limits the messages before sending them up, adapting high-speed forwarding plane message processing to low-speed control plane processing. This effectively resolves the speed mismatch between the two planes and ensures that the control plane can locate faults and process fault information quickly and effectively.
Furthermore, the PG module removes the need for an excessively large filter table entry specification by periodically cleaning up entries that no longer need processing. It also ensures that fault information that was discarded during uploading on the forwarding plane, or that was sent up but not processed, continues to be sent up and processed in subsequent periods, so that no fault information is missed.
Example 4
On the basis of the service flow-based fault location methods provided in the foregoing embodiments 1 and 2, the present invention further provides a service flow-based fault location apparatus for implementing those methods. Fig. 8 shows the architecture of the apparatus of this embodiment. The service flow-based fault location apparatus of this embodiment includes one or more processors 21 and a memory 22. In Fig. 8, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, and fig. 8 illustrates the connection by a bus as an example.
The memory 22, which is a non-volatile computer-readable storage medium for the service flow-based fault location method, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the service flow-based fault location method in embodiment 1. The processor 21 executes various functional applications and data processing of the traffic flow based fault location device by running the nonvolatile software program, instructions and modules stored in the memory 22, that is, implements the traffic flow based fault location methods of embodiments 1 and 2.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the traffic flow based fault location method of embodiment 1 described above, for example, perform the steps shown in fig. 2, fig. 3, fig. 5, and fig. 6 described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by a program instructing the associated hardware. The program may be stored on a computer-readable storage medium, which may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A fault positioning method based on service flow is characterized by comprising the following steps:
the forwarding plane obtains fault information by analyzing the discarded messages, and encapsulates the fault information according to a preset format to form a fault information message;
after filtering and speed limiting processing are carried out on the fault information message by the forwarding plane, the fault information message is sent to the control plane;
and the control plane obtains the fault information by analyzing the fault information message, and then carries out fault positioning and processing according to the fault information.
2. The method according to claim 1, wherein the communication device initializes a filter table after being powered on, for recording the fault information packet that has been uploaded to the control plane; the filtering process of the forwarding plane for the fault information packet specifically includes:
searching a filter table by using the fault information as a keyword, and determining whether a corresponding fault information message exists in the filter table;
if the corresponding fault information message does not exist in the filter table, continuing to perform speed limit processing on the fault information message;
and if the corresponding fault information message exists in the filter table, discarding the fault information message.
3. The method according to claim 2, wherein the speed limit processing procedure of the forwarding plane for the fault information packet specifically comprises:
configuring METER speed limit in a forwarding plane according to the message processing speed of a control plane;
if the fault information message acquires a METER speed-limiting token, the fault information message is sent to a control plane, and the fault information message is written into the filter table;
and if the fault information message does not acquire the METER speed-limiting token, discarding the fault information message.
4. The method according to claim 3, wherein the configuring the METER speed limit in the forwarding plane according to the packet processing speed of the control plane specifically comprises:
setting the fault information message to a fixed length L for uploading; when the message is shorter than L, it is padded to L, and when it is longer than L, it is truncated to length L measured from the start of the message;
configuring the METER speed limit value in the forwarding plane to be C × L × 8, in bps, where C is the number of messages processed by the control plane per second.
5. The method according to claim 3, wherein the configuring the METER speed limit in the forwarding plane according to the packet processing speed of the control plane specifically comprises:
the fault information message is uploaded retaining the original message information and the encapsulation information, and the forwarding plane uniformly uses a fixed length L as the input to the METER speed limit;
configuring the METER speed limit value in the forwarding plane to be C × L × 8, in bps, where C is the number of messages processed by the control plane per second.
6. The traffic-flow based fault location method of claim 2, wherein after the forwarding plane sends the fault information message to the control plane, the method further comprises:
and when the preset period T is reached, the forwarding plane clears the fault information messages in the filter table one by one, and obtains new discarded messages again for analysis, so that the fault positioning and processing process of the next period is started.
7. The method of claim 6, wherein the table entry specification of the filter table is N, and the filter table is stored in a filter module; the clearing of the fault information message in the filter table is realized through a hardware packet sending module and a filter module, and specifically comprises the following steps:
the hardware packet sending module sends N reference messages to the filtering module in each period T, and each sent reference message has an index value, the index values being 1, 2, ..., N respectively;
and after receiving a reference message, the filtering module reads the index value M of the reference message, and further clears the fault information message corresponding to the index M entry in the filtering table.
8. The method for locating a fault based on a service flow according to any one of claims 1 to 7, wherein a control plane obtains the fault information by analyzing the fault information packet, and further performs fault location and processing according to the fault information, specifically:
the control plane obtains the fault information by analyzing the fault information message, and queries, according to the fault information, whether the corresponding service configuration exists on the control plane and on the forwarding plane;
if the configuration exists on the control plane but not on the forwarding plane, the fault is judged to be a service configuration that has not taken effect, and the service configuration is issued again;
if the configuration exists on the forwarding plane but not on the control plane, the fault is judged to be residual service configuration, and the residual service configuration is cleaned up;
if the configuration exists on neither the control plane nor the forwarding plane, the fault is judged to be a service that has not been configured, and the service configuration is performed.
9. The method according to any one of claims 1 to 7, wherein the fault information includes one or more of the following for the faulty service: the device port, the source and destination MAC addresses, the source and destination IP addresses plus protocol number, and the source and destination port numbers.
10. A traffic-based fault location apparatus, comprising at least one processor and a memory, wherein the at least one processor and the memory are connected by a data bus, and the memory stores instructions executable by the at least one processor, and the instructions are configured to perform the traffic-based fault location method according to any one of claims 1 to 9 after being executed by the processor.
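For illustration only, the control plane decision logic recited in claim 8 can be written as a small Python function. The names `Diagnosis` and `diagnose` are hypothetical, and the fourth outcome, where the configuration exists on both planes, is not covered by the claim and is included here only as an assumption so that the function is total.

```python
from enum import Enum


class Diagnosis(Enum):
    CONFIG_NOT_EFFECTIVE = "issue the service configuration again"
    CONFIG_RESIDUE = "clean up the residual service configuration"
    NOT_CONFIGURED = "perform the service configuration"
    CONSISTENT = "no configuration mismatch found"  # assumption, not in claim 8


def diagnose(in_control_plane: bool, in_forwarding_plane: bool) -> Diagnosis:
    """Classify a fault from whether the configuration exists on each plane."""
    if in_control_plane and not in_forwarding_plane:
        return Diagnosis.CONFIG_NOT_EFFECTIVE
    if in_forwarding_plane and not in_control_plane:
        return Diagnosis.CONFIG_RESIDUE
    if not in_control_plane and not in_forwarding_plane:
        return Diagnosis.NOT_CONFIGURED
    return Diagnosis.CONSISTENT
```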
CN202010857668.3A 2020-08-24 2020-08-24 Fault positioning method and device based on service flow Active CN112118129B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857668.3A CN112118129B (en) 2020-08-24 2020-08-24 Fault positioning method and device based on service flow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857668.3A CN112118129B (en) 2020-08-24 2020-08-24 Fault positioning method and device based on service flow

Publications (2)

Publication Number Publication Date
CN112118129A true CN112118129A (en) 2020-12-22
CN112118129B CN112118129B (en) 2022-08-12

Family

ID=73805587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857668.3A Active CN112118129B (en) 2020-08-24 2020-08-24 Fault positioning method and device based on service flow

Country Status (1)

Country Link
CN (1) CN112118129B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104735000A (en) * 2013-12-23 2015-06-24 中兴通讯股份有限公司 OpenFlow signaling control method and device
US20150309894A1 (en) * 2014-04-29 2015-10-29 Cisco Technology, Inc. Fast Failover for Application Performance Based WAN Path Optimization with Multiple Border Routers
WO2015180113A1 (en) * 2014-05-30 2015-12-03 华为技术有限公司 Network address translation method and apparatus
CN108270690A (en) * 2016-12-30 2018-07-10 北京华为数字技术有限公司 The method and apparatus for controlling message flow
CN109286594A (en) * 2017-07-19 2019-01-29 中兴通讯股份有限公司 The processing method and processing device of address analysis protocol message

Also Published As

Publication number Publication date
CN112118129B (en) 2022-08-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant