CN107332709A - Fault locating method and device - Google Patents

Publication number
CN107332709A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710632889.9A
Other languages
Chinese (zh)
Other versions
CN107332709B (en)
Inventor
费志军
谢亮
华锦芝
何朔
尹亚伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Application filed by China Unionpay Co Ltd
Priority to CN201710632889.9A
Publication of CN107332709A
Application granted
Publication of CN107332709B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults

Abstract

Embodiments of the present invention relate to the technical field of data processing, and in particular to a fault locating method and device for quickly and precisely locating faults in a server cluster system. In the embodiments, a central node determines M detection messages according to the transaction to be detected and sends the M detection messages to the nodes to be detected; the transmission paths of the M detection messages cover all nodes to be detected. For each of the M detection messages, the central node determines, according to the feedback message corresponding to that detection message, which detection messages have an abnormal node on their path; the nodes to be detected on the transmission path of such a detection message are primary-screening nodes. The central node determines N detection messages according to the primary-screening nodes and sends the N detection messages to the primary-screening nodes; the transmission paths of the N detection messages cover all transmission paths of the primary-screening nodes. For each of the N detection messages, the central node determines the abnormal node according to the corresponding feedback message.

Description

Fault locating method and device
Technical field
The present invention relates to the technical field of data processing, and in particular to a fault locating method and device.
Background
There are already many reports of research on fault location technology. Several typical fault location techniques are:
1. Manual testing method
In this method, when the communication network fails, experienced staff use test instruments to determine the specific location of the fault from the fault alarm signals in the network.
2. Channel correlation analysis method
This method uses the correlation of service channels. When the communication network fails, alarm information is passed to the upstream network element of the service channel through the forward alarm signal (FDI) and the backward alarm signal (BDI), and the upstream network element arbitrates on this alarm information to determine the specific location of the fault.
3. Central control network element analysis method
This method sets up a master control network element in the communication network. When the network fails, the alarm indication signals of all other network elements are sent to the master control network element, which determines the specific location of the fault from these alarm indication signals and from pre-stored service and topology information about the network.
4. Case analysis method
In this method, faults that have occurred are first organized into cases, and a fault location information base is built. When the communication network fails, the method locates the fault by searching the information base for the fault identifier that matches the fault alarm signal.
Among these typical fault location techniques, the manual testing method cannot protect affected services in real time or remove faults promptly, so it is unsuitable for large computer cluster systems. The channel correlation analysis method and the central control network element analysis method both require a large number of fault alarm signals to be passed between network elements as the basis on which the upstream or master control network element decides. This lengthens fault location time, and the heavy computation makes both methods hard to apply in communication networks such as those with distributed control. The case analysis method locates faults accurately, but it cannot locate fault types that are absent from the case base, so it cannot be used in some special situations. A method that can locate faults quickly and accurately is therefore needed.
Summary of the invention
The present application provides a fault locating method and device for quickly and precisely locating faults in a server cluster system.
A fault locating method provided by an embodiment of the present invention includes:
A central node determines M detection messages according to the transaction to be detected, and sends the M detection messages to the nodes to be detected; the transmission paths of the M detection messages cover all nodes to be detected;
For each of the M detection messages, the central node determines, according to the feedback message corresponding to the detection message, the detection messages whose transmission path contains an abnormal node; the feedback message is obtained from the operation result of the nodes to be detected that correspond to the detection message; the nodes to be detected on the transmission path of a detection message whose path contains an abnormal node are primary-screening nodes;
The central node determines N detection messages according to the primary-screening nodes, and sends the N detection messages to the primary-screening nodes; the transmission paths of the N detection messages cover all transmission paths of the primary-screening nodes;
For each of the N detection messages, the central node determines the abnormal node according to the feedback message corresponding to the detection message.
Optionally, before the central node determines the M detection messages according to the transaction to be detected, the method further includes:
The central node divides all nodes to be detected into different clusters based on transaction type; the nodes to be detected in the same cluster process messages of the same transaction type;
The central node sending the M detection messages to the nodes to be detected includes:
For one cluster, the central node sends the M detection messages to the nodes to be detected in that cluster;
The central node sending the N detection messages to the primary-screening nodes includes:
For one cluster, the central node sends the N detection messages to the primary-screening nodes in that cluster.
Optionally, after the central node determines the abnormal node according to the feedback message corresponding to the detection message, the method further includes:
Setting the state of the abnormal node to isolated.
Optionally, after the abnormal node is isolated, the method further includes:
The central node sends a detection message to the abnormal node and, if the feedback message corresponding to the detection message shows that the abnormal node has recovered, releases the isolation of the abnormal node.
Optionally, the central node determining, according to the feedback message corresponding to the detection message, the detection messages whose path contains an abnormal node includes:
For any of the M detection messages, if the central node does not receive the feedback message corresponding to the detection message within a threshold time, or the feedback message corresponding to the detection message is an error message, it is determined that an abnormal node exists.
Optionally, the central node determining the abnormal node according to the feedback message corresponding to the detection message includes:
For the feedback messages corresponding to all detection messages that pass through a first node to be detected, if the central node does not receive the feedback messages within the threshold time, or the feedback messages are error messages, the first node to be detected is determined to be an abnormal node.
A fault locating device, including:
A first sending module, configured to determine M detection messages according to the transaction to be detected and to send the M detection messages to the nodes to be detected; the transmission paths of the M detection messages cover all nodes to be detected;
A determining module, configured to determine, for each of the M detection messages and according to the feedback message corresponding to the detection message, the detection messages whose transmission path contains an abnormal node; the feedback message is obtained from the operation result of the nodes to be detected that correspond to the detection message; the nodes to be detected on the transmission path of a detection message whose path contains an abnormal node are primary-screening nodes;
A second sending module, configured to determine N detection messages according to the primary-screening nodes and to send the N detection messages to the primary-screening nodes; the transmission paths of the N detection messages cover all transmission paths of the primary-screening nodes;
A locating module, configured to determine, for each of the N detection messages, the abnormal node according to the feedback message corresponding to the detection message.
Optionally, the device further includes a dividing module, configured to divide all nodes to be detected into different clusters based on transaction type; the nodes to be detected in the same cluster process messages of the same transaction type;
The first sending module is specifically configured to send, for one cluster, the M detection messages to the nodes to be detected in that cluster;
The second sending module is specifically configured to send, for one cluster, the N detection messages to the primary-screening nodes in that cluster.
Optionally, the device further includes a processing module, configured to:
Set the state of the abnormal node to isolated.
Optionally, the processing module is further configured to:
Send a detection message to the abnormal node and, if the feedback message corresponding to the detection message shows that the abnormal node has recovered, release the isolation of the abnormal node.
Optionally, the determining module is specifically configured to:
For any of the M detection messages, determine that an abnormal node exists if the feedback message corresponding to the detection message is not received within a threshold time, or if the feedback message corresponding to the detection message is an error message.
Optionally, the locating module is specifically configured to:
For the feedback messages corresponding to all detection messages that pass through a first node to be detected, determine that the first node to be detected is an abnormal node if the feedback messages are not received within the threshold time, or if the feedback messages are error messages.
In the embodiments of the present invention, when a node among the nodes to be detected fails, or when all nodes to be detected need a maintenance check, the central node determines M detection messages according to the transaction to be detected and sends them to the nodes to be detected. The transmission paths of these M detection messages cover all nodes to be detected. After sending the detection messages, the central node receives the feedback message corresponding to each detection message; the feedback message is returned by the nodes to be detected that received the detection message, according to their operation result. From the feedback messages received, the central node determines which detection messages have an abnormal node on their path, and takes the nodes to be detected on the transmission paths of those detection messages as primary-screening nodes; the abnormal node is then known to be among the primary-screening nodes. The central node determines N detection messages according to the primary-screening nodes and sends them to the primary-screening nodes; the transmission paths of these N detection messages cover all transmission paths of the primary-screening nodes. Finally, the central node determines the abnormal node according to the feedback messages corresponding to the N detection messages. By combining full coverage of the nodes to be detected with full coverage of the transmission paths, the embodiments achieve fast and precise fault location in a server cluster system.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a system architecture to which embodiments of the present invention are applicable;
Fig. 2 is a schematic flowchart of a fault locating method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a fault locating method in a specific embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a fault locating device provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another fault locating device provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, a system architecture to which embodiments of the present invention are applicable includes a central node and nodes to be detected. The central node and the nodes to be detected may be network devices such as computers, and the nodes to be detected may be a system made up of multiple servers or processes. Preferably, the central node and the nodes to be detected may use cloud computing technology for information processing.
The central node calculates the transmission paths of the detection messages and sends the detection messages to the nodes to be detected. By recording and analyzing the transaction results fed back by the nodes to be detected, it judges the state of each node to be detected, sends isolation instructions to abnormal nodes, and sends release-isolation instructions to abnormal nodes that have recovered. A node to be detected receives the detection messages sent by the central node, forwards the data packets along the transmission path, and feeds its operation result back to the central node, so that the central node can determine the state of each node to be detected from the operation results.
The central node may communicate with the nodes to be detected over the Internet, or through a mobile communication system such as the Global System for Mobile Communications (GSM) or a Long Term Evolution (LTE) system. The nodes to be detected may likewise communicate with one another over the Internet, or through a mobile communication system such as GSM or LTE.
Fig. 2 shows a schematic flowchart of a fault locating method provided by an embodiment of the present invention. As shown in Fig. 2, the fault locating method comprises the following steps:
Step 201: the central node determines M detection messages according to the transaction to be detected, and sends the M detection messages to the nodes to be detected; the transmission paths of the M detection messages cover all nodes to be detected.
Step 202: for each of the M detection messages, the central node determines, according to the feedback message corresponding to the detection message, the detection messages whose transmission path contains an abnormal node; the feedback message is obtained from the operation result of the nodes to be detected that correspond to the detection message; the nodes to be detected on the transmission path of a detection message whose path contains an abnormal node are primary-screening nodes.
Step 203: the central node determines N detection messages according to the primary-screening nodes, and sends the N detection messages to the primary-screening nodes; the transmission paths of the N detection messages cover all transmission paths of the primary-screening nodes.
Step 204: for each of the N detection messages, the central node determines the abnormal node according to the feedback message corresponding to the detection message.
In the embodiments of the present invention, when a node among the nodes to be detected fails, or when all nodes to be detected need a maintenance check, the central node determines M detection messages according to the transaction to be detected and sends them to the nodes to be detected. The transmission paths of these M detection messages cover all nodes to be detected. After sending the detection messages, the central node receives the feedback message corresponding to each detection message; the feedback message is returned by the nodes to be detected that received the detection message, according to their operation result. From the feedback messages received, the central node determines which detection messages have an abnormal node on their path, and takes the nodes to be detected on the transmission paths of those detection messages as primary-screening nodes; the abnormal node is then known to be among the primary-screening nodes. The central node determines N detection messages according to the primary-screening nodes and sends them to the primary-screening nodes; the transmission paths of these N detection messages cover all transmission paths of the primary-screening nodes. Finally, the central node determines the abnormal node according to the feedback messages corresponding to the N detection messages. By combining full coverage of the nodes to be detected with full coverage of the transmission paths, the embodiments achieve fast and precise fault location in a server cluster system.
The detection messages in the embodiments of the present invention are obtained by simulating normal transaction messages. The transmission path of a normal transaction message is random, so it cannot guarantee the accuracy and timeliness of fault detection; in addition, the success rate of normal transaction messages must be protected. Normal transaction messages are therefore unsuitable for the fault detection described here. The content of a simulated detection message can be identical to that of a normal transaction message, so it can stand in for normal transaction messages when locating faults in the node system to be detected.
In general, a transaction message is processed cooperatively by several server nodes: each server node completes one link of the transaction message in turn, and several server nodes are able to handle the same link. The server nodes can therefore be divided into layers according to the order in which they process a transaction message, with the server nodes of each layer completing one link. For example, the node system to be detected has 3 layers, A, B and C, as shown in Fig. 3. Layer A has 3 nodes to be detected, A1, A2 and A3; layer B has 3 nodes to be detected, B1, B2 and B3; layer C has 2 nodes to be detected, C1 and C2. Any node in layer A can handle the first link of a transaction message: any of A1, A2 and A3 receives the detection message sent by the central node, processes it, and sends it on to a node in layer B. Any of B1, B2 and B3 receives the detection message from layer A, processes it, and sends it on to a node in layer C. After C1 or C2 receives the detection message and processes it successfully, the result is fed back to the central node along the original route. When the central node receives the feedback message normally, it can determine from the feedback message that all 3 nodes on the transmission path of the detection message are in the normal state.
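The layered processing just described can be sketched in a few lines. This is a minimal simulation under stated assumptions: the names `LAYERS`, `is_valid_path` and `run_probe`, and the string results "ok" and "error", are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Set

# Layers from the example: every node in a layer handles the same link.
LAYERS: Dict[str, List[str]] = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3"],
    "C": ["C1", "C2"],
}


def is_valid_path(path: List[str]) -> bool:
    """A path must pick exactly one node from each layer, in layer order."""
    return len(path) == len(LAYERS) and all(
        node in nodes for node, nodes in zip(path, LAYERS.values())
    )


def run_probe(path: List[str], faulty: Set[str]) -> str:
    """Simulate one detection message travelling along `path`.

    Returns "ok" if every node processes the message and "error" as soon
    as a faulty node is reached. A real node might instead never answer,
    which the central node would observe as a timeout.
    """
    for node in path:
        if node in faulty:
            return "error"
    return "ok"


healthy = run_probe(["A1", "B1", "C1"], faulty=set())
broken = run_probe(["A2", "B2", "C2"], faulty={"B2"})
print(healthy, broken)  # ok error
```

A single "ok" result vouches for all three nodes on the path at once, which is why one message per path is enough in the healthy case.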
In the embodiments of the present invention, the central node sends detection messages to the node system to be detected in two stages. In the first stage the detection messages are sent so as to cover all nodes to be detected, and the feedback from the nodes to be detected shows whether the node system contains a faulty node. If a faulty node is found to exist, the central node sends the second-stage detection messages. In the second stage the detection messages are sent so as to cover all transmission paths, and the feedback determines the specific location of the faulty node. Both sending methods are described in detail below, again using Fig. 3 as the example.
First, covering all nodes to be detected. For the nodes to be detected in Fig. 3, the central node needs to send 3 detection messages in the first stage. The transmission path of the first detection message is A1-B1-C1, that of the second is A2-B2-C2, and that of the third is A3-B3-C1. The transmission paths of the 3 detection messages thus cover all nodes to be detected, that is, M in step 201 is 3. The central node may send these 3 detection messages simultaneously or one after another.
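One simple way to build a node-covering set of paths is a round-robin over the layers. The sketch below is an assumption about how such a plan could be generated, not the patent's own algorithm; the function name `covering_paths` is hypothetical. For the Fig. 3 example it happens to reproduce the three paths listed in the text.

```python
from typing import List


def covering_paths(layers: List[List[str]]) -> List[List[str]]:
    """Return M paths (one node per layer) that together visit every node.

    M is the size of the largest layer; smaller layers are reused
    round-robin. This is one simple choice among many possible ones.
    """
    m = max(len(layer) for layer in layers)
    return [[layer[i % len(layer)] for layer in layers] for i in range(m)]


layers = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2"]]
plan = covering_paths(layers)
for path in plan:
    print("-".join(path))
# A1-B1-C1
# A2-B2-C2
# A3-B3-C1
```

Because M equals the size of the largest layer, no plan that uses one node per layer per message can cover all nodes with fewer messages.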
The central node receives the feedback messages corresponding to these 3 detection messages and, from how the feedback messages are received, determines which detection messages have an abnormal node on their path.
The central node determining, according to the feedback message corresponding to the detection message, the detection messages whose path contains an abnormal node includes:
For any of the M detection messages, if the central node does not receive the feedback message corresponding to the detection message within a threshold time, or the feedback message corresponding to the detection message is an error message, it is determined that an abnormal node exists.
For the 3 detection messages above, if the central node does not receive the feedback message for any one of them within the threshold time, or the feedback message for any one of them is an error message, it is determined that the node system to be detected contains an abnormal node. For example, if the feedback message corresponding to the second detection message is not received within the threshold time, or is an error message, it is determined that an abnormal node exists among the nodes to be detected and that it lies on the transmission path of the second detection message. The nodes A2, B2 and C2 on the transmission path of the second detection message are then taken as primary-screening nodes.
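The first-stage decision rule above can be sketched as follows. This is an illustrative sketch only: modelling a timeout as `None` and an error message as the string "error" is an assumption, and the names `is_abnormal` and `primary_screening` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Path = Tuple[str, ...]


def is_abnormal(feedback: Optional[str]) -> bool:
    """None models 'no feedback within the threshold time'."""
    return feedback is None or feedback == "error"


def primary_screening(feedbacks: Dict[Path, Optional[str]]) -> Set[str]:
    """Collect every node on a path whose detection message failed."""
    suspects: Set[str] = set()
    for path, feedback in feedbacks.items():
        if is_abnormal(feedback):
            suspects.update(path)
    return suspects


feedbacks = {
    ("A1", "B1", "C1"): "ok",
    ("A2", "B2", "C2"): None,  # second message timed out
    ("A3", "B3", "C1"): "ok",
}
suspects = primary_screening(feedbacks)
print(sorted(suspects))  # ['A2', 'B2', 'C2']
```

At this point the central node only knows that at least one of the suspects is abnormal; narrowing it down is the job of the second stage.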
The central node then determines, from the three primary-screening nodes A2, B2 and C2, the number and the transmission paths of the second-stage detection messages. The transmission paths of the second-stage detection messages must cover all transmission paths of the primary-screening nodes. For primary-screening node A2, the possible transmission paths are A2-B1-C1, A2-B2-C1, A2-B3-C1, A2-B1-C2, A2-B2-C2 and A2-B3-C2. For primary-screening node B2, the possible transmission paths are A1-B2-C1, A3-B2-C1, A1-B2-C2 and A3-B2-C2. For primary-screening node C2, the possible transmission paths are A1-B1-C2, A2-B1-C2, A3-B1-C2, A1-B2-C2, A3-B2-C2, A1-B3-C2, A2-B3-C2 and A3-B3-C2. The number of second-stage detection messages is therefore the product of the node counts of the layers, 3*3*2=18; that is, N in step 203 is 18, and the central node needs to send a total of 18 detection messages to the node system to be detected.
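The path count can be checked by enumerating one node per layer. The sketch below uses Python's standard `itertools.product`; the helper `paths_through` is a hypothetical name introduced for illustration.

```python
from itertools import product
from typing import List, Tuple

layers = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2"]]

# Every combination of one node per layer is one possible transmission path.
all_paths: List[Tuple[str, ...]] = list(product(*layers))


def paths_through(node: str) -> List[Tuple[str, ...]]:
    """All transmission paths that pass through the given node."""
    return [p for p in all_paths if node in p]


print(len(all_paths))            # 18
print(len(paths_through("A2")))  # 6
```

The count of 6 for A2 matches the six A2 paths listed in the text, and 18 matches the product 3*3*2.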
For the feedback messages of the second-stage detection messages, if the feedback messages of all transmission paths that involve a given node to be detected are abnormal, that node is determined to be an abnormal node.
The central node determining the abnormal node according to the feedback message corresponding to the detection message includes:
For the feedback messages corresponding to all detection messages that pass through a first node to be detected, if the central node does not receive the feedback messages within the threshold time, or the feedback messages are error messages, the first node to be detected is determined to be an abnormal node.
In the first stage, a single detection message with an abnormal feedback message is enough to establish that the node system to be detected contains an abnormal node. In the second stage, an abnormal node prevents every detection message whose path includes it from being processed normally, so several feedback messages are abnormal. The central node can therefore identify the detection messages whose feedback is abnormal and, from the nodes on their transmission paths, locate the abnormal node precisely. For example, if the feedback messages for the detection messages on transmission paths A1-B2-C1, A2-B2-C1, A3-B2-C1, A1-B2-C2, A2-B2-C2 and A3-B2-C2 are all abnormal, the central node can determine that node B2 is the abnormal node.
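The second-stage rule, that a node is abnormal when every probed path through it fails, can be sketched as below. The function name `locate_abnormal` and the set-of-failed-paths representation are assumptions made for illustration.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Path = Tuple[str, ...]


def locate_abnormal(paths: Iterable[Path], failed: Set[Path]) -> Set[str]:
    """Return the nodes for which every probed path through them failed."""
    probed = list(paths)
    nodes = {node for path in probed for node in path}
    return {
        node
        for node in nodes
        if all(path in failed for path in probed if node in path)
    }


layers = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2"]]
all_paths = list(product(*layers))

# Reproduce the example: exactly the six paths through B2 fail.
failed = {p for p in all_paths if "B2" in p}
abnormal = locate_abnormal(all_paths, failed)
print(abnormal)  # {'B2'}
```

A1, for instance, is cleared because A1-B1-C1 succeeded even though A1-B2-C1 failed; only B2 has no successful path at all.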
Note that in step 203 a single round of full-path detection is not enough to confirm an abnormal node, because the system tolerates some packet loss and has an abnormal-retry mechanism. Once an abnormal node is suspected, several rounds of full-path detection are therefore carried out and the state of each node in each round is recorded. Within one round, a node's state is confirmed as normal as long as at least one successful transaction passes through it. The node state is judged only after the rounds are complete. For example, with 5 rounds of full-path detection each node has 5 state records, and its final state is judged by a chosen rule. Suppose the record for node A is (normal, abnormal, normal, abnormal, abnormal). If the rule is that a node is abnormal when abnormal records outnumber normal ones, A is judged abnormal and is isolated; if the rule is that a node is abnormal only when all records are abnormal, A is judged normal. This mechanism makes the decision algorithm more robust and the system more stable; without it, nodes would frequently be misjudged and isolated.
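The multi-round judgment can be sketched with two small functions. The rule names "majority" and "unanimous" and the function names `round_status` and `judge` are hypothetical labels for the two rules the text describes.

```python
from typing import Iterable, Sequence


def round_status(node: str, successful_paths: Iterable[Sequence[str]]) -> str:
    """Within one round, one successful path through a node clears it."""
    if any(node in path for path in successful_paths):
        return "normal"
    return "abnormal"


def judge(history: Sequence[str], rule: str = "majority") -> str:
    """Combine per-round records into a final state for one node."""
    abnormal = sum(1 for state in history if state == "abnormal")
    normal = len(history) - abnormal
    if rule == "majority":
        return "abnormal" if abnormal > normal else "normal"
    if rule == "unanimous":
        return "abnormal" if normal == 0 else "normal"
    raise ValueError(f"unknown rule: {rule}")


record_a = ["normal", "abnormal", "normal", "abnormal", "abnormal"]
print(judge(record_a, "majority"))   # abnormal
print(judge(record_a, "unanimous"))  # normal
```

The stricter "unanimous" rule trades slower isolation for fewer false positives; which rule suits a deployment depends on how much packet loss it normally sees.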
To reduce system resource consumption and make fault detection more targeted, the embodiments of the present invention classify all nodes to be detected by transaction type. Before step 201, the method further includes:
The central node divides all nodes to be detected into different clusters based on transaction type; the nodes to be detected in the same cluster process messages of the same transaction type.
The central node sending the M detection messages to the nodes to be detected includes:
For one cluster, the central node sends the M detection messages to the nodes to be detected in that cluster;
The central node sending the N detection messages to the primary-screening nodes includes:
For one cluster, the central node sends the N detection messages to the primary-screening nodes in that cluster.
Normal transaction messages can be divided by type into query messages, consumption messages, operation messages and so on. In the embodiments of the present invention the nodes to be detected are divided into different clusters by transaction type and each cluster is detected separately. This makes the detection strategy more flexible, keeps the number of detection messages as small as possible, and reduces system resource consumption. Taking the nodes in Fig. 3 again, suppose the node system to be detected handles two kinds of detection message, query and consumption. The 3 nodes of layer A and the 2 nodes of layer C can handle both kinds, whereas in layer B node B1 handles only query detection messages and nodes B2 and B3 handle only consumption detection messages. For query detection messages, therefore, all nodes of layers A and C and node B1 of layer B must be detected; for consumption detection messages, all nodes of layers A and C and nodes B2 and B3 of layer B must be detected. The nodes to be detected are accordingly divided by transaction type into two clusters: the cluster for query detection messages contains nodes A1, A2, A3, B1, C1 and C2, and the cluster for consumption detection messages contains nodes A1, A2, A3, B2, B3, C1 and C2. The embodiments send detection messages to each of these two clusters to detect and locate faults separately.
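The clustering step can be sketched as an inversion of a node-to-capability map. The map `CAPABILITIES`, the type labels "query" and "consume", and the function `build_clusters` are hypothetical; layer C is given the two nodes C1 and C2 of the Fig. 3 example.

```python
from typing import Dict, List, Set

# Which transaction types each node can process (from the Fig. 3 example).
CAPABILITIES: Dict[str, Set[str]] = {
    "A1": {"query", "consume"},
    "A2": {"query", "consume"},
    "A3": {"query", "consume"},
    "B1": {"query"},
    "B2": {"consume"},
    "B3": {"consume"},
    "C1": {"query", "consume"},
    "C2": {"query", "consume"},
}


def build_clusters(capabilities: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Group nodes into one cluster per transaction type."""
    clusters: Dict[str, List[str]] = {}
    for node, types in capabilities.items():
        for tx_type in types:
            clusters.setdefault(tx_type, []).append(node)
    return {tx_type: sorted(nodes) for tx_type, nodes in clusters.items()}


clusters = build_clusters(CAPABILITIES)
print(clusters["query"])    # ['A1', 'A2', 'A3', 'B1', 'C1', 'C2']
print(clusters["consume"])  # ['A1', 'A2', 'A3', 'B2', 'B3', 'C1', 'C2']
```

A node that handles several transaction types simply appears in several clusters, so each cluster can be probed on its own.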
In the embodiment of the present invention, once an abnormal node has been located, it can be handled. After step 204, in which the central node determines the abnormal node according to the feedback message corresponding to the detection message, the method further includes:
Setting the state of the abnormal node to isolated.
In addition, once an isolated abnormal node returns to normal, the isolation can be released. That is, after the abnormal node is isolated, the method further includes:
The central node sends a detection message to the abnormal node; if it determines from the feedback message corresponding to the detection message that the abnormal node has recovered, it releases the isolation of the abnormal node.
The embodiment of the present invention achieves fast, precise fault location and rapid isolation in a server cluster system; when an isolated node recovers normal function, this is also discovered quickly and the isolation is released automatically.
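The isolate-then-recheck behaviour described above can be sketched as a small state table. This is an assumption-laden illustration: `NodeRegistry`, `NodeState`, and the `probe` callable (a stand-in for sending a detection message and evaluating its feedback) are hypothetical names, not part of the patent.

```python
from enum import Enum


class NodeState(Enum):
    NORMAL = "normal"
    ISOLATED = "isolated"


class NodeRegistry:
    """Tracks the isolation state of each node (hypothetical helper)."""

    def __init__(self, nodes):
        self.state = {node: NodeState.NORMAL for node in nodes}

    def isolate(self, node):
        """Mark a located abnormal node as isolated."""
        self.state[node] = NodeState.ISOLATED

    def recheck(self, probe):
        """Probe every isolated node and release those that have recovered.

        `probe(node) -> bool` stands in for sending a detection message and
        checking its feedback; True means the node answered normally.
        """
        released = []
        for node, state in list(self.state.items()):
            if state is NodeState.ISOLATED and probe(node):
                self.state[node] = NodeState.NORMAL
                released.append(node)
        return released


registry = NodeRegistry(["A1", "B2", "C3"])
registry.isolate("B2")
still_down = registry.recheck(lambda node: False)  # B2 has not recovered yet
recovered = registry.recheck(lambda node: True)    # B2 now answers normally
```

How often the recheck runs is not stated in the text; a periodic timer would be one option.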
To make the present invention clearer, the above flow is described in detail below with a specific embodiment. As shown in Fig. 4, the specific steps include:
Step 401: the central node divides the nodes to be detected into clusters based on transaction type and generates a cluster network topology graph.
For detection messages of transaction type x, the corresponding network topology is:
$$G_x = \{\, T_1\{A_1,\ldots,A_{m_A}\},\ T_2\{B_1,\ldots,B_{m_B}\},\ \ldots,\ T_n\{C_1,\ldots,C_{m_C}\} \,\} \qquad \text{(Formula 1)}$$
where T = {T1, T2, ..., Tn} is the set of steps that must be performed to process a detection message, and when i > j, Ti is performed before Tj; {C1, ..., CmC} is the set of nodes to be detected that can perform step Tn. The network topology graph Gx represents the largest cluster set covered by transaction x.
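One possible in-memory form of the topology in Formula 1 is an ordered list of processing steps, each carrying the nodes able to execute it. The sketch below is illustrative; the class `Step`, the function `build_topology`, and the mapping of T1, T2, T3 onto layers A, B, C of the Fig. 3 query cluster are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str     # processing step, e.g. "T1"
    nodes: tuple  # nodes to be detected that can execute this step


def build_topology(layers):
    """Build G_x from an ordered list of (step_name, [node, ...]) pairs."""
    return [Step(name, tuple(nodes)) for name, nodes in layers]


# Query cluster from the Fig. 3 example (layer-to-step mapping assumed).
G_query = build_topology([
    ("T1", ["A1", "A2", "A3"]),
    ("T2", ["B1"]),
    ("T3", ["C1", "C2", "C3"]),
])

# Largest number of nodes in any single step; used later as M.
M = max(len(step.nodes) for step in G_query)
print(M)
```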
Step 402: the central node generates a detection-message sending plan that covers all nodes to be detected. The sending plan covering all nodes is as follows:
where Pi is the detection message Id sending plan covering all nodes, and M = max(mA, mB, ..., mC), i.e., M is the largest number of nodes to be detected in any step of processing a detection message. In Formula 2, when i > mx (i.e., i exceeds the number of nodes in that layer of the cluster), the node subscript i is replaced with i mod mx + 1.
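The full-node coverage plan can be sketched as below. Formula 2 itself is not reproduced in this text, so the sketch assumes that message i visits the node with subscript i in every layer and applies the stated wrap rule (i mod mx + 1) when a layer has fewer than i nodes. The function name `full_node_plan` and the list-of-layers input are hypothetical.

```python
def full_node_plan(layers):
    """Return M paths that together visit every node at least once.

    `layers` is an ordered list of node lists, one list per processing step.
    """
    M = max(len(layer) for layer in layers)
    plans = []
    for i in range(1, M + 1):  # 1-based message index, as in the text
        path = []
        for layer in layers:
            m = len(layer)
            # Wrap rule from the text: if i > m, use subscript i mod m + 1.
            subscript = i if i <= m else i % m + 1
            path.append(layer[subscript - 1])
        plans.append(tuple(path))
    return plans


# Consumption cluster from the Fig. 3 example.
layers = [["A1", "A2", "A3"], ["B2", "B3"], ["C1", "C2", "C3"]]
plans = full_node_plan(layers)
for path in plans:
    print(path)
```

Only M messages are needed in this first pass, because every layer with m nodes is fully covered by the first m messages.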
Step 403: the central node sends detection messages to each node to be detected according to the sending plan covering all nodes, and receives the feedback results.
Step 404: the central node judges from the feedback results whether an abnormal node exists in the system of nodes to be detected; if so, step 405 is performed; otherwise detection ends.
Step 405: the central node determines that the nodes to be detected on the transmission path of any detection message for which an abnormal node exists are preliminary-screening nodes, and generates a detection-message sending plan covering all transmission paths of the preliminary-screening nodes. The sending plan covering all paths is as follows:
where Q is the detection message Id sending plan covering all paths, and N = mA*mB*...*mC.
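The full-path coverage plan can be sketched with a Cartesian product. Formula 3 is not reproduced in this text, so this is one plausible reading: every combination of one candidate node per processing step, which gives N = mA*mB*...*mC paths. The function name `full_path_plan` and the choice of candidate nodes per layer (here the consumption cluster from the Fig. 3 example) are assumptions.

```python
from itertools import product


def full_path_plan(layers):
    """Enumerate every path that picks one node from each layer, in order."""
    return list(product(*layers))


layers = [["A1", "A2", "A3"], ["B2", "B3"], ["C1", "C2", "C3"]]
paths = full_path_plan(layers)
N = len(paths)
print(N)         # number of detection messages in the second pass
print(paths[0])  # first enumerated path
```

Because N grows multiplicatively with layer size, restricting the second pass to the nodes implicated in the first pass keeps the message count lower than probing every path up front.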
Step 406: the central node sends detection messages to each node to be detected according to the sending plan covering all paths, and receives the feedback results.
Step 407: the central node determines the position of the abnormal node according to the feedback results in step 406.
Step 408: the central node isolates the abnormal node.
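The preliminary screening in steps 404 and 405 above can be sketched as collecting every node that lies on the path of a failed first-pass detection message. The function name `screen_nodes`, the dictionary of plans keyed by message Id, and the example failure are hypothetical.

```python
def screen_nodes(plans, failed_ids):
    """Return the nodes on the paths of failed detection messages.

    `plans` maps a detection message Id to the path (tuple of nodes) it took;
    `failed_ids` holds the Ids whose feedback indicated an abnormal node.
    """
    screened = []
    for msg_id in sorted(failed_ids):
        for node in plans[msg_id]:
            if node not in screened:  # keep first-seen order, no duplicates
                screened.append(node)
    return screened


# First-pass plan for the consumption cluster; message 2 is assumed to fail.
plans = {
    1: ("A1", "B2", "C1"),
    2: ("A2", "B3", "C2"),
    3: ("A3", "B3", "C3"),
}
screened = screen_nodes(plans, failed_ids={2})
print(screened)
```

An empty result corresponds to the "otherwise detection ends" branch of step 404.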
Fig. 5 shows a schematic structural diagram of a fault locating device provided by an embodiment of the present invention.
As shown in Fig. 5, a fault locating device provided by an embodiment of the present invention includes:
A first sending module 501, configured to determine M detection messages according to a transaction to be detected and send the M detection messages to nodes to be detected, wherein the transmission paths of the M detection messages cover all nodes to be detected;
A determining module 502, configured to determine, for each of the M detection messages and according to the feedback message corresponding to the detection message, the detection messages for which an abnormal node exists, wherein the feedback message is obtained according to the operation results of the nodes to be detected corresponding to the detection message, and the nodes to be detected on the transmission path of a detection message for which an abnormal node exists are preliminary-screening nodes;
A second sending module 503, configured to determine N detection messages according to the preliminary-screening nodes and send the N detection messages to the preliminary-screening nodes, wherein the transmission paths of the N detection messages cover all transmission paths of the preliminary-screening nodes;
A locating module 504, configured to determine the abnormal node, for each of the N detection messages, according to the feedback message corresponding to the detection message.
Optionally, the device further includes a division module 505, configured to divide all nodes to be detected into different clusters based on transaction type, wherein the nodes to be detected in the same cluster process messages of the same transaction type;
The first sending module 501 is specifically configured to, for each cluster, send the M detection messages to the nodes to be detected in that cluster;
The second sending module 503 is specifically configured to, for each cluster, send the N detection messages to the preliminary-screening nodes in that cluster.
Optionally, the device further includes a processing module 506, configured to:
Set the state of the abnormal node to isolated.
Optionally, the processing module 506 is further configured to:
Send a detection message to the abnormal node and, if it is determined from the feedback message corresponding to the detection message that the abnormal node has recovered, release the isolation of the abnormal node.
Optionally, the determining module 502 is specifically configured to:
For any of the M detection messages, if the feedback message corresponding to the detection message is not received within a threshold time, or the feedback message corresponding to the detection message is an error message, determine that an abnormal node exists.
Optionally, the locating module 504 is specifically configured to:
For the feedback messages corresponding to all detection messages that pass through a first node to be detected, if the feedback messages are not received within the threshold time or the feedback messages are error messages, determine that the first node to be detected is an abnormal node.
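The two rules used by the determining module and the locating module can be sketched together: a detection message counts as failed when its feedback is missing, late, or an error, and a node is judged abnormal when every detection message that passed through it failed. The `Feedback` record, its fields, and the function names are hypothetical; the numeric threshold is an arbitrary example value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    path: tuple               # nodes the detection message passed through
    elapsed: Optional[float]  # seconds until feedback arrived; None if never
    is_error: bool = False    # True if the feedback was an error message


def message_failed(fb, threshold):
    """Failed if no feedback arrived within the threshold or it was an error."""
    return fb.elapsed is None or fb.elapsed > threshold or fb.is_error


def locate_abnormal(feedbacks, threshold):
    """A node is abnormal if all detection messages through it failed."""
    nodes = sorted({node for fb in feedbacks for node in fb.path})
    abnormal = []
    for node in nodes:
        through = [fb for fb in feedbacks if node in fb.path]
        if through and all(message_failed(fb, threshold) for fb in through):
            abnormal.append(node)
    return abnormal


feedbacks = [
    Feedback(("A1", "B1"), elapsed=0.1),
    Feedback(("A1", "B2"), elapsed=0.2),
    Feedback(("A2", "B1"), elapsed=None),                # timed out
    Feedback(("A2", "B2"), elapsed=0.3, is_error=True),  # error feedback
]
abnormal = locate_abnormal(feedbacks, threshold=1.0)
print(abnormal)
```

In this example B1 and B2 each appear on one successful path, so they are cleared even though they also appear on failed paths; only the node whose every path failed is reported.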
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (12)

1. A fault locating method, characterized by comprising:
a central node determining M detection messages according to a transaction to be detected, and sending the M detection messages to nodes to be detected, wherein the transmission paths of the M detection messages cover all nodes to be detected;
for each of the M detection messages, the central node determining, according to the feedback message corresponding to the detection message, the detection messages for which an abnormal node exists, wherein the feedback message is obtained according to the operation results of the nodes to be detected corresponding to the detection message, and the nodes to be detected on the transmission path of a detection message for which an abnormal node exists are preliminary-screening nodes;
the central node determining N detection messages according to the preliminary-screening nodes, and sending the N detection messages to the preliminary-screening nodes, wherein the transmission paths of the N detection messages cover all transmission paths of the preliminary-screening nodes;
for each of the N detection messages, the central node determining the abnormal node according to the feedback message corresponding to the detection message.
2. The method according to claim 1, characterized in that, before the central node determines the M detection messages according to the transaction to be detected, the method further comprises:
the central node dividing all nodes to be detected into different clusters based on transaction type, wherein the nodes to be detected in the same cluster process messages of the same transaction type;
the central node sending the M detection messages to the nodes to be detected comprises:
for each cluster, the central node sending the M detection messages to the nodes to be detected in that cluster;
the central node sending the N detection messages to the preliminary-screening nodes comprises:
for each cluster, the central node sending the N detection messages to the preliminary-screening nodes in that cluster.
3. The method according to claim 1, characterized in that, after the central node determines the abnormal node according to the feedback message corresponding to the detection message, the method further comprises:
setting the state of the abnormal node to isolated.
4. The method according to claim 3, characterized in that, after the abnormal node is isolated, the method further comprises:
the central node sending a detection message to the abnormal node and, if it determines from the feedback message corresponding to the detection message that the abnormal node has recovered, releasing the isolation of the abnormal node.
5. The method according to any one of claims 1 to 4, characterized in that the central node determining, according to the feedback message corresponding to the detection message, the detection messages for which an abnormal node exists comprises:
for any of the M detection messages, if the central node does not receive the feedback message corresponding to the detection message within a threshold time, or the feedback message corresponding to the detection message is an error message, determining that an abnormal node exists.
6. The method according to any one of claims 1 to 4, characterized in that the central node determining the abnormal node according to the feedback message corresponding to the detection message comprises:
for the feedback messages corresponding to all detection messages that pass through a first node to be detected, if the central node does not receive the feedback messages within the threshold time, or the feedback messages are error messages, determining that the first node to be detected is an abnormal node.
7. A fault locating device, characterized by comprising:
a first sending module, configured to determine M detection messages according to a transaction to be detected and send the M detection messages to nodes to be detected, wherein the transmission paths of the M detection messages cover all nodes to be detected;
a determining module, configured to determine, for each of the M detection messages and according to the feedback message corresponding to the detection message, the detection messages for which an abnormal node exists, wherein the feedback message is obtained according to the operation results of the nodes to be detected corresponding to the detection message, and the nodes to be detected on the transmission path of a detection message for which an abnormal node exists are preliminary-screening nodes;
a second sending module, configured to determine N detection messages according to the preliminary-screening nodes and send the N detection messages to the preliminary-screening nodes, wherein the transmission paths of the N detection messages cover all transmission paths of the preliminary-screening nodes;
a locating module, configured to determine the abnormal node, for each of the N detection messages, according to the feedback message corresponding to the detection message.
8. The device according to claim 7, characterized in that
the device further comprises a division module, configured to divide all nodes to be detected into different clusters based on transaction type, wherein the nodes to be detected in the same cluster process messages of the same transaction type;
the first sending module is specifically configured to, for each cluster, send the M detection messages to the nodes to be detected in that cluster;
the second sending module is specifically configured to, for each cluster, send the N detection messages to the preliminary-screening nodes in that cluster.
9. The device according to claim 7, characterized by further comprising a processing module, configured to:
set the state of the abnormal node to isolated.
10. The device according to claim 9, characterized in that the processing module is further configured to:
send a detection message to the abnormal node and, if it is determined from the feedback message corresponding to the detection message that the abnormal node has recovered, release the isolation of the abnormal node.
11. The device according to any one of claims 7 to 10, characterized in that the determining module is specifically configured to:
for any of the M detection messages, if the feedback message corresponding to the detection message is not received within a threshold time, or the feedback message corresponding to the detection message is an error message, determine that an abnormal node exists.
12. The device according to any one of claims 7 to 10, characterized in that the locating module is specifically configured to:
for the feedback messages corresponding to all detection messages that pass through a first node to be detected, if the feedback messages are not received within the threshold time, or the feedback messages are error messages, determine that the first node to be detected is an abnormal node.
CN201710632889.9A 2017-07-28 2017-07-28 Fault positioning method and device Active CN107332709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710632889.9A CN107332709B (en) 2017-07-28 2017-07-28 Fault positioning method and device

Publications (2)

Publication Number Publication Date
CN107332709A true CN107332709A (en) 2017-11-07
CN107332709B CN107332709B (en) 2020-08-11

Family

ID=60199507

Country Status (1)

Country Link
CN (1) CN107332709B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019119269A1 (en) * 2017-12-19 2019-06-27 深圳前海达闼云端智能科技有限公司 Network fault detection method and control center device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040151497A1 (en) * 2003-02-05 2004-08-05 Ki-Cheol Lee Wavelength path monitoring/correcting apparatus in transparent optical cross-connect and method thereof
CN104993960A (en) * 2015-07-01 2015-10-21 广东工业大学 Location method of network node fault

Also Published As

Publication number Publication date
CN107332709B (en) 2020-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant