CN104572329A - Fault determination method and device - Google Patents

Publication number
CN104572329A
Authority
CN
China
Prior art keywords
event sequence
regular event
measured value
group
regular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410853612.5A
Other languages
Chinese (zh)
Other versions
CN104572329B (en
Inventor
刘科佑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Huawei Enterprises Communications Technologies Co Ltd
Original Assignee
Hangzhou Huawei Enterprises Communications Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Huawei Enterprises Communications Technologies Co Ltd filed Critical Hangzhou Huawei Enterprises Communications Technologies Co Ltd
Priority to CN201410853612.5A priority Critical patent/CN104572329B/en
Publication of CN104572329A publication Critical patent/CN104572329A/en
Application granted granted Critical
Publication of CN104572329B publication Critical patent/CN104572329B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a fault determination method and device. The method comprises: extracting measured values of N groups of regular event sequences in a normal state, wherein the events and event order of every group are identical and the event order is a predetermined order; marking the measured values of the N groups of regular event sequences as normal state; extracting measured values of M groups of regular event sequences in a fault state, wherein the events and event order of every group in the M groups are identical to those of every group in the N groups; marking the measured values of the M groups of regular event sequences as fault state; training a pattern recognition algorithm on the measured values of the N groups and of the M groups of regular event sequences; and using the trained pattern recognition algorithm to classify the measured values of a regular event sequence to be determined, and determining according to the classification result whether the device is faulty.

Description

Fault determination method and device
Technical field
This application relates to the field of electronic technology, and in particular to a fault determination method and device.
Background
In a computer system, distributed system, or network system, a fault in a single node often prevents the system from providing services normally. To ensure service continuity, when one node fails, other nodes need to take over the work of the faulty node.
Therefore, while the system is running, it is necessary to determine whether a fault has occurred and, when one occurs, to switch to a normal node so that the system keeps running normally. In the prior art, a fault timeout is usually set: a fault is determined, for example, when no data packet is received within two minutes, or when no read or write operation occurs within a predetermined period. Fault detection is especially slow when the failed node is of a non-sensor type. Determining a fault therefore takes a long time, which results in poor service continuity.
Summary of the invention
This application provides a fault determination method and device to solve the technical problem in the prior art that the fault determination time is too long, which results in poor service continuity.
A first aspect of this application provides a fault determination method, comprising:
extracting measured values of N groups of regular event sequences in a normal state of a device, wherein the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order, and N is a positive integer;
marking the measured values of the N groups of regular event sequences as normal state;
extracting measured values of M groups of regular event sequences in a fault state of the device, wherein the events and event order of every group in the M groups of regular event sequences are identical to those of every group in the N groups of regular event sequences, and M is a positive integer;
marking the measured values of the M groups of regular event sequences as fault state;
training a pattern recognition algorithm according to the marked measured values of the N groups of regular event sequences and of the M groups of regular event sequences;
classifying the measured values of a regular event sequence to be determined by using the trained pattern recognition algorithm, and determining according to the classification result whether the device is faulty, wherein the events and event order of the regular event sequence to be determined are identical to those of every group in the N groups of regular event sequences.
With reference to the first aspect, in a first possible implementation of the first aspect, the method further comprises:
retraining the trained pattern recognition algorithm, according to the result of determining whether the device is faulty, by using the measured values of the regular event sequence to be determined as new training values.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the events and event order of every group of regular event sequences are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
With reference to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
A second aspect of this application provides a fault determination device, comprising:
an extracting unit, configured to extract measured values of N groups of regular event sequences in a normal state of a device, wherein the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order, and N is a positive integer; and to extract measured values of M groups of regular event sequences in a fault state of the device, wherein the events and event order of every group in the M groups of regular event sequences are identical to those of every group in the N groups of regular event sequences, and M is a positive integer;
a processing unit, configured to mark the measured values of the N groups of regular event sequences as normal state; to mark the measured values of the M groups of regular event sequences as fault state; to train a pattern recognition algorithm according to the marked measured values of the N groups and of the M groups of regular event sequences; and to classify the measured values of a regular event sequence to be determined by using the trained pattern recognition algorithm and determine according to the classification result whether the device is faulty, wherein the events and event order of the regular event sequence to be determined are identical to those of every group in the N groups of regular event sequences.
With reference to the second aspect, in a first possible implementation of the second aspect, the processing unit is further configured to retrain the trained pattern recognition algorithm, according to the result of determining whether the device is faulty, by using the measured values of the regular event sequence to be determined as new training values.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the events and event order of every group of regular event sequences are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
With reference to the second aspect, the first possible implementation of the second aspect, or the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
A third aspect of this application provides a network device, comprising:
a monitor, configured to extract measured values of N groups of regular event sequences in a normal state of a device, wherein the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order, and N is a positive integer; and to extract measured values of M groups of regular event sequences in a fault state of the device, wherein the events and event order of every group in the M groups of regular event sequences are identical to those of every group in the N groups of regular event sequences, and M is a positive integer;
a processor, configured to mark the measured values of the N groups of regular event sequences as normal state; to mark the measured values of the M groups of regular event sequences as fault state; to train a pattern recognition algorithm according to the marked measured values of the N groups and of the M groups of regular event sequences; and to classify the measured values of a regular event sequence to be determined by using the trained pattern recognition algorithm and determine according to the classification result whether the device is faulty, wherein the events and event order of the regular event sequence to be determined are identical to those of every group in the N groups of regular event sequences.
With reference to the third aspect, in a first possible implementation of the third aspect, the processor is further configured to retrain the trained pattern recognition algorithm, according to the result of determining whether the device is faulty, by using the measured values of the regular event sequence to be determined as new training values.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the events and event order of every group of regular event sequences are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
With reference to the third aspect, the first possible implementation of the third aspect, or the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
The one or more technical solutions provided in the embodiments of this application have at least the following technical effects or advantages:
In the embodiments of this application, the measured values in the normal state and the measured values in the fault state are used to train a pattern recognition algorithm, and the trained algorithm is then used to determine faults. A fault can thus be detected in advance, without waiting for a fault timeout to elapse as in the prior art. The method therefore shortens the fault determination time, so that services can be switched quickly to a normal node and service continuity is preserved. For example, suppose the fault timeout in the prior art is 2 minutes, and the scheme in the embodiments of this application determines the device fault at the 15th second. Switchover can then start immediately instead of waiting until the 2-minute mark, which improves service continuity. Furthermore, because the events and event order of the regular event sequences are identical and predetermined, the accuracy of fault determination can be improved.
Brief description of the drawings
Fig. 1 is a flowchart of a fault determination method provided by an embodiment of this application;
Fig. 2 is a schematic diagram of a decision boundary provided by an embodiment of this application;
Fig. 3 is a functional block diagram of a fault determination device provided by an embodiment of this application;
Fig. 4 is a structural block diagram of a network device provided by an embodiment of this application.
Embodiments
The embodiments of this application provide a fault determination method and device to solve the technical problem in the prior art that the fault determination time is too long, which results in poor service continuity.
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
In addition, "device" and "system" are used interchangeably herein.
Refer to Fig. 1, which is a flowchart of a fault determination method provided by an embodiment of this application. As shown in Fig. 1, the method includes the following:
Step 101: extract measured values of N groups of regular event sequences in a normal state of a device, wherein the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order, and N is a positive integer;
Step 102: mark the measured values of the N groups of regular event sequences as normal state;
Step 103: extract measured values of M groups of regular event sequences in a fault state of the device, wherein the events and event order of every group in the M groups of regular event sequences are identical to those of every group in the N groups of regular event sequences, and M is a positive integer;
Step 104: mark the measured values of the M groups of regular event sequences as fault state;
Step 105: train a pattern recognition algorithm according to the marked measured values of the N groups of regular event sequences and of the M groups of regular event sequences;
Step 106: classify the measured values of a regular event sequence to be determined by using the trained pattern recognition algorithm, and determine according to the classification result whether the device is faulty, wherein the events and event order of the regular event sequence to be determined are identical to those of every group in the N groups of regular event sequences.
To explain how the fault determination method in this embodiment is carried out, the implementation of each step is described in detail below.
In step 101, measured values of N groups of regular event sequences are extracted in the normal state of the device. While the system runs normally, measured values of multiple events can be collected, for example: central processing unit (CPU) usage, memory usage of a physical node, the number of disk read/write operations per second (Input/Output Operations Per Second, IOPS), and network bandwidth. Other examples are the memory usage of a monitored service process, the latency of sending and receiving packets, or whether an abnormality alarm has been raised. Because the device runs normally most of the time, measured values of many groups of regular event sequences are easy to collect.
Note that, in this embodiment, the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order. Specifically, the events in this embodiment are events that human experts, through extensive experiments, have identified as possibly leading to a fault. Likewise, the event order is an order that human experts have determined through extensive experiments as possibly leading to a fault. Therefore, before step 101, the events whose measured values the device needs to extract, and the event order used for training in step 105, may be entered manually. The acquisition in step 101 itself need not follow any particular order. Preferably, the measured values of each group of regular event sequences are measured at the same time.
For example, suppose the events, event order, and measured values of a regular event sequence are: (regular event 1, V1), (regular event 2, V2), ..., (regular event n, Vn). The measured values of the N groups of regular event sequences can then be recorded as (V11, V12, ..., V1n), (V21, V22, ..., V2n), (V31, V32, ..., V3n), and so on.
For instance, in one measurement, event 1 is the mainboard temperature, with a measured value of 35 degrees; event 2 is CPU usage, with a measured value of 80%; and event 3 is memory usage, with a measured value of 45%. The measured values of this group of regular event sequences are then (35, 80, 45). The same method yields the measured values of many groups of regular event sequences in the normal state of the device.
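The following Python sketch illustrates how one group of measured values might be assembled in the predetermined event order. The names `EVENT_ORDER` and `to_measurement_group` and the dictionary keys are hypothetical and chosen only for illustration; the numbers are the example values above.

```python
# Predetermined event order (assumed names; only the fixed order matters).
EVENT_ORDER = ("mainboard_temperature", "cpu_usage", "memory_usage")


def to_measurement_group(readings):
    """Arrange raw readings into one group of measured values,
    always in the predetermined event order."""
    missing = [name for name in EVENT_ORDER if name not in readings]
    if missing:
        raise ValueError(f"missing events: {missing}")
    return tuple(readings[name] for name in EVENT_ORDER)


# Readings may arrive in any order; the resulting group is always ordered
# as (mainboard temperature, CPU usage, memory usage).
group = to_measurement_group(
    {"cpu_usage": 80, "memory_usage": 45, "mainboard_temperature": 35}
)
print(group)  # (35, 80, 45)
```

Fixing the order in one place means every group, whether collected in the normal state, in the fault state, or for later classification, can be compared position by position.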
Similarly, in step 103, measured values of M groups of regular event sequences are extracted in the fault state of the device. The events and event order of every group in the M groups are identical to those of every group in the N groups of step 101, which makes the two sets comparable.
Continuing with the events above, suppose that in the fault state the mainboard temperature is 50 degrees, CPU usage is 300%, and memory usage is 90%. The measured values of this group of regular event sequences are then (50, 300, 90). The same method yields the measured values of many groups of regular event sequences in the fault state of the device.
In practice, the fault state may be a fault scenario constructed by fault injection, so that a large amount of fault-state data can be obtained in a short time. The larger the value of M, the more accurate the trained pattern recognition algorithm, and the more accurate the fault determination.
Note that steps 101 and 103 need not be executed in any particular order; in practice they may be interleaved repeatedly.
After the measured values of the N groups of regular event sequences are obtained in step 101, step 102 can be performed, that is, the measured values are marked as normal state. In practice the mark can take various forms. For example, if the mainboard temperature in the normal state is V11, V21, and V31 in three measurements, these can be marked as (V11, ok), (V21, ok), and (V31, ok). Alternatively, "ok" can be replaced with "1" or with "true". Any mark will do as long as the device can recognize from it that the measured value was taken in the normal state.
Similarly to step 102, after the measured values of the M groups of regular event sequences are obtained in step 103, step 104 can be performed, that is, the measured values are marked as fault state. Corresponding to step 102, the fault-state mark may be, for example, "error" (paired with "ok"), "0" (paired with "1"), or "false" (paired with "true").
Note that step 102 may also be performed after step 103. Steps 102 and 104 need not be executed in any particular order.
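Steps 102 and 104 can be sketched as follows. This is a minimal illustration, assuming the mark is attached to a whole group of measured values rather than to each value separately; the function name `mark` and the constants `NORMAL` and `FAULT` are hypothetical.

```python
# The marks "ok" and "error" are one of the pairs mentioned in the text.
NORMAL, FAULT = "ok", "error"


def mark(groups, state):
    """Attach a state mark to every group of measured values."""
    if state not in (NORMAL, FAULT):
        raise ValueError(f"unknown state: {state!r}")
    return [(tuple(group), state) for group in groups]


normal_groups = [(35, 80, 45)]    # measured in the normal state (step 101)
fault_groups = [(50, 300, 90)]    # measured in the fault state (step 103)

# The order of the two marking calls does not matter.
training_set = mark(normal_groups, NORMAL) + mark(fault_groups, FAULT)
print(training_set)
# [((35, 80, 45), 'ok'), ((50, 300, 90), 'error')]
```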
Next, step 105 is performed, that is, the pattern recognition algorithm is trained according to the marked measured values of the N groups of regular event sequences and of the M groups of regular event sequences.
In this embodiment, the pattern recognition algorithm is, for example, a support vector machine (SVM) algorithm. In practice, it may also be, for example, a Bayesian network algorithm or an association rule algorithm.
The training in step 105 produces a partition boundary, also called a decision boundary. The two sides of the partition boundary represent the normal state and the fault state respectively. As shown in Fig. 2, suppose that the underlying partition boundary between the normal state and the fault state is boundary 201, and that the partition boundary obtained by the training in step 105 is boundary 202. Region 203 represents the normal-state region, and region 204 represents the fault-state region. The more closely boundary 202 coincides with boundary 201, the more accurate the partition boundary, and the more accurate the determination of whether the device is faulty. In this embodiment, the events whose measured values are extracted, and the event order, were obtained by human experts through extensive experiments, so the partition boundary trained from such measured values is more accurate. The larger the values of M and N, the more accurate the trained partition boundary.
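To make the idea of training on marked measured values concrete, here is a small Python sketch. It deliberately uses a nearest-centroid classifier as a simple stand-in, not the SVM, Bayesian network, or association rule algorithms named in the text. The function names and the second sample in each state are hypothetical; only (35, 80, 45), (50, 300, 90), and (40, 60, 70) come from the examples in this description.

```python
def train(marked_groups):
    """Compute one centroid per state mark from (group, mark) pairs."""
    sums, counts = {}, {}
    for group, state in marked_groups:
        acc = sums.setdefault(state, [0.0] * len(group))
        for i, value in enumerate(group):
            acc[i] += value
        counts[state] = counts.get(state, 0) + 1
    return {s: tuple(v / counts[s] for v in acc) for s, acc in sums.items()}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(model, group):
    """Return the mark whose centroid is closest to the group."""
    return min(model, key=lambda s: squared_distance(model[s], group))


marked = [
    ((35, 80, 45), "ok"),
    ((38, 70, 50), "ok"),        # hypothetical extra normal-state sample
    ((50, 300, 90), "error"),
    ((55, 280, 85), "error"),    # hypothetical extra fault-state sample
]
model = train(marked)
print(model["ok"])                     # (36.5, 75.0, 47.5)
print(classify(model, (40, 60, 70)))   # ok
```

With two centroids, the implied decision boundary is the plane of points equidistant from both. The sketch does not rescale the measured values, so the event with the largest numeric range (CPU usage here) dominates the distance; a real implementation would probably normalize each event first.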
After the pattern recognition algorithm is trained in step 105, the device may, periodically or in real time, extract measured values to be determined for the same events in the same event order. Step 106 is then performed, that is, the trained pattern recognition algorithm classifies the measured values to be determined, and whether the device is faulty is determined according to the classification result. For example, the measured values to be determined are (40, 60, 70).
Specifically, because the same pattern recognition algorithm is used for classification as for training, the measured values to be determined are classified by the same standard as the measured values used for training. Whether the device is faulty is then determined according to the classification result. As shown in Fig. 2, if the classification result lies in region 203, to the left of partition boundary 202, the device is currently in the normal state. If the classification result lies in region 204, to the right of partition boundary 202, the device is currently faulty and failover is required.
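Deciding which side of the partition boundary a measurement falls on can be sketched as the sign of a linear score. The boundary below is an illustrative assumption, not a trained SVM boundary: it is simply the plane halfway between the example normal-state group (35, 80, 45) and the example fault-state group (50, 300, 90). The function names are hypothetical.

```python
def linear_boundary(normal_point, fault_point):
    """Plane halfway between two reference points.

    score(x) = w . x + b is positive on the fault side
    and negative on the normal side.
    """
    w = tuple(f - n for n, f in zip(normal_point, fault_point))
    b = (sum(n * n for n in normal_point) - sum(f * f for f in fault_point)) / 2
    return w, b


def is_faulty(boundary, group):
    """True if the group lies on the fault side of the boundary."""
    w, b = boundary
    score = sum(wi * xi for wi, xi in zip(w, group)) + b
    return score > 0


boundary = linear_boundary((35, 80, 45), (50, 300, 90))
print(is_faulty(boundary, (40, 60, 70)))    # False: normal side, no action
print(is_faulty(boundary, (50, 300, 90)))   # True: fault side, fail over
```

In terms of Fig. 2, a negative score corresponds to region 203 and a positive score to region 204.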
As can be seen, the more accurate partition boundary 202 is, the more accurate the fault determination in step 106.
As the above description shows, in the embodiments of this application, the measured values in the normal state and the measured values in the fault state are used to train a pattern recognition algorithm, and the trained algorithm is then used to determine faults. A fault can thus be detected in advance, without waiting for a fault timeout to elapse as in the prior art. The method therefore shortens the fault determination time, so that services can be switched quickly to a normal node and service continuity is preserved. For example, suppose the fault timeout in the prior art is 2 minutes, and the scheme in the embodiments of this application determines the device fault at the 15th second. Switchover can then start immediately instead of waiting until the 2-minute mark, which improves service continuity.
Further, when the device is determined to be faulty, the method also comprises: switching the services on the faulty node of the device to a normal node, or notifying the system to replace the device with another device.
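The switchover described above might look like the following sketch. Everything here is a hypothetical illustration: the node names, the `fail_over` function, and the policy of choosing the normal node that currently hosts the fewest services are assumptions, since the description does not specify how the target node is selected.

```python
def fail_over(services_by_node, faulty_node, normal_nodes):
    """Move every service from the faulty node to a normal node.

    Returns the name of the node that took over.
    """
    candidates = [n for n in normal_nodes if n != faulty_node]
    if not candidates:
        # No normal node left: the alternative in the text is to notify
        # the system so that another device replaces this one.
        raise RuntimeError("no normal node available")
    # Assumed policy: pick the normal node hosting the fewest services.
    target = min(candidates, key=lambda n: len(services_by_node.get(n, [])))
    moved = services_by_node.pop(faulty_node, [])
    services_by_node.setdefault(target, []).extend(moved)
    return target


cluster = {"node-a": ["db", "cache"], "node-b": ["web"], "node-c": []}
took_over = fail_over(cluster, "node-a", ["node-b", "node-c"])
print(took_over)  # node-c
print(cluster)    # {'node-b': ['web'], 'node-c': ['db', 'cache']}
```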
Optionally, if the determination in step 106 of whether a fault has occurred is correct (which may be decided by a human expert), the method further comprises: retraining the trained pattern recognition algorithm by using the measured values of the regular event sequence to be determined as new training values. Specifically, when the result is that the device is faulty, the measured values of the regular event sequence to be determined are marked as fault state, and the pattern recognition algorithm trained in step 105 is trained further on these marked measured values. When the result is that the device is not faulty, the measured values of the regular event sequence to be determined are marked as normal state, and the pattern recognition algorithm trained in step 105 is trained further on these marked measured values. As this is repeated and the number of training values grows, the partition boundary becomes more and more accurate, and so does the determination of whether the device is faulty. The method in the embodiments of this application therefore also lets the device learn automatically, which makes the device more intelligent.
If the determination in step 106 is incorrect, the measured values of the regular event sequence to be determined can likewise be used as new training values to retrain the trained pattern recognition algorithm. The only difference is that the mark is exactly the opposite; everything else is the same as the process above and is not repeated here.
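The marking rule for new training values can be sketched as below: the mark follows the determination when it was judged correct and is inverted when it was judged incorrect. The function names are hypothetical, and the sample (48, 250, 88) is an invented value used only to show the inverted case.

```python
NORMAL, FAULT = "ok", "error"


def feedback_mark(determined, judged_correct):
    """Mark for a new training value, given the step 106 result and
    whether that result was judged correct (for example by an expert)."""
    if determined not in (NORMAL, FAULT):
        raise ValueError(f"unknown state: {determined!r}")
    if judged_correct:
        return determined
    return FAULT if determined == NORMAL else NORMAL


def add_training_value(training_set, group, determined, judged_correct):
    """Append the newly marked group; the caller then retrains on the set."""
    training_set.append((tuple(group), feedback_mark(determined, judged_correct)))
    return training_set


training_set = [((35, 80, 45), NORMAL), ((50, 300, 90), FAULT)]
# Determined "normal" and judged correct: keep the mark.
add_training_value(training_set, (40, 60, 70), NORMAL, judged_correct=True)
# Determined "normal" but judged incorrect: the mark is inverted.
add_training_value(training_set, (48, 250, 88), NORMAL, judged_correct=False)
print(training_set[-2:])
# [((40, 60, 70), 'ok'), ((48, 250, 88), 'error')]
```

Retraining itself is whatever procedure was used in step 105, run again on the enlarged training set.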
Based on the same inventive concept, Fig. 3 shows a functional block diagram of a fault determination device provided by an embodiment of this application, which implements the fault determination method shown in Fig. 1. For the meaning of the terms used in this embodiment, refer to the description of the preceding embodiment. The fault determination device comprises:
an extracting unit 301, configured to extract measured values of N groups of regular event sequences in a normal state of a device, wherein the events and event order of every group in the N groups of regular event sequences are identical and the event order is a predetermined order, and N is a positive integer; and to extract measured values of M groups of regular event sequences in a fault state of the device, wherein the events and event order of every group in the M groups of regular event sequences are identical to those of every group in the N groups of regular event sequences, and M is a positive integer;
a processing unit 302, configured to mark the measured values of the N groups of regular event sequences as normal state; to mark the measured values of the M groups of regular event sequences as fault state; to train a pattern recognition algorithm according to the marked measured values of the N groups and of the M groups of regular event sequences; and to classify the measured values of a regular event sequence to be determined by using the trained pattern recognition algorithm and determine according to the classification result whether the device is faulty, wherein the events and event order of the regular event sequence to be determined are identical to those of every group in the N groups of regular event sequences.
Optionally, the processing unit 302 is further configured to retrain the trained pattern recognition algorithm, according to the result of determining whether the device is faulty, by using the measured values of the regular event sequence to be determined as new training values.
Optionally, the events and event order of every group of regular event sequences are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
Optionally, the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
The variations and specific examples of the fault determination method in Fig. 1 and its embodiment apply equally to the fault determination device of this embodiment. From the foregoing detailed description of the method, a person skilled in the art can clearly understand how the fault determination device of this embodiment is implemented, so for brevity the details are not repeated here.
Based on the same inventive concept, FIG. 4 shows a structural block diagram of a network device provided by an embodiment of the present application. The network device implements the fault determination method shown in FIG. 1; for the meaning of the terms used in this embodiment, refer to the preceding embodiments. As shown in FIG. 4, the network device comprises a processor 401, a transmitter 402, a receiver 403, a memory 404, and a monitor 405. The processor 401 may be a general-purpose central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling program execution. There may be one or more memories 404. The memory 404 may include read-only memory (ROM), random access memory (RAM), and magnetic disk storage. The memories, the receiver 403, and the transmitter 402 are connected to the processor 401 by a bus. The receiver 403 and the transmitter 402 are used for network communication with external devices, for example over Ethernet, a radio access network, or a wireless local area network. The receiver 403 and the transmitter 402 may be two physically separate components or one and the same physical component.
The memory 404 may store instructions, and the processor 401 may execute the instructions stored in the memory 404.
The monitor 405 is configured to extract measured values of N groups of regular event sequences in a normal state of a device, where the events and the event order of each group of regular event sequences in the N groups are identical and the event order is a predetermined order, N being a positive integer; and to extract measured values of M groups of regular event sequences in a fault state of the device, where the events and the event order of each group in the M groups are identical to those of each group in the N groups, M being a positive integer.
The processor 401 is further configured to mark the measured values of the N groups of regular event sequences with the normal state; mark the measured values of the M groups of regular event sequences with the fault state; train a pattern recognition algorithm on the marked measured values of the N groups and of the M groups; use the trained pattern recognition algorithm to classify the measured values of a regular event sequence to be determined; and determine, according to the classification result, whether the device has failed. The events and the event order of the regular event sequence to be determined are identical to those of each group of regular event sequences in the N groups.
Optionally, the processor 401 is further configured to: according to the determination result of whether the device has failed, use the measured values of the regular event sequence to be determined as new training values and retrain the trained pattern recognition algorithm.
Optionally, the events of each group of regular event sequences, in their event order, are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
Optionally, the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
The one or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
In the embodiments of the present application, the pattern recognition algorithm is trained on measured values from the normal state and measured values from the fault state, and the trained algorithm is then used to judge faults. A fault can therefore be sensed in advance, without waiting, as in the prior art, for a fault timeout to expire before the fault can be determined. The method thus shortens the time needed to detect a fault and allows a fast switchover to a normal node, which preserves service continuity. For example, suppose the fault timeout in the prior art is 2 minutes, and the scheme of the embodiments of the present application determines at the 15th second that the device has failed. The switchover can then take place immediately instead of waiting until the 2-minute mark, which improves service continuity. Further, because the events and the event order in the regular event sequences are identical and predetermined, the accuracy of fault judgment can be improved.
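The arithmetic of the example above is simple enough to check directly. The helper below is a hypothetical illustration, using only the two figures given in the text (a 2-minute timeout and detection at the 15th second):

```python
def switchover_time_saved(timeout_s, detection_s):
    """Seconds gained by switching at detection time instead of at timeout."""
    return max(timeout_s - detection_s, 0)


saved = switchover_time_saved(timeout_s=120, detection_s=15)
print(saved)  # 105
```

In this example the switchover happens 105 seconds earlier than it would under the timeout-based approach.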
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.

Claims (8)

1. A fault determination method, characterized by comprising:
extracting measured values of N groups of regular event sequences in a normal state of a device, wherein the events and the event order of each group of regular event sequences in the N groups are identical and the event order is a predetermined order, and N is a positive integer;
marking the measured values of the N groups of regular event sequences with the normal state;
extracting measured values of M groups of regular event sequences in a fault state of the device, wherein the events and the event order of each group of regular event sequences in the M groups are identical to the events and the event order of each group of regular event sequences in the N groups, and M is a positive integer;
marking the measured values of the M groups of regular event sequences with the fault state;
training a pattern recognition algorithm on the marked measured values of the N groups of regular event sequences and of the M groups of regular event sequences;
classifying measured values of a regular event sequence to be determined using the trained pattern recognition algorithm, and determining, according to the classification result, whether the device has failed, wherein the events and the event order of the regular event sequence to be determined are identical to the events and the event order of each group of regular event sequences in the N groups.
2. The method according to claim 1, characterized in that the method further comprises:
according to the determination result of whether the device has failed, using the measured values of the regular event sequence to be determined as new training values to retrain the trained pattern recognition algorithm.
3. The method according to claim 1 or 2, characterized in that the events of each group of regular event sequences, in their event order, are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
4. The method according to any one of claims 1 to 3, characterized in that the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
5. A fault determination device, characterized by comprising:
an extraction unit, configured to extract measured values of N groups of regular event sequences in a normal state of a device, wherein the events and the event order of each group of regular event sequences in the N groups are identical and the event order is a predetermined order, and N is a positive integer; and to extract measured values of M groups of regular event sequences in a fault state of the device, wherein the events and the event order of each group of regular event sequences in the M groups are identical to the events and the event order of each group of regular event sequences in the N groups, and M is a positive integer;
a processing unit, configured to mark the measured values of the N groups of regular event sequences with the normal state; mark the measured values of the M groups of regular event sequences with the fault state; train a pattern recognition algorithm on the marked measured values of the N groups of regular event sequences and of the M groups of regular event sequences; classify measured values of a regular event sequence to be determined using the trained pattern recognition algorithm; and determine, according to the classification result, whether the device has failed, wherein the events and the event order of the regular event sequence to be determined are identical to the events and the event order of each group of regular event sequences in the N groups.
6. The device according to claim 5, characterized in that the processing unit is further configured to: according to the determination result of whether the device has failed, use the measured values of the regular event sequence to be determined as new training values to retrain the trained pattern recognition algorithm.
7. The device according to claim 5 or 6, characterized in that the events of each group of regular event sequences, in their event order, are: mainboard temperature, central processing unit (CPU) usage, and memory usage.
8. The device according to any one of claims 5 to 7, characterized in that the pattern recognition algorithm is a support vector machine (SVM) algorithm, a Bayesian network algorithm, or an association rule algorithm.
CN201410853612.5A 2014-12-31 2014-12-31 Fault determination method and device Active CN104572329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410853612.5A CN104572329B (en) 2014-12-31 2014-12-31 A kind of fault determination method and device

Publications (2)

Publication Number Publication Date
CN104572329A true CN104572329A (en) 2015-04-29
CN104572329B CN104572329B (en) 2018-12-25

Family

ID=53088469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410853612.5A Active CN104572329B (en) 2014-12-31 2014-12-31 Fault determination method and device

Country Status (1)

Country Link
CN (1) CN104572329B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050144537A1 (en) * 2003-11-12 2005-06-30 Siemens Corporate Research Inc. Method to use a receiver operator characteristics curve for model comparison in machine condition monitoring
US7424619B1 (en) * 2001-10-11 2008-09-09 The Trustees Of Columbia University In The City Of New York System and methods for anomaly detection and adaptive learning
CN102498445A (en) * 2009-09-17 2012-06-13 西门子公司 Supervised fault learning using rule-generated samples for machine condition monitoring

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107092540A (en) * 2017-04-10 2017-08-25 联想(北京)有限公司 processing method, device and equipment
CN107092540B (en) * 2017-04-10 2020-08-25 联想(北京)有限公司 Processing method, device and equipment
CN110943858A (en) * 2019-11-21 2020-03-31 中国联合网络通信集团有限公司 Fault positioning method and device
CN113127237A (en) * 2019-12-27 2021-07-16 北京金风慧能技术有限公司 Main fault identification method and system of wind generating set
JP2021163020A (en) * 2020-03-31 2021-10-11 日鉄ソリューションズ株式会社 Information processing device, information processing method and program
US11762726B2 (en) 2020-03-31 2023-09-19 Ns Solutions Corporation Information processing apparatus, information processing method, and program
JP7382882B2 (en) 2020-03-31 2023-11-17 日鉄ソリューションズ株式会社 Information processing device, information processing method and program
WO2022061813A1 (en) * 2020-09-27 2022-03-31 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for detecting event in communication network

Also Published As

Publication number Publication date
CN104572329B (en) 2018-12-25

Similar Documents

Publication Publication Date Title
CN104572329A (en) Fault determination method and device
US11294754B2 (en) System and method for contextual event sequence analysis
US9658916B2 (en) System analysis device, system analysis method and system analysis program
US20190095266A1 (en) Detection of Misbehaving Components for Large Scale Distributed Systems
CN104583968A (en) Management system and management program
EP2634696A3 (en) Information processing apparatus, control method, and control program
JP2009217382A (en) Failure analysis system, failure analysis method, failure analysis server, and failure analysis program
CN110275992B (en) Emergency processing method, device, server and computer readable storage medium
CN113313280B (en) Cloud platform inspection method, electronic equipment and nonvolatile storage medium
CN111540020B (en) Method and device for determining target behavior, storage medium and electronic device
CN114113984A (en) Fault drilling method, device, terminal equipment and medium based on chaotic engineering
CN110598797B (en) Fault detection method and device, storage medium and electronic device
CN113469137A (en) Abnormal behavior recognition method and device, storage medium and electronic device
CN102546652B (en) System and method for server load balancing
CN106294364A (en) Realize the method and apparatus that web crawlers captures webpage
CN105425739A (en) System for predicting abnormality occurrence using PLC log data
CN111159029A (en) Automatic testing method and device, electronic equipment and computer readable storage medium
CN109388544B (en) Fault monitoring method and device and electronic equipment
US20150149829A1 (en) Failure detecting apparatus and failure detecting method
CN109446398A (en) The method, apparatus and electronic equipment of intelligent measurement web crawlers behavior
CN113986618A (en) Cluster brain split automatic repairing method, system, device and storage medium
CN108196985A (en) A kind of storage system failure prediction method and device based on intelligent predicting
CN115189961A (en) Fault identification method, device, equipment and storage medium
AU2021240276A1 (en) Methods, apparatuses, devices and storage media for switching states of card games
CN104348641A (en) Fault detection method and fault detection device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant