CN110716820A - Fault diagnosis method based on decision tree algorithm - Google Patents

Fault diagnosis method based on decision tree algorithm

Info

Publication number
CN110716820A
Authority
CN
China
Prior art keywords
decision tree
fault
data
method based
tree algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910959230.3A
Other languages
Chinese (zh)
Inventor
许阿义
陈跃鸿
庄少波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Titanium Shang Artificial Intelligence Technology Co Ltd
Original Assignee
Xiamen Titanium Shang Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Titanium Shang Artificial Intelligence Technology Co Ltd
Priority to CN201910959230.3A
Publication of CN110716820A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Abstract

The invention provides a fault diagnosis method based on a decision tree algorithm and relates to the technical field of fault diagnosis. The method comprises the following steps: S1, collecting sample data; S2, classifying the collected samples to form a new set; S3, extracting key feature values of the sample set, fusing similar features, and updating the sample data; S4, establishing decision tree nodes and training on the sample data set; S5, pruning the decision tree; S6, generating the final decision tree and diagnosing faults; and S7, testing the diagnosis accuracy and correcting the decision tree in time. Diagnosing faults of sports equipment with the decision tree algorithm saves time and labor: the many parts of the equipment need not be checked one by one, the fault location can be found quickly, maintenance becomes more convenient, and diagnosis and maintenance costs are reduced.

Description

Fault diagnosis method based on decision tree algorithm
Technical Field
The invention relates to the technical field of fault diagnosis, in particular to a fault diagnosis method based on a decision tree algorithm.
Background
A decision tree algorithm constructs a decision tree to discover the classification rules implied in data. How to construct a decision tree with high precision and small scale is the core of the algorithm. Construction can be carried out in two steps. The first step is generation of the decision tree, that is, the process of generating the tree from a training sample set; in general, the training sample set is a historical data set, selected according to actual needs and reasonably comprehensive, that is used for data analysis and processing. The second step is pruning of the decision tree, that is, the process of checking, correcting, and repairing the tree generated in the previous stage: the preliminary rules produced during generation are checked against the data in a new sample data set (called the test data set), and the branches that harm prediction accuracy are pruned.
At present, faults of sports equipment are mostly diagnosed manually. The diagnosis process is time-consuming and labor-intensive: the many parts of the equipment must be checked one by one, and a great deal of time is usually spent locating the fault. This makes maintenance of the sports equipment inconvenient and increases diagnosis and maintenance costs. A fault diagnosis method based on a decision tree algorithm is therefore provided to overcome these defects of the prior art.
Disclosure of Invention
(I) Technical problem to be solved
In view of the defects of the prior art, the invention provides a fault diagnosis method based on a decision tree algorithm. It addresses the following problems: faults of sports equipment are diagnosed manually, the diagnosis process is time-consuming and labor-intensive, the many parts of the equipment must be checked one by one, a great deal of time is spent locating the fault, maintenance of the equipment is inconvenient, and diagnosis and maintenance costs are increased.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution: a fault diagnosis method based on a decision tree algorithm, comprising the following steps:
S1, collecting sample data;
S2, classifying the collected samples to form a new set;
S3, extracting key feature values of the sample set, fusing similar features, and updating the sample data;
S4, establishing decision tree nodes and training on the sample data set;
S5, pruning the decision tree;
S6, generating the final decision tree and diagnosing faults;
S7, testing the diagnosis accuracy and correcting the decision tree in time.
Preferably, in step S1, a big data capture algorithm is used to capture samples of historical fault analysis results, the fault analysis result samples accounting for more than 98% of the total database; meanwhile, a key phrase extraction algorithm is used to extract the valid samples from those collected and to screen out samples with irrelevant content.
Preferably, in step S2, all collected samples are classified by shared attribute values, the attributes including key phrase, fault type, fault analysis result, and failure factor. Similar samples are placed in the same set, the resulting sets are denoted P1, P2, P3, ..., Pi, Pj, and a sufficient number of samples is ensured in every set P1, P2, P3, ..., Pi, Pj.
Preferably, in step S3, all sample feature values in each set P are extracted and the similarity of the sample data is examined. The feature values of highly similar samples are fused into a single optimized sample, and all samples in the sets P1, P2, P3, ..., Pi, Pj are updated accordingly, so that the sample size of every set is optimized.
Preferably, in step S4, recursion is used to find the second most decisive feature in the set, continuing until all data in each sub-data set belong to the same class. At each node, one feature is selected from the features of the training data as the splitting criterion of the current node. Assume a sample space (X, Y) of a classified set, where X denotes the samples and Y denotes n classes with possible values W1, W2, ..., Wn, and let the probability of occurrence of each class be G(W1), G(W2), ..., G(Wn). The conditional gain ratio of each decision tree node is then calculated by the following formula:
Figure BDA0002228373430000031
Preferably, in step S5, when the decision tree is constructed, noise or isolated points in the training data cause many branches to reflect anomalies in the training data. Classifying data of unknown class with such a tree gives low accuracy, so these unnecessary branches are detected and pruned.
Preferably, in step S6, the final decision tree for fault diagnosis is established and used to analyze and diagnose faults of the sports equipment.
Preferably, in step S7, fault data are fed into the decision tree algorithm to diagnose the fault, the result is compared with the manual diagnosis result to assess the accuracy of the decision tree algorithm, and the algorithm is adjusted in time.
(III) Advantageous effects
The invention provides a fault diagnosis method based on a decision tree algorithm. The method has the following beneficial effects:
1. In the fault diagnosis method based on the decision tree algorithm, faults of the sports equipment are diagnosed with the decision tree algorithm, which saves time and labor: the many parts of the equipment need not be checked one by one, the fault location can be found quickly, maintenance becomes more convenient, and diagnosis and maintenance costs are reduced.
2. In the fault diagnosis method based on the decision tree algorithm, the accuracy of the decision tree algorithm is greatly improved through its optimization and continuous testing.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment:
As shown in FIG. 1, an embodiment of the present invention provides a fault diagnosis method based on a decision tree algorithm, including the following steps:
S1, collecting sample data;
S2, classifying the collected samples to form a new set;
S3, extracting key feature values of the sample set, fusing similar features, and updating the sample data;
S4, establishing decision tree nodes and training on the sample data set;
S5, pruning the decision tree;
S6, generating the final decision tree and diagnosing faults;
S7, testing the diagnosis accuracy and correcting the decision tree in time.
In step S1, a big data capture algorithm is used to capture samples of historical fault analysis results, the fault analysis result samples accounting for more than 98% of the total database; meanwhile, a key phrase extraction algorithm is used to extract the valid samples from those collected and to screen out samples with irrelevant content.
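As a non-limiting illustration of the screening in step S1, the following Python sketch keeps only the records that mention at least one key phrase. The embodiment does not specify the key phrase extraction algorithm, so plain case-insensitive substring matching is assumed here; the function name `screen_samples`, the `text` field, and the sample records are hypothetical.

```python
def screen_samples(samples, key_phrases):
    """Keep only samples whose text mentions at least one key phrase."""
    phrases = [p.lower() for p in key_phrases]
    kept = []
    for sample in samples:
        text = sample["text"].lower()
        # Assumption: a sample is "valid" if any key phrase occurs in its text.
        if any(p in text for p in phrases):
            kept.append(sample)
    return kept


# Hypothetical historical records; field names are illustrative only.
records = [
    {"id": 1, "text": "Motor overheating after 20 minutes of use"},
    {"id": 2, "text": "Customer asked about delivery time"},
    {"id": 3, "text": "Belt slipping, abnormal noise from motor"},
]
valid = screen_samples(records, ["motor", "belt"])
print([r["id"] for r in valid])  # [1, 3]
```

Record 2 is screened out because it contains none of the key phrases; a real key phrase extraction algorithm would likely be more elaborate than this substring test.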
In step S2, all collected samples are classified by shared attribute values, the attributes including key phrase, fault type, fault analysis result, and failure factor. Similar samples are placed in the same set, the resulting sets are denoted P1, P2, P3, ..., Pi, Pj, and a sufficient number of samples is ensured in every set P1, P2, P3, ..., Pi, Pj.
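The grouping in step S2 might be sketched as follows. This is only an illustration under stated assumptions: samples are dictionaries, a set P is identified by the tuple of its shared attribute values, and "sufficient sample amount" is modelled by a hypothetical `min_size` parameter that the embodiment does not define.

```python
from collections import defaultdict


def partition_samples(samples, attributes, min_size):
    """Group samples sharing the same attribute values into sets P1, P2, ..."""
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sample[name] for name in attributes)
        groups[key].append(sample)
    # Flag sets that do not yet hold enough samples for training.
    undersized = [key for key, members in groups.items() if len(members) < min_size]
    return dict(groups), undersized


# Hypothetical samples; the attribute names are illustrative only.
samples = [
    {"fault_type": "motor", "failure_factor": "overheating"},
    {"fault_type": "motor", "failure_factor": "overheating"},
    {"fault_type": "belt", "failure_factor": "wear"},
]
sets, undersized = partition_samples(
    samples, ["fault_type", "failure_factor"], min_size=2
)
print(len(sets), undersized)  # 2 [('belt', 'wear')]
```

Sets reported as undersized would need more collected samples before training.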
In step S3, all sample feature values in each set P are extracted and the similarity of the sample data is examined. The feature values of highly similar samples are fused into a single optimized sample, and all samples in the sets P1, P2, P3, ..., Pi, Pj are updated accordingly, so that the sample size of every set is optimized.
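One possible reading of the fusion in step S3 is sketched below. The embodiment names neither a similarity measure nor a fusion rule, so the sketch assumes numeric feature vectors, Euclidean distance as the similarity measure, a hypothetical `threshold`, and averaging as the fusion rule.

```python
import math


def fuse_similar(vectors, threshold):
    """Greedily merge feature vectors that lie within `threshold` of each other."""
    fused = []  # each entry is [running_mean_vector, number_of_merged_samples]
    for vec in vectors:
        for entry in fused:
            mean, count = entry
            if math.dist(mean, vec) <= threshold:
                # Fuse by updating the running mean of the matched group.
                entry[0] = [(m * count + x) / (count + 1) for m, x in zip(mean, vec)]
                entry[1] = count + 1
                break
        else:
            # No sufficiently similar group exists: start a new one.
            fused.append([list(vec), 1])
    return [mean for mean, _ in fused]


features = [[1.0, 1.0], [1.0, 1.5], [5.0, 5.0]]
print(fuse_similar(features, threshold=1.0))  # [[1.0, 1.25], [5.0, 5.0]]
```

The first two vectors are close enough to be fused into their mean, while the third stays separate, so the set shrinks from three samples to two.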
In step S4, recursion is used to find the second most decisive feature in the set, continuing until all data in each sub-data set belong to the same class. At each node, one feature is selected from the features of the training data as the splitting criterion of the current node. Assume a sample space (X, Y) of a classified set, where X denotes the samples and Y denotes n classes with possible values W1, W2, ..., Wn, and let the probability of occurrence of each class be G(W1), G(W2), ..., G(Wn). The conditional gain ratio of each decision tree node is then calculated by the following formula:
Figure BDA0002228373430000061
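The formula itself is available here only as an image reference. As a hedged illustration of step S4, the sketch below assumes the gain ratio commonly used in C4.5-style trees: information gain, computed from the Shannon entropy of the class probabilities G(W1), ..., G(Wn), divided by the split information of the feature. This may differ from the formula in the figure. The nested-dictionary node layout and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of the class distribution G(W1), ..., G(Wn)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Information gain of `feature` divided by its split information."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    weights = [len(s) / total for s in subsets.values()]
    conditional = sum(w * entropy(s) for w, s in zip(weights, subsets.values()))
    split_info = -sum(w * math.log2(w) for w in weights)
    if split_info == 0:  # the feature takes a single value: no useful split
        return 0.0
    return (entropy(labels) - conditional) / split_info


def build_tree(rows, labels, features):
    """Recursively split on the best feature until every subset is one class."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a class label
    best = max(features, key=lambda f: gain_ratio(rows, labels, f))
    if gain_ratio(rows, labels, best) < 1e-12:
        return majority  # no feature separates the classes any further
    node = {"feature": best, "majority": majority, "children": {}}
    remaining = [f for f in features if f != best]
    for value in {row[best] for row in rows}:
        pairs = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in pairs], [y for _, y in pairs], remaining
        )
    return node


# Hypothetical training data: "noise" determines the fault, "color" does not.
rows = [
    {"noise": "yes", "color": "red"},
    {"noise": "yes", "color": "blue"},
    {"noise": "no", "color": "red"},
    {"noise": "no", "color": "blue"},
]
labels = ["bearing", "bearing", "ok", "ok"]
tree = build_tree(rows, labels, ["noise", "color"])
print(tree["feature"])  # noise
```

In this toy data the gain ratio of `noise` is 1 and that of `color` is 0, so the root splits on `noise` and both children are pure leaves.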
In step S5, when the decision tree is constructed, noise or isolated points in the training data cause many branches to reflect anomalies in the training data. Classifying data of unknown class with such a tree gives low accuracy, so these unnecessary branches are detected and pruned.
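The embodiment does not name a pruning strategy. One common choice that matches the idea of checking branches against separate test data is reduced-error pruning, sketched below as an assumption rather than as the claimed method. A tree is represented as nested dictionaries with hypothetical keys `feature`, `majority`, and `children`; a leaf is simply a class label.

```python
def predict(tree, row):
    """Walk the tree for one sample; fall back to the majority class if needed."""
    while isinstance(tree, dict):
        child = tree["children"].get(row[tree["feature"]])
        if child is None:
            return tree["majority"]
        tree = child
    return tree


def accuracy(tree, rows, labels):
    """Fraction of samples whose predicted class equals the given label."""
    hits = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


def prune(tree, rows, labels):
    """Bottom-up reduced-error pruning against held-out (rows, labels).

    Note: the tree passed in is modified in place.
    """
    if not isinstance(tree, dict):
        return tree
    for value, child in list(tree["children"].items()):
        pairs = [(r, y) for r, y in zip(rows, labels) if r[tree["feature"]] == value]
        tree["children"][value] = prune(
            child, [r for r, _ in pairs], [y for _, y in pairs]
        )
    if not labels:
        return tree  # no held-out data reaches this node: leave it unchanged
    leaf = tree["majority"]
    # Replace the subtree by a leaf if that does not lower held-out accuracy.
    if accuracy(leaf, rows, labels) >= accuracy(tree, rows, labels):
        return leaf
    return tree


# Hypothetical tree whose "color" branch was fitted to noise in the training data.
noisy_tree = {
    "feature": "noise",
    "majority": "ok",
    "children": {
        "no": "ok",
        "yes": {
            "feature": "color",
            "majority": "bearing",
            "children": {"red": "bearing", "blue": "ok"},
        },
    },
}
test_rows = [
    {"noise": "yes", "color": "red"},
    {"noise": "yes", "color": "blue"},
    {"noise": "no", "color": "red"},
    {"noise": "no", "color": "blue"},
]
test_labels = ["bearing", "bearing", "ok", "ok"]
pruned = prune(noisy_tree, test_rows, test_labels)
print(pruned["children"]["yes"])  # bearing
```

The `color` subtree misclassifies one of the two held-out samples that reach it, whereas a single `bearing` leaf classifies both correctly, so the subtree is collapsed; the root split on `noise` is kept because removing it would lower accuracy.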
In step S6, the final decision tree for fault diagnosis is established and used to analyze and diagnose faults of the sports equipment.
In step S7, fault data are fed into the decision tree algorithm to diagnose the fault, the result is compared with the manual diagnosis result to assess the accuracy of the decision tree algorithm, and the algorithm is adjusted in time.
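The comparison in step S7 can be illustrated with a short sketch. It assumes that tree diagnoses and manual diagnoses are available as two equally long lists of labels; the function name, the labels, and the 0.9 correction threshold are hypothetical and do not come from the embodiment.

```python
def compare_with_manual(tree_results, manual_results):
    """Return the agreement rate and the indices where the diagnoses differ."""
    if len(tree_results) != len(manual_results) or not manual_results:
        raise ValueError("both result lists must be non-empty and of equal length")
    mismatches = [
        i for i, (t, m) in enumerate(zip(tree_results, manual_results)) if t != m
    ]
    agreement = 1 - len(mismatches) / len(manual_results)
    return agreement, mismatches


tree_results = ["bearing", "belt", "ok", "motor"]
manual_results = ["bearing", "belt", "motor", "motor"]
agreement, mismatches = compare_with_manual(tree_results, manual_results)
print(agreement, mismatches)  # 0.75 [2]

# Hypothetical rule: correct (for example, retrain or re-prune) the tree
# when agreement with manual diagnosis falls below 0.9.
needs_correction = agreement < 0.9
```

The mismatching cases (here the sample at index 2) are natural candidates to add to the training set before the tree is corrected.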
The fault of the sports equipment is diagnosed by utilizing the decision tree algorithm, so that the diagnosis process is time-saving and labor-saving, a plurality of parts of the sports equipment do not need to be eliminated one by one, the position of the fault can be quickly found, much convenience is brought to the maintenance of the sports equipment, and the diagnosis and maintenance cost is reduced.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. A fault diagnosis method based on a decision tree algorithm, characterized in that the method comprises the following steps:
S1, collecting sample data;
S2, classifying the collected samples to form a new set;
S3, extracting key feature values of the sample set, fusing similar features, and updating the sample data;
S4, establishing decision tree nodes and training on the sample data set;
S5, pruning the decision tree;
S6, generating the final decision tree and diagnosing faults;
S7, testing the diagnosis accuracy and correcting the decision tree in time.
2. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S1, a big data capture algorithm is used to capture samples of historical fault analysis results, the fault analysis result samples accounting for more than 98% of the total database; meanwhile, a key phrase extraction algorithm is used to extract the valid samples from those collected and to screen out samples with irrelevant content.
3. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S2, all collected samples are classified by shared attribute values, the attributes including key phrase, fault type, fault analysis result, and failure factor; similar samples are placed in the same set, the resulting sets are denoted P1, P2, P3, ..., Pi, Pj, and a sufficient number of samples is ensured in every set P1, P2, P3, ..., Pi, Pj.
4. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S3, all sample feature values in each set P are extracted and the similarity of the sample data is examined; the feature values of highly similar samples are fused into a single optimized sample, and all samples in the sets P1, P2, P3, ..., Pi, Pj are updated accordingly, so that the sample size of every set is optimized.
5. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S4, recursion is used to find the second most decisive feature in the set, continuing until all data in each sub-data set belong to the same class; at each node, one feature is selected from the features of the training data as the splitting criterion of the current node; a sample space (X, Y) of a classified set is assumed, where X denotes the samples and Y denotes n classes with possible values W1, W2, ..., Wn, the probability of occurrence of each class being G(W1), G(W2), ..., G(Wn); and the conditional gain ratio of each decision tree node is calculated by the following formula:
Figure FDA0002228373420000021
6. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S5, when the decision tree is constructed, noise or isolated points in the training data cause many branches to reflect anomalies in the training data; classifying data of unknown class with such a tree gives low accuracy, so these unnecessary branches are detected and pruned.
7. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S6, the final decision tree for fault diagnosis is established and used to analyze and diagnose faults of the sports equipment.
8. The fault diagnosis method based on a decision tree algorithm according to claim 1, characterized in that: in step S7, fault data are fed into the decision tree algorithm to diagnose the fault, the result is compared with the manual diagnosis result to assess the accuracy of the decision tree algorithm, and the algorithm is adjusted in time.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910959230.3A CN110716820A (en) 2019-10-10 2019-10-10 Fault diagnosis method based on decision tree algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910959230.3A CN110716820A (en) 2019-10-10 2019-10-10 Fault diagnosis method based on decision tree algorithm

Publications (1)

Publication Number Publication Date
CN110716820A 2020-01-21

Family

ID=69211364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910959230.3A Pending CN110716820A (en) 2019-10-10 2019-10-10 Fault diagnosis method based on decision tree algorithm

Country Status (1)

Country Link
CN (1) CN110716820A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111562111A (en) * 2020-06-05 2020-08-21 上海交通大学 Engine cold state test fault diagnosis method
CN112733775A (en) * 2021-01-18 2021-04-30 苏州大学 Hyperspectral image classification method based on deep learning
CN113256176A (en) * 2021-07-06 2021-08-13 北京全路通信信号研究设计院集团有限公司 Dispatching command compiling system and method for railway locomotive application state conversion

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104506340A (en) * 2014-11-21 2015-04-08 河南中烟工业有限责任公司 Creation method of decision tree in industrial Ethernet fault diagnosis method
US20180238951A1 (en) * 2016-09-07 2018-08-23 Jiangnan University Decision Tree SVM Fault Diagnosis Method of Photovoltaic Diode-Clamped Three-Level Inverter
CN109218114A (en) * 2018-11-12 2019-01-15 西安微电子技术研究所 A kind of server failure automatic checkout system and detection method based on decision tree
CN109522957A (en) * 2018-11-16 2019-03-26 上海海事大学 The method of harbour gantry crane machine work status fault classification based on decision Tree algorithms
CN110188834A (en) * 2019-06-04 2019-08-30 广东电网有限责任公司 A kind of method for diagnosing faults of power telecom network, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200121)