CN114722915A - Fault diagnosis method and system based on ADASYN algorithm and random forest algorithm


Info

Publication number
CN114722915A
Authority
CN
China
Prior art keywords
fault
data set
random forest
algorithm
wavelet packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210267057.2A
Other languages
Chinese (zh)
Other versions
CN114722915B (en)
Inventor
舒一飞
樊博
刘鹏
康洁滢
郭汶昇
马智强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marketing Service Center Of State Grid Ningxia Electric Power Co ltd Metering Center Of State Grid Ningxia Electric Power Co ltd
Original Assignee
Marketing Service Center Of State Grid Ningxia Electric Power Co ltd Metering Center Of State Grid Ningxia Electric Power Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketing Service Center Of State Grid Ningxia Electric Power Co ltd Metering Center Of State Grid Ningxia Electric Power Co ltd
Priority to CN202210267057.2A
Publication of CN114722915A
Application granted
Publication of CN114722915B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/28 Testing of electronic circuits, e.g. by signal tracer
    • G01R31/2801 Testing of printed circuits, backplanes, motherboards, hybrid circuits or carriers for multichip packages [MCP]
    • G01R31/281 Specific types of tests or tests for a specific type of fault, e.g. thermal mapping, shorts testing
    • G01R31/2812 Checking for open circuits or shorts, e.g. solder bridges; Testing conductivity, resistivity or impedance
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/28 Testing of electronic circuits, e.g. by signal tracer
    • G01R31/2832 Specific tests of electronic circuits not provided for elsewhere
    • G01R31/2836 Fault-finding or characterising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a fault diagnosis method based on an ADASYN algorithm and a random forest algorithm, and belongs to the technical field of fault diagnosis. The method comprises the following steps: collecting fault current of an internal circuit of the intelligent household appliance, and recording the fault type of the fault current; carrying out multi-layer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current into a data set; preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set; constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set; and diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model.

Description

Fault diagnosis method and system based on ADASYN algorithm and random forest algorithm
Technical Field
The invention relates to the technical field of fault diagnosis, in particular to a fault diagnosis method and system based on an ADASYN algorithm and a random forest algorithm.
Background
With the continuing development of science and information technology, intelligent household appliances are used ever more widely. Unlike traditional household appliances, most intelligent household appliances have a complex internal structure, and faults in their internal circuits, which involve digital signal control, are difficult to locate by traditional methods.
In the prior art, one approach diagnoses circuit faults by Fourier analysis: time-domain information is first converted to the frequency domain by the Fourier transform, and fault features are then selected in the frequency domain to realize fault diagnosis; however, this method is only suitable for open-circuit fault analysis. The prior art also provides a circuit fault diagnosis method that combines an optimized wavelet packet with an extreme learning machine, in which the extreme learning machine classifies and identifies the faults; this method has the advantage of a short training time, but its results are still biased by unbalanced sample distribution.
Disclosure of Invention
In view of this, the invention provides a fault diagnosis method and system based on an ADASYN algorithm and a random forest algorithm, which diagnose internal circuit faults of an intelligent household appliance by combining the ADASYN algorithm with the random forest algorithm. The method and system are suitable for circuit fault analysis in the three states of closed circuit, open circuit and short circuit, have the advantage of a balanced sample size across fault types, and can diagnose internal circuit faults of the intelligent household appliance with higher diagnosis precision.
The technical scheme adopted by the embodiment of the invention for solving the technical problem is as follows:
In one aspect, the invention provides a fault diagnosis method based on an ADASYN algorithm and a random forest algorithm, which comprises the following steps:
step S1, collecting fault current of an internal circuit of the intelligent household appliance, and recording the fault type of the fault current;
step S2, carrying out multi-layer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current into a data set;
step S3, preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
step S4, constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set;
and step S5, diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
Preferably, in step S1, collecting the fault current of the internal circuit of the intelligent household appliance and recording the fault type of the fault current includes:
establishing a circuit model of the internal circuit of the intelligent household appliance by using PSCAD simulation software;
setting the fault type and operating the circuit model;
and collecting and recording the fault current of the circuit model in the fault type state.
Preferably, in step S2, performing multi-layer wavelet packet decomposition on the fault current, calculating the normalized feature vector of the fault current, and recording the normalized feature vector and the corresponding fault type into a data set includes:
step 21, constructing at least one wavelet packet decomposition layer, wherein each layer of wavelet packet decomposition is expressed recursively as:
$$W_{j+1}^{2n}(t)=\sum_{k}h(k-2t)\,W_{j}^{n}(k),\qquad W_{j+1}^{2n+1}(t)=\sum_{k}g(k-2t)\,W_{j}^{n}(k)$$
wherein $W_{j}^{n}$ is the wavelet packet of the nth node in the jth wavelet packet decomposition layer, $j\in[1,J]$, $n\in[0,2^{j}-1]$, J is the total number of layers, k is a translation variable, $k\in(-\infty,+\infty)$, t is a time variable, the function $h(k-2t)$ is the output value of the low-pass filter, and the function $g(k-2t)$ is the output value of the high-pass filter;
step 22, using the wavelet packet decomposition layers to perform multi-layer wavelet packet decomposition on the fault current to obtain the node energy $E_{j,n}$:
$$E_{j,n}=\sum_{p=1}^{m}\left|d_{j,n}(p)\right|^{2}$$
wherein $E_{j,n}$ is the node energy of the nth node in the jth wavelet packet decomposition layer, $W_{(j,n)}$ is the node signal of the nth node in the jth wavelet packet decomposition layer, $d_{j,n}(p)$ is the pth coefficient obtained by decomposing the node signal $W_{(j,n)}$, $p\in[1,m]$, and m is the frequency band length of the nth node in the jth wavelet packet decomposition layer;
step 23, according to the node energy $E_{j,n}$, establishing a hierarchical energy feature vector $E_{j}$:
$$E_{j}=\left[E_{j,0},\,E_{j,1},\,\ldots,\,E_{j,2^{j}-1}\right]$$
wherein $E_{j}$ is the hierarchical energy feature vector of the jth wavelet packet decomposition layer;
step 24, according to the hierarchical energy feature vector $E_{j}$, calculating the normalized feature vector X:
Figure BDA0003550123410000033
step 25, recording the normalized feature vector X and the corresponding fault type into the data set.
Preferably, in step S3, preprocessing the data set by the ADASYN algorithm to obtain the preprocessed data set includes:
step S31, calculating, from the data set, the number $G_{s}$ of samples that need to be generated:
$$G_{s}=(m_{s\text{-}max}-m_{s})\times\beta$$
wherein $m_{s\text{-}max}$ is the number of samples of the fault type s-max, the fault type with the largest total number of samples in the data set; $m_{s}$ is the number of samples of a fault type s other than s-max, $s\in[1,S-1]$; S is the number of fault types in the data set; β is a sample factor, $\beta\in[0,1]$;
step S32, randomly selecting an ith sample belonging to the fault type s, and calculating the proportion $r_{i}$ of samples belonging to the fault type s-max among the K neighboring samples of the ith sample:
$$r_{i}=\frac{\Delta_{i}}{K}$$
wherein $\Delta_{i}$ is the number of samples belonging to the fault type s-max among the K neighboring samples of the ith sample, $i\in[1,m_{s}]$;
step S33, normalizing the data ratio $r_{i}$ to obtain the normalized data ratio $\hat{r}_{i}$;
step S34, calculating the number $g_{s}$ of new samples finally required for the fault type s, by the following formula:
Figure BDA0003550123410000043
step S35, calculating $g_{s}$ new samples $s_{i}$ belonging to the fault type s and adding them to the data set to obtain the preprocessed data set, wherein a new sample $s_{i}$ is expressed as:
$$s_{i}=x_{i}+(x_{zi}-x_{i})\times\lambda$$
wherein $x_{zi}$ is one of the K neighboring samples of the ith sample $x_{i}$, the sample $x_{zi}$ does not belong to the fault type s-max, and λ is a random number.
Preferably, in step S4, constructing the random forest fault diagnosis model by the random forest algorithm based on the preprocessed data set includes:
step S41, resampling the preprocessed data set by the Bootstrap method to obtain a training data set;
step S42, constructing a root node, selecting $a$ fault types as decision tree features, and generating a decision tree based on all the normalized feature vectors corresponding to the $a$ fault types in the training data set;
step S43, repeating step S41 and step S42 to obtain C decision trees;
step S44, randomly extracting a normalized feature vector from the preprocessed data set as a test sample;
step S45, performing fault type voting on the test sample with the C decision trees, and verifying whether the fault type with the most votes is consistent with the actual fault type of the test sample, wherein the voting result R is expressed as:
$$R=\underset{y}{\arg\max}\sum_{c=1}^{C}I\left[r_{c}(x)=y\right]$$
wherein x is the training data set, y is the actual fault type of the test sample, $r_{c}(x)$ is the decision tree model, $c\in[1,C]$, $I[\cdot]$ is the indicator function, and $\arg\max(\cdot)$ returns the argument that maximizes the expression;
step S46, if they are inconsistent, adjusting the parameters of the C decision trees and then repeating step S45; if they are consistent, construction of the random forest fault diagnosis model is complete.
Preferably, in step S5, diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model includes:
calculating a normalized feature vector of the fault current to be diagnosed;
based on the normalized feature vector of the fault current to be diagnosed, performing fault type voting through the random forest fault diagnosis model and counting voting results;
and selecting the fault type with the most votes as the diagnosis result of the fault current to be diagnosed.
Preferably, the fault types include short circuit faults, disconnection faults, series arc faults, and leakage current faults.
In another aspect, the invention also provides a fault diagnosis system based on the ADASYN algorithm and the random forest algorithm, which comprises:
the fault current acquisition module is used for acquiring fault current of an internal circuit of the intelligent household appliance and recording the fault type of the fault current;
the data set construction module is used for carrying out multilayer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current to a data set;
the data set preprocessing module is used for preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
the fault diagnosis module is used for constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set; and the fault type of the fault current to be diagnosed is diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
Preferably, the data set construction module comprises:
the wavelet packet decomposition layer construction submodule is used for constructing not less than one wavelet packet decomposition layer;
the node energy calculation submodule is used for carrying out multi-layer wavelet packet decomposition on the fault current by utilizing the wavelet packet decomposition layer to obtain node energy of each wavelet packet;
the data set construction submodule is used for establishing a level energy characteristic vector according to the node energy of each wavelet packet and performing normalization calculation to obtain a normalization characteristic vector; recording the normalized feature vectors and the corresponding fault types to the data set.
Preferably, the fault diagnosis module comprises:
the resampling submodule is used for resampling the preprocessed data set by a Bootstrap method to obtain a training data set;
the decision tree generation submodule is used for constructing a root node, selecting a fault types as decision tree characteristics, and generating a decision tree based on all the normalized characteristic vectors corresponding to the fault types a in the training data set;
the verification submodule is used for randomly extracting a normalized feature vector from the preprocessed data set as a test sample, performing fault type voting on the test sample with the C decision trees, and verifying whether the fault type with the most votes is consistent with the actual fault type of the test sample; if they are consistent, construction of the random forest fault diagnosis model is complete.
According to the above technical scheme, the fault diagnosis method based on the ADASYN algorithm and the random forest algorithm provided by the embodiment of the invention collects the fault current of the internal circuit of the intelligent household appliance and records the fault type of the fault current; performs multi-layer wavelet packet decomposition on the fault current, calculates the normalized feature vector of the fault current, and records the normalized feature vector and the corresponding fault type into a data set; preprocesses the data set by the ADASYN algorithm to obtain a preprocessed data set; constructs a random forest fault diagnosis model by the random forest algorithm based on the preprocessed data set; and finally diagnoses the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model. The method is suitable for analyzing circuit faults in the three states of closed circuit, open circuit and short circuit, has the advantage of a balanced sample size across fault types, and can diagnose internal circuit faults of the intelligent household appliance with higher diagnosis precision.
Drawings
FIG. 1 is a flow chart of the steps of the fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to the present invention;
FIG. 2(a) is a short-circuit fault current waveform diagram;
FIG. 2(b) is a broken line fault current waveform diagram;
FIG. 2(c) is a series arc fault current waveform diagram;
FIG. 2(d) is a leakage current fault current waveform diagram;
FIG. 3 is a schematic diagram of the three-layer wavelet packet decomposition provided in this embodiment;
FIG. 4 is a schematic structural diagram of a fault diagnosis system based on the ADASYN algorithm and the random forest algorithm according to the present invention;
FIG. 5(a) shows wavelet packet decomposition waveforms for different failure conditions at node 1;
FIG. 5(b) shows wavelet packet decomposition waveforms for different failure conditions at node 32;
FIG. 6 is a graph comparing the accuracy of different algorithms for various fault types in an embodiment of the present invention;
FIG. 7 is a graph comparing recall rates for various types of faults for different algorithms in an embodiment of the present invention;
FIG. 8 is a graph comparing F1 values for various fault types for different algorithms in an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments: various changes will be apparent to those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims, and all inventions and creations that make use of the inventive concept are protected.
As shown in fig. 1, an embodiment of the present invention provides a fault diagnosis method based on ADASYN algorithm and random forest algorithm, including the following steps:
step S1, collecting the fault current of the internal circuit of the intelligent household appliance, and recording the fault type of the fault current;
step S2, carrying out multi-layer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current into a data set;
step S3, preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
step S4, constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set;
and step S5, diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
Preferably, step S1 is specifically:
step S11, establishing a circuit model of the internal circuit of the intelligent household appliance by using PSCAD simulation software;
step S12, setting fault type and operating circuit model;
and step S13, acquiring and recording the fault current of the circuit model in the fault type state.
In practice, PSCAD simulation software is used to construct a model of the internal circuit of the intelligent household appliance, with a supply voltage of 220 V (effective value), a total simulation time of 5 s and a measurement frequency of 1 kHz. Faults are simulated with this model and, for each fault, the current signal of the faulty branch is collected; the fault current waveforms of the faulty branch are shown in FIG. 2.
When collecting data, the fault types may include a short circuit fault F1, a disconnection fault F2, a series arc fault F3, a leakage current fault F4, and the like.
The specific process of performing multi-layer wavelet packet decomposition on the fault current and calculating the normalized feature vector of the fault current in step S2 includes:
step 21, constructing at least one wavelet packet decomposition layer, wherein each layer of wavelet packet decomposition is expressed recursively as:
$$W_{j+1}^{2n}(t)=\sum_{k}h(k-2t)\,W_{j}^{n}(k),\qquad W_{j+1}^{2n+1}(t)=\sum_{k}g(k-2t)\,W_{j}^{n}(k)$$
wherein $W_{j}^{n}$ is the wavelet packet of the nth node in the jth wavelet packet decomposition layer, $j\in[1,J]$, $n\in[0,2^{j}-1]$, J is the total number of layers, k is the translation variable, $k\in(-\infty,+\infty)$, t is the time variable, the function $h(k-2t)$ is the low-pass filter output, and the function $g(k-2t)$ is the high-pass filter output; $W_{j+1}^{2n}$ is the wavelet packet of the 2nth node in the (j+1)th wavelet packet decomposition layer, and $W_{j+1}^{2n+1}$ is the wavelet packet of the (2n+1)th node in the (j+1)th wavelet packet decomposition layer.
In practice, too many wavelet packet decomposition layers increase the computational complexity, reduce diagnosis efficiency and easily cause signal distortion; 3 to 6 wavelet packet decomposition layers are a suitable compromise that preserves efficiency. In the embodiment of the present invention, J = 3 wavelet packet decomposition layers are taken as an example, and a schematic diagram of the three-layer wavelet packet decomposition is shown in FIG. 3, where W is the original time-domain signal and $W_{(j,n)}$ (j = 0, 1, 2, 3; n = 0, 1, 2, 3, 4, 5, 6, 7) is the node signal of the nth node in the jth wavelet packet decomposition layer.
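As a rough illustration of this trade-off, the sketch below computes how many terminal nodes a J-layer decomposition produces and how wide each node's frequency band is. It assumes the 1 kHz measurement frequency stated for the simulation and ideal, equal-width sub-bands; the function name `wavelet_packet_bands` is hypothetical and not part of the patent.

```python
def wavelet_packet_bands(sample_rate_hz, levels):
    """Return (number of terminal nodes, ideal bandwidth of each node in Hz)."""
    nyquist_hz = sample_rate_hz / 2.0      # highest representable frequency
    n_nodes = 2 ** levels                  # each layer doubles the node count
    return n_nodes, nyquist_hz / n_nodes


# Compare the layer counts mentioned in the text (3 to 6 layers).
for J in (3, 4, 5, 6):
    nodes, width = wavelet_packet_bands(1000.0, J)
    print(f"J={J}: {nodes} terminal nodes, {width:.3f} Hz per node")
```

Each extra layer doubles the feature dimension and halves the band width, which is why the layer count is kept small.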
Step 22, using the wavelet packet decomposition layers to perform multi-layer wavelet packet decomposition on the fault current to obtain the node energy $E_{j,n}$:
$$E_{j,n}=\sum_{p=1}^{m}\left|d_{j,n}(p)\right|^{2}$$
wherein $E_{j,n}$ is the node energy of the nth node in the jth wavelet packet decomposition layer, $W_{(j,n)}$ is the node signal of the nth node in the jth wavelet packet decomposition layer, $d_{j,n}(p)$ is the pth coefficient obtained by decomposing the node signal $W_{(j,n)}$, $p\in[1,m]$, and m is the frequency band length of the nth node in the jth wavelet packet decomposition layer;
step 23, according to the node energy $E_{j,n}$, establishing a hierarchical energy feature vector $E_{j}$:
$$E_{j}=\left[E_{j,0},\,E_{j,1},\,\ldots,\,E_{j,2^{j}-1}\right]$$
wherein $E_{j}$ is the hierarchical energy feature vector of the jth wavelet packet decomposition layer, and $E_{j,2^{j}-1}$ is the node energy of the $(2^{j}-1)$th node of the jth wavelet packet decomposition layer;
step 24, according to the hierarchical energy feature vector $E_{j}$, calculating the normalized feature vector X:
Figure BDA0003550123410000102
step 25, recording the normalized feature vector X and the corresponding fault type into the data set.
In practice, the original current signal has a high dimensionality. To reduce training time and resource consumption, features are extracted from the original current signal: the wavelet packet decomposition layers extract wavelet packet energy features as the feature vector, converting the original high-dimensional current signal into a low-dimensional feature vector, which reduces the amount of calculation and shortens the training time. The energy on each node of the last layer, extracted by wavelet packet decomposition, serves as the feature vector; it reveals more detail of the original signal and facilitates subsequent feature extraction.
Extracting the wavelet packet energy features as the feature vector converts the original high-dimensional current signal into a low-dimensional feature vector, addresses the problem that high-order harmonics may be present in the time domain, improves the stability of fault feature extraction, and reduces the amount of calculation.
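Steps 21 to 25 can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not name a wavelet basis, so the Haar filters are used here purely for simplicity; the normalization divides each node energy by the total energy, which is only one plausible reading of the normalization step; and the function names are hypothetical.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return (low-pass, high-pass) half-length coefficients."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return low, high


def wavelet_packet_energy_features(signal, levels):
    """Return normalized energies of the 2**levels terminal nodes (natural node order)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])  # node n splits into nodes 2n and 2n+1
        nodes = next_nodes
    # Node energy: sum of squared coefficients of each terminal node.
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    if total == 0.0:
        return [0.0] * len(energies)
    # Assumed normalization: each node's share of the total energy.
    return [e / total for e in energies]


# Hypothetical test signal: a 50 Hz sine sampled at 1 kHz, 1024 samples.
current = [math.sin(2.0 * math.pi * 50.0 * n / 1000.0) for n in range(1024)]
features = wavelet_packet_energy_features(current, levels=5)
print(len(features), round(sum(features), 6))
```

A 5-layer decomposition turns a record of any admissible length into a 32-element vector, which is the dimensionality reduction the paragraph above describes. Note that the nodes come out in natural (filter-bank) order, not in order of increasing frequency.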
Step S3 uses the ADASYN algorithm to oversample the training set formed by the feature vectors of the minority-class samples, so as to solve the data imbalance problem. The specific steps include:
step S31, calculating, from the data set, the number $G_{s}$ of samples that need to be generated for the fault type s:
$$G_{s}=(m_{s\text{-}max}-m_{s})\times\beta \qquad (5)$$
wherein $m_{s\text{-}max}$ is the number of samples of the fault type s-max, the fault type with the largest total number of samples in the data set; $m_{s}$ is the number of samples of a fault type s other than s-max, $s\in[1,S-1]$; S is the number of fault types in the data set; β is the sample factor, $\beta\in[0,1]$. In practice, when β = 1, the number of samples of each other fault type after synthesis equals that of the fault type s-max.
In the embodiment of the invention, 9100 groups of data are collected in total: 1500 groups of short-circuit faults, 1200 groups of disconnection faults, 1400 groups of series arc faults and 5000 groups of leakage current faults. Of these, 6370 groups are randomly selected as training samples and the remaining 2730 groups serve as test samples. After 5-layer wavelet packet decomposition, a 32-dimensional fault feature vector is obtained from each group of samples.
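To make equation (5) concrete, the sketch below applies it to the class counts just listed. This is illustrative arithmetic only: the embodiment oversamples the training split, whose per-class counts are not given, so using the full collected counts here is an assumption, and the function name `samples_to_generate` is hypothetical.

```python
def samples_to_generate(class_counts, beta=1.0):
    """Apply G_s = (m_smax - m_s) * beta to every class except the largest one."""
    majority = max(class_counts, key=class_counts.get)   # fault type s-max
    m_max = class_counts[majority]
    return {
        name: int(round((m_max - m) * beta))
        for name, m in class_counts.items()
        if name != majority
    }


# Class counts taken from the collected data described above.
counts = {
    "short_circuit": 1500,
    "disconnection": 1200,
    "series_arc": 1400,
    "leakage_current": 5000,
}
print(samples_to_generate(counts, beta=1.0))
# {'short_circuit': 3500, 'disconnection': 3800, 'series_arc': 3600}
```

With β = 1 each minority fault type is topped up to the size of the largest one; a smaller β leaves part of the imbalance in place.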
Step S32, randomly selecting an ith sample belonging to the fault type s, and calculating the proportion $r_{i}$ of samples belonging to the fault type s-max among the K neighboring samples of the ith sample:
$$r_{i}=\frac{\Delta_{i}}{K}$$
wherein $\Delta_{i}$ is the number of samples belonging to the fault type s-max among the K neighboring samples of the ith sample, $i\in[1,m_{s}]$;
step S33, normalizing the data ratio $r_{i}$ to obtain the normalized data ratio $\hat{r}_{i}$.
In practice, the data ratio is normalized by the following formula:
$$\hat{r}_{i}=\frac{r_{i}}{\sum_{l=1}^{m_{s}}r_{l}}$$
The normalized data ratio is used to calculate the number of new samples that need to be generated for each minority-class sample.
Step S34, calculating the number $g_{s}$ of new samples finally required for the fault type s, by the following formula:
Figure BDA0003550123410000114
step S35, calculating $g_{s}$ new samples $s_{i}$ belonging to the fault type s and adding them to the data set to obtain the preprocessed data set, wherein a new sample $s_{i}$ is expressed as:
$$s_{i}=x_{i}+(x_{zi}-x_{i})\times\lambda \qquad (8)$$
wherein $x_{zi}$ is one of the K neighboring samples of the ith sample $x_{i}$, the sample $x_{zi}$ does not belong to the fault type s-max, and λ is a random number, $\lambda\in[0,1]$.
In practice, after the three minority fault types are expanded with the ADASYN algorithm, 3546 groups of series arc fault data, 3511 groups of short-circuit fault data, 3536 groups of disconnection fault data and 3522 groups of leakage current fault feature data are obtained as the final training set.
The ADASYN algorithm oversamples the training set formed by the feature vectors of the minority-class samples, which solves the problem of unbalanced data in the data set and can improve the diagnosis precision of the fault diagnosis model.
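Steps S31 to S35 can be sketched for a single minority fault type as follows. This is a minimal, dependency-free illustration rather than the patented implementation, and it rests on several assumptions: neighbors are found by brute-force Euclidean distance over the whole data set; the per-sample allocation follows the standard ADASYN rule (normalized ratio times the total to generate), with rounding applied per sample; a sample with no eligible neighbor is simply duplicated; and the function name and the toy data are hypothetical.

```python
import math
import random


def adasyn_for_class(X, y, minority, majority, k=5, beta=1.0, seed=0):
    """Return synthetic samples for one minority class (sketch of steps S31-S35)."""
    rng = random.Random(seed)
    idx_min = [i for i, lab in enumerate(y) if lab == minority]
    m_s = len(idx_min)
    m_max = sum(1 for lab in y if lab == majority)
    G = (m_max - m_s) * beta                      # S31: samples to generate

    def neighbours(i):
        # Brute-force K nearest neighbours of sample i over the whole data set.
        order = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: math.dist(X[i], X[j]))
        return order[:k]

    knn = {i: neighbours(i) for i in idx_min}
    # S32: share of majority-class samples among the K neighbours.
    r = {i: sum(1 for j in knn[i] if y[j] == majority) / k for i in idx_min}
    r_sum = sum(r.values())

    synthetic = []
    for i in idx_min:
        # S33: normalize; spread uniformly if no sample has majority neighbours.
        r_hat = r[i] / r_sum if r_sum > 0 else 1.0 / m_s
        g_i = int(round(r_hat * G))               # S34 (assumed allocation rule)
        # S35: interpolate towards a neighbour that is not of the majority class.
        candidates = [j for j in knn[i] if y[j] != majority]
        for _ in range(g_i):
            z = rng.choice(candidates) if candidates else i
            lam = rng.random()
            synthetic.append([a + (b - a) * lam for a, b in zip(X[i], X[z])])
    return synthetic


# Toy 2-D data: 8 majority samples near the origin, 3 minority samples.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.2, 0.0], [0.0, 0.2],
     [0.2, 0.2], [0.3, 0.1], [0.4, 0.2], [1.0, 1.0], [1.1, 1.0]]
y = ["maj"] * 8 + ["min"] * 3
new = adasyn_for_class(X, y, minority="min", majority="maj", k=3, beta=1.0)
print(len(new), "synthetic samples generated")
```

Minority samples surrounded by majority neighbors receive a larger share of the synthetic samples, which concentrates the new data where the classes are hardest to separate. For a maintained implementation, a library such as imbalanced-learn would normally be preferred over this sketch.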
Step S4 constructs the random forest fault diagnosis model by the random forest algorithm based on the preprocessed data set. In practice, a random forest is a randomly constructed ensemble model composed of a plurality of decision trees. Each decision tree places all samples of its training set in the root node and then selects the optimal feature to split the data set; if a subset is classified essentially correctly, a leaf node is constructed, otherwise a new optimal feature is selected and the splitting continues. The number of decision trees in the random forest is set to 300.
Step S41, resampling the preprocessed data set by the Bootstrap method to obtain a training data set. The Bootstrap method randomly draws a number of samples from the data set with replacement; the sampling is repeated K times, and the new data sets are denoted $N_{k}$, k = 1, 2, …, K.
Step S42, constructing a root node, selecting $a$ fault types as decision tree features, and generating a decision tree $T_{c}$ based on all normalized feature vectors corresponding to the fault types in the training data set;
Step S43, repeating step S41 and step S42, letting each decision tree $T_{c}$ grow as fully as possible without pruning, and generating K decision trees to obtain the C decision trees;
step S44, randomly extracting normalized feature vectors from the preprocessed data set as test samples;
step S45, fault type voting is carried out on the test sample through C decision trees, and whether the fault type with the most voting times is consistent with the actual fault type of the test sample is verified, wherein the expression of the voting result R is as follows:
Figure BDA0003550123410000121
wherein x is a training data set, y is the actual fault type of the test sample, $r_c(x)$ is the decision tree model, $c \in [1, C]$, $I[\cdot]$ is the indicator function, and $\arg\max(\cdot)$ is the maximum function;
step S46, if they are not consistent, adjusting the parameters of the C decision trees and then repeating step S45; if they are consistent, the construction of the random forest fault diagnosis model is complete.
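The Bootstrap resampling of step S41 can be sketched as below. The sketch assumes that each resampled set has the same size as the preprocessed data set and that one set is drawn per tree; the text does not state either point explicitly. The toy data set and the function names are hypothetical.

```python
import random

def bootstrap(dataset, rng):
    """Draw len(dataset) samples with replacement (one Bootstrap resample)."""
    n = len(dataset)
    return [dataset[rng.randrange(n)] for _ in range(n)]

def bootstrap_sets(dataset, n_trees, rng):
    """Build one resampled training set per decision tree."""
    return [bootstrap(dataset, rng) for _ in range(n_trees)]

# Toy (feature vector, fault type) pairs; the values are illustrative only.
rng = random.Random(42)
dataset = [([0.1 * i, 1.0 - 0.1 * i], "F%d" % (i % 4 + 1)) for i in range(20)]

# 300 matches the number of decision trees given in the description.
training_sets = bootstrap_sets(dataset, n_trees=300, rng=rng)
```

Because the draws are made with replacement, each resampled set usually repeats some samples and omits others, which is what makes the individual trees differ from one another.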
Step S5, diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model, comprises the following steps:
step S61, calculating a normalized feature vector of the fault current to be diagnosed;
step S62, based on the normalized feature vector of the fault current to be diagnosed, the fault type voting is carried out through a random forest fault diagnosis model, and the voting result is counted;
and step S63, selecting the fault type with the most voting times as the diagnosis result of the fault current to be diagnosed.
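Steps S62 and S63 amount to a majority vote over the trees, which is commonly written as $\arg\max_y \sum_{c=1}^{C} I[r_c(x) = y]$; that reading is an assumption here, since the original formula is only available as an image. A minimal sketch, with hypothetical stand-in functions in place of trained decision trees:

```python
from collections import Counter

def forest_vote(trees, x):
    """Return the fault type predicted by most trees, plus the vote counts."""
    votes = Counter(tree(x) for tree in trees)
    winner, _ = votes.most_common(1)[0]
    return winner, votes

# Hypothetical stand-ins for trained decision trees: each maps a
# normalized feature vector to a fault type label.
trees = [
    lambda x: "F1" if x[0] > 0.5 else "F3",
    lambda x: "F1" if x[1] < 0.2 else "F2",
    lambda x: "F1",
]
label, votes = forest_vote(trees, [0.7, 0.1])
```

The source does not say how ties are broken; in this sketch a tie is resolved by whichever label `Counter.most_common` returns first.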
As shown in fig. 4, in another aspect, the present invention provides a fault diagnosis system based on ADASYN algorithm and random forest algorithm, which can be used to implement the method shown in fig. 1, including:
the fault current acquisition module is used for acquiring the fault current of the internal circuit of the intelligent household appliance and recording the fault type of the fault current;
the data set construction module is used for carrying out multilayer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current and recording the normalized feature vector and the fault type corresponding to the fault current to the data set;
the data set preprocessing module is used for preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
the fault diagnosis module is used for constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set; and the fault type of the fault current to be diagnosed is diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
Specifically, the data set building module comprises:
the wavelet packet decomposition layer construction submodule is used for constructing not less than one wavelet packet decomposition layer;
the node energy calculation submodule is used for carrying out multi-layer wavelet packet decomposition on the fault current by utilizing the wavelet packet decomposition layer to obtain the node energy of each wavelet packet;
the data set construction submodule is used for establishing a level energy feature vector according to the node energy of each wavelet packet and carrying out normalization calculation to obtain a normalized feature vector; and recording the normalized feature vectors and the corresponding fault types to a data set.
The fault diagnosis module includes:
the resampling submodule is used for resampling the preprocessed data set by a Bootstrap method to obtain a training data set;
the decision tree generation submodule is used for constructing a root node, selecting a fault types as decision tree characteristics, and generating a decision tree based on all normalized feature vectors corresponding to the fault types a in the training data set;
the verification submodule is used for randomly extracting the normalized feature vector from the preprocessed data set to serve as a test sample, performing fault type voting on the test sample through the C decision trees and verifying whether the fault type with the largest number of votes is consistent with the actual fault type of the test sample; if they are consistent, the construction of the random forest fault diagnosis model is complete.
In practice, in the embodiment of the present invention, five-layer wavelet packet decomposition is performed on the fault current data for 4 different fault situations. The wavelet packet decomposition waveforms of the 1st node and the 32nd node under the different fault situations are shown in fig. 5(a) and fig. 5(b), where F1 is a short-circuit fault, F2 is a disconnection fault, F3 is a series arc fault, and F4 is a leakage current fault. The figures show that the decomposition waveforms of different faults differ markedly at both low and high frequencies, so the energy of each node in the last layer after 5-layer wavelet packet decomposition is used as the feature vector to better distinguish the fault situations.
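A five-layer decomposition yields $2^5 = 32$ nodes in the last layer, so the feature vector has 32 entries. The sketch below illustrates this with a Haar wavelet packet written in plain Python. The Haar basis, the normalization by total energy and the synthetic test signal are all assumptions made for illustration; the patent's own normalization formula is only available as an image, and the nodes come out in natural order rather than frequency order.

```python
import math

def haar_split(signal):
    """One Haar analysis step: return (low-pass, high-pass) halves of the signal."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return low, high

def wavelet_packet_nodes(signal, levels):
    """Split every node at every level, giving 2**levels nodes at the last level."""
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes

def normalized_energy_features(signal, levels=5):
    """Node energy = sum of squared coefficients; divide by the total energy."""
    energies = [sum(c * c for c in node) for node in wavelet_packet_nodes(signal, levels)]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies

# Synthetic stand-in for a sampled fault current (256 samples of a sine wave).
current = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
features = normalized_energy_features(current, levels=5)
```

The signal length should be a multiple of $2^{levels}$ so that every split halves the node cleanly; a wavelet library would normally handle other lengths by padding.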
In practice, to make the experimental results more convincing, the random forest fault diagnosis model is compared with the SVM and the decision tree on the different fault conditions, all using the data set amplified by the ADASYN algorithm, with precision (P), recall (R) and F1 value (F1-score) as evaluation indexes. The test results are shown in fig. 6, fig. 7 and fig. 8. As can be seen from fig. 6, with the training set amplified by the ADASYN algorithm, the diagnostic precision of the random forest is slightly lower than that of the SVM only for the disconnection fault (F2), and is higher than or equal to that of the SVM and the decision tree for the remaining faults. As shown in fig. 7 and fig. 8, the recall and F1 value of the random forest for all four faults are clearly better than those of the SVM and the decision tree.
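For reference, the three evaluation indexes can be computed per fault type as in the sketch below, using the standard definitions P = TP / (TP + FP), R = TP / (TP + FN) and F1 = 2PR / (P + R). The label lists are made-up toy values and do not reproduce any result from the figures or from Table 1.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one fault type, treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labels for illustration only.
y_true = ["F1", "F1", "F2", "F2", "F3", "F4"]
y_pred = ["F1", "F2", "F2", "F2", "F3", "F1"]
p, r, f1 = per_class_metrics(y_true, y_pred, "F2")
```

In this toy case both true F2 samples are found (recall 1.0), but one F1 sample is also labelled F2, so precision drops to 2/3 and the F1 value to 0.8.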
In this embodiment, to further verify the effectiveness of the method provided by the present invention, the random forest, SVM and decision tree algorithms are compared on the original data set and on the data set amplified by the ADASYN algorithm, respectively; the test results are shown in table 1.
As can be seen from Table 1, whether the model is an SVM, a decision tree or a random forest, it tests clearly better after the training set is expanded with the ADASYN algorithm than with the original unbalanced training set. Every index of the ADASYN-random forest is clearly better than those of the other algorithms.
TABLE 1 comparison of test results
Figure BDA0003550123410000151
For diagnosing internal circuit faults of intelligent household appliances, the embodiment of the invention builds a simulation model of the internal circuit and applies a random forest algorithm to the diagnosis. Wavelet packet decomposition is used to extract the energy features of the fault signal, which helps with the higher harmonics that may be present in the time domain and improves the stability of fault feature extraction. The ADASYN algorithm is used to expand the training set, which addresses the possibly unbalanced distribution of the sample classes collected under actual conditions. The experimental results show that the ADASYN-random forest fault diagnosis model has higher diagnostic precision, and that it has practical significance and theoretical research value for circuit fault diagnosis.
The method can diagnose faults of the intelligent household appliance through a convenient and effective algorithm, and can save after-sale costs while improving the product competitiveness of the intelligent household appliance manufacturer. For household users, it can improve the user experience, allow risks to be identified in time, and reduce potential safety hazards.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (10)

1. A fault diagnosis method based on an ADASYN algorithm and a random forest algorithm is characterized by comprising the following steps:
step S1, collecting fault current of an internal circuit of the intelligent household appliance, and recording the fault type of the fault current;
step S2, carrying out multi-layer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current into a data set;
step S3, preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
step S4, constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set;
and step S5, diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
2. The fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to claim 1, wherein the step S1 of collecting the fault current of the internal circuit of the intelligent household appliance and recording the fault type of the fault current comprises:
step S11, establishing a circuit model of the internal circuit of the intelligent household appliance by using PSCAD simulation software;
step S12, setting the fault type and running the circuit model;
and step S13, collecting and recording the fault current of the circuit model in the fault type state.
3. The method of claim 2, wherein the step S2 is implemented by performing multi-layer wavelet packet decomposition on the fault current and calculating a normalized feature vector of the fault current, and the recording the fault type corresponding to the normalized feature vector and the fault current to a data set comprises:
step 21, constructing a wavelet packet decomposition layer not less than one layer, wherein each layer of wavelet packet decomposition is represented by recursion:
Figure FDA0003550123400000021
wherein,
Figure FDA0003550123400000022
refers to the wavelet packet at the n-th node in the j-th wavelet packet decomposition layer, $j \in [1, J]$, $n \in [0, 2^j - 1]$, J is the total number of layers, k is a translation variable, $k \in (-\infty, +\infty)$, t is a time variable, the function h(k-2t) is the output value of the low-pass filter, and the function g(k-2t) is the output value of the high-pass filter;
step 22, utilizing the wavelet packet decomposition layer to carry out multilayer wavelet packet decomposition on the fault current to obtain the node energy $E_{j,n}$:
Figure FDA0003550123400000023
wherein $E_{j,n}$ is the node energy of the n-th node in the j-th wavelet packet decomposition layer, $W_{(j,n)}$ is the node signal of the n-th node in the j-th wavelet packet decomposition layer, $d_{j,n}(p)$ is the p-th coefficient of the node signal $W_{(j,n)}$ after decomposition, $p \in [1, m]$, and m is the frequency band length of the n-th node in the j-th wavelet packet decomposition layer;
step 23, establishing a hierarchical energy feature vector $E_j$ according to the node energy $E_{j,n}$:
Figure FDA0003550123400000024
wherein $E_j$ is the hierarchical energy feature vector of the j-th wavelet packet decomposition layer;
step 24, calculating the normalized feature vector X according to the hierarchical energy feature vector $E_j$:
Figure FDA0003550123400000031
and 25, recording the normalized feature vector X and the corresponding fault type into the data set.
4. The fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to claim 3, wherein the step S3 of preprocessing the data set through the ADASYN algorithm to obtain the preprocessed data set comprises:
step S31, calculating, according to the data set, the number of samples $G_s$ that need to be generated for the fault type s:
$G_s = (m_{s\text{-}max} - m_s) \times \beta$
wherein $m_{s\text{-}max}$ is the number of samples corresponding to the fault type s-max, which has the largest total number of samples in the data set, $m_s$ is the number of samples corresponding to a fault type s in the data set other than s-max, $s \in [1, S-1]$, S is the number of the fault types in the data set, $\beta$ is a sample factor, and $\beta \in [0, 1]$;
step S32, randomly selecting an i-th sample belonging to the fault type s, and calculating the sample data proportion $r_i$ of the fault type s-max among the K neighboring samples of the i-th sample:
Figure FDA0003550123400000032
wherein $\Delta_i$ is the number of samples belonging to the fault type s-max among the K neighboring samples of the i-th sample, $i \in [1, m_s]$;
step S33, normalizing the data ratio $r_i$ to obtain the normalized data ratio:
Figure FDA0003550123400000033
step S34, calculating the final number of new samples $g_s$ required for the fault type s, with the calculation formula:
Figure FDA0003550123400000041
step S35, calculating $g_s$ new samples $s_i$ belonging to the fault type s and adding them to the data set to obtain the preprocessed data set, wherein the new sample $s_i$ is expressed as:
$s_i = x_i + (x_{zi} - x_i) \times \lambda$
wherein $x_{zi}$ is one of the K neighboring samples of the i-th sample $x_i$, the sample $x_{zi}$ does not belong to the fault type s-max, and $\lambda$ is a random number.
5. The fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to claim 1, wherein the step S4 of constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set comprises:
step S41, resampling the preprocessed data set by a Bootstrap method to obtain a training data set;
step S42, a root node is constructed, a fault types are selected as decision tree features, and a decision tree is generated based on all the normalized feature vectors corresponding to the fault types a in the training data set;
step S43, repeating step S41 and step S42 to obtain C decision trees;
step S44, randomly extracting the normalized feature vector from the preprocessed data set as a test sample;
step S45, performing fault type voting on the test sample through the C decision trees, and verifying whether the fault type with the most votes is consistent with the actual fault type of the test sample, where the expression of the voting result R is:
Figure FDA0003550123400000051
wherein x is the training data set, y is the actual fault type of the test sample, $r_c(x)$ is the decision tree model, $c \in [1, C]$, $I[\cdot]$ is the indicator function, and $\arg\max(\cdot)$ is the maximum function;
step S46, if they are not consistent, adjusting the parameters of the C decision trees and then repeating the step S45; if they are consistent, the construction of the random forest fault diagnosis model is complete.
6. The fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to claim 5, wherein the step S5 of diagnosing the fault type of the fault current to be diagnosed according to the preprocessed data set and the random forest fault diagnosis model comprises:
step S61, calculating the normalized feature vector of the fault current to be diagnosed;
step S62, based on the normalized feature vector of the fault current to be diagnosed, voting the fault types through the random forest fault diagnosis model and counting the voting results;
and step S63, selecting the fault type with the largest voting times as the diagnosis result of the fault current to be diagnosed.
7. The fault diagnosis method based on the ADASYN algorithm and the random forest algorithm according to claim 1, wherein the fault types include short circuit fault, disconnection fault, series arc fault, and leakage current fault.
8. A fault diagnosis system based on an ADASYN algorithm and a random forest algorithm, characterized by comprising:
the fault current acquisition module is used for acquiring fault current of an internal circuit of the intelligent household appliance and recording the fault type of the fault current;
the data set construction module is used for carrying out multilayer wavelet packet decomposition on the fault current, calculating a normalized feature vector of the fault current, and recording the normalized feature vector and the fault type corresponding to the fault current to a data set;
the data set preprocessing module is used for preprocessing the data set through an ADASYN algorithm to obtain a preprocessed data set;
the fault diagnosis module is used for constructing a random forest fault diagnosis model through a random forest algorithm based on the preprocessed data set; and the fault type of the fault current to be diagnosed is diagnosed according to the preprocessed data set and the random forest fault diagnosis model.
9. The ADASYN algorithm and random forest algorithm based fault diagnosis system of claim 8, wherein the data set construction module comprises:
the wavelet packet decomposition layer construction submodule is used for constructing not less than one wavelet packet decomposition layer; the node energy calculation submodule is used for carrying out multi-layer wavelet packet decomposition on the fault current by utilizing the wavelet packet decomposition layer to obtain node energy of each wavelet packet;
the data set construction submodule is used for establishing a level energy characteristic vector according to the node energy of each wavelet packet and carrying out normalization calculation to obtain a normalization characteristic vector; recording the normalized feature vectors and the corresponding fault types to the data set.
10. The ADASYN algorithm and random forest algorithm based fault diagnosis system of claim 8, wherein the fault diagnosis module comprises:
the resampling submodule is used for resampling the preprocessed data set by a Bootstrap method to obtain a training data set;
the decision tree generation submodule is used for constructing a root node, selecting a fault types as decision tree characteristics, and generating a decision tree based on all the normalized characteristic vectors corresponding to the fault types a in the training data set;
the verification submodule is used for randomly extracting the normalized feature vector from the preprocessed data set to serve as a test sample, voting fault types of the test sample through the C decision trees and verifying whether the fault type with the largest number of votes is consistent with the actual fault type of the test sample; if they are consistent, the construction of the random forest fault diagnosis model is complete.
CN202210267057.2A 2022-03-16 2022-03-16 Fault diagnosis method and system based on ADASYN algorithm and random forest algorithm Active CN114722915B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210267057.2A CN114722915B (en) 2022-03-16 2022-03-16 Fault diagnosis method and system based on ADASYN algorithm and random forest algorithm

Publications (2)

Publication Number Publication Date
CN114722915A true CN114722915A (en) 2022-07-08
CN114722915B CN114722915B (en) 2024-07-23

Family

ID=82237028

Country Status (1)

Country Link
CN (1) CN114722915B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118585917A (en) * 2024-08-07 2024-09-03 北京中联太信科技有限公司 Electromagnetic valve fault online diagnosis method based on time-frequency domain characteristic analysis
CN118585917B (en) * 2024-08-07 2024-10-11 北京中联太信科技有限公司 Electromagnetic valve fault online diagnosis method based on time-frequency domain characteristic analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant