CN114818904B - Fan fault detection method and storage medium based on Stack-GANs model - Google Patents


Info

Publication number
CN114818904B
Authority
CN
China
Prior art keywords
data
formula
stack
generator
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210422554.5A
Other languages
Chinese (zh)
Other versions
CN114818904A (en)
Inventor
强保华
谢元
滕可亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202210422554.5A
Publication of CN114818904A
Application granted
Publication of CN114818904B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Tourism & Hospitality (AREA)
  • Computer Hardware Design (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Water Supply & Treatment (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Human Resources & Organizations (AREA)
  • Geometry (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

The invention provides a fan fault detection method based on a Stack-GANs model. By exploiting the effectiveness of the Stack-GANs algorithm in handling imbalanced fan data, the invention reduces the negative effect of class imbalance in the data set on fan fault detection and improves the efficiency and accuracy of fan fault detection.

Description

Fan fault detection method and storage medium based on Stack-GANs model
Technical Field
The invention relates to the technical fields of computing and fan fault detection, and in particular to a fan fault detection method based on a Stack-GANs model and a storage medium.
Background
Data processing and fault detection for wind turbine generators are of great significance for improving the reliability of wind power generation and reducing operation and maintenance costs. In addition to traditional detection methods, there are now artificial intelligence methods, which build a data set from various data collected while the fan is running and train a model on that data set to achieve fan fault detection. In real data sets, however, the amount of normal operating data and the amount of fault data are imbalanced, which adversely affects subsequent model training and, in turn, the efficiency of wind turbine fault detection.
Disclosure of Invention
The invention provides a fan fault detection method based on a Stack-GANs model. By exploiting the effectiveness of the Stack-GANs algorithm in handling imbalanced fan data, the method reduces the negative effect of class imbalance in the data set on fan fault detection and improves the efficiency and accuracy of fan fault detection.
The invention uses a fan data set formed from working-condition parameters of a wind turbine generator. The fan data set usually takes some or all of 18 parameters as elements, including U2 winding current, U3 winding current, U1 winding voltage, U2 winding voltage, U3 winding voltage, pitch distance, hydraulic oil temperature, generator cooling temperature, generator slip ring temperature, generator rotating speed, impeller rotating speed, gearbox blade temperature, gearbox filter pressure, wind speed 1, wind speed 2, wind direction 1, wind direction 2 and cabinet temperature. These elements are processed by the Stack-GANs model and then input into a general fan fault detection model to obtain a detection result.
The construction of the Stack-GANs model comprises the following steps:
(1) Using a random forest method, the features in the fan data set are screened according to the importance of the working-condition parameters of the wind turbine generator to obtain a new feature subset, and a minority-class feature data set $P = \{p_1, p_2, \ldots, p_i, \ldots, p_m\}$ is extracted from the new feature subset.
(2) The minority-class feature data set P is divided using the Pearson correlation coefficient method and the MIC analysis method to obtain a Group1 feature data subset $P_1$ and a Group2 feature data subset $P_2$.
(3) The Group1 and Group2 feature data subsets are used to train GANs1 and GANs2, respectively. During training, random noise is used as input, and a Discriminator and a Generator are trained by alternately using formula (a) and formula (b); wherein,
formula (a) is: [formula image not reproduced in this text]
formula (b) is: [formula image not reproduced in this text]
In the formulas: $p(x)$ denotes the distribution of the feature data subsets $P_1$ and $P_2$; $x_t$ denotes the real sample data at time t in the feature data subsets $P_1$ and $P_2$; $p(z)$ denotes the distribution of the random noise input into the Generator; $z_t$ denotes the random noise data input into the Generator at time t; D denotes the Discriminator; G denotes the Generator; and the two remaining symbols (images in the original, not reproduced in this text) denote the hidden-state values of the RNNs in D and G, respectively, at the previous time step.
(4) After the model training of step (3) is completed, both Discriminators are removed and the corresponding Generator1 and Generator2 are retained.
(5) Generator1 and Generator2 are used to generate the corresponding group data Group3 and Group4, which are spliced together to form a data group; this data group is the rough sample data.
(6) With the rough sample data as input, a Discriminator and a Generator are trained by alternately using formula (c) and formula (d); wherein,
formula (c) is: [formula image not reproduced in this text]
formula (d) is: [formula image not reproduced in this text]
In the formulas: G1 and G2 denote the Generator1 and Generator2 retained in step (4); $p_A(x)$ denotes the distribution of the minority-class feature data set P; $x_t^A$ denotes the real sample data at time t in the minority-class feature data set P; $p_{G1}(z)$ and $p_{G2}(z)$ denote the distributions of the random noise input into G1 and G2, respectively; $z_t^{G1}$ and $z_t^{G2}$ denote the random noise data input into G1 and G2 at time t; D denotes the Discriminator; G denotes the Generator; and the two remaining symbols (images in the original, not reproduced in this text) denote the hidden-state values of the RNNs in D and G, respectively, at the previous time step.
(7) After the training of step (6) is completed, the Discriminator is removed and the corresponding Generator3 is retained.
(8) The construction of the Stack-GANs model is completed based on the retained Generator1, Generator2 and Generator3.
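The data flow through the three retained generators can be sketched as follows. This is only an illustration under stated assumptions: `make_toy_generator` is a hypothetical stand-in (a fixed random recurrent map, not a trained RNN), the group sizes of 10 and 8 features and the noise dimension of 4 are made up, and the idea that Generator3 refines the spliced rough sample at generation time is inferred from steps (5) and (6) rather than stated explicitly in the text.

```python
import math
import random
from typing import Callable, List

Seq = List[List[float]]  # one sample: time steps x features


def make_toy_generator(in_dim: int, out_dim: int, seed: int) -> Callable[[Seq], Seq]:
    """Hypothetical stand-in for a trained RNN Generator.

    It is a fixed random recurrent map, NOT a trained network; it only
    mimics the input/output shapes and the carried hidden state.
    """
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    w_h = [rng.uniform(-0.5, 0.5) for _ in range(out_dim)]

    def generate(inputs: Seq) -> Seq:
        hidden = [0.0] * out_dim  # hidden state carried from the previous time step
        outputs = []
        for step in inputs:
            hidden = [
                math.tanh(sum(w * v for w, v in zip(w_in[j], step)) + w_h[j] * hidden[j])
                for j in range(out_dim)
            ]
            outputs.append(list(hidden))
        return outputs

    return generate


def sample_noise(steps: int, dim: int, rng: random.Random) -> Seq:
    """Draw a Gaussian noise sequence to feed a first-stage generator."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(steps)]


def stack_generate(g1, g2, g3, steps: int, noise_dim: int, rng: random.Random) -> Seq:
    group3 = g1(sample_noise(steps, noise_dim, rng))  # step (5): Generator1 output
    group4 = g2(sample_noise(steps, noise_dim, rng))  # step (5): Generator2 output
    # Splice the two groups feature-wise at every time step: the rough sample.
    rough = [a + b for a, b in zip(group3, group4)]
    return g3(rough)  # Generator3 maps the rough sample to the final sample


rng = random.Random(0)
g1 = make_toy_generator(in_dim=4, out_dim=10, seed=1)   # Group1 size is assumed
g2 = make_toy_generator(in_dim=4, out_dim=8, seed=2)    # Group2 size is assumed
g3 = make_toy_generator(in_dim=18, out_dim=18, seed=3)  # 18 screened features
sample = stack_generate(g1, g2, g3, steps=5, noise_dim=4, rng=rng)
```

In a real implementation each toy generator would be replaced by the RNN Generator kept after its Discriminator is removed; the stacking order would stay the same.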
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the fan fault detection method described above.
Drawings
FIG. 1 is a flow chart of feature screening using random forests in the present invention;
FIG. 2 is a diagram of the construction of the Stack-GANs model provided by the invention;
FIG. 3 is a diagram showing the network structure of the Discriminators and Generators of the present invention;
FIGS. 4a and 4b compare minority-class samples generated by the present invention with real minority-class samples.
Detailed Description
A specific embodiment is given below to describe the technical solution of the invention in detail. The data set used in this embodiment comes from the historical operating data of a wind farm in Yunnan. The data set is split into a training set of 315600 records and a test set of 20190 records, and it involves 30 features in total.
(1) To reduce the difficulty of building the subsequent model and to avoid generating unnecessary data features, the invention first uses a Random Forest (RF) method to rank and screen the features of the data set by importance. RF is an algorithm comprising multiple decision trees and offers high accuracy and good stability. The specific flow of RF feature importance selection is shown in FIG. 1. The detailed steps are as follows:
First, a new data set is formed by randomly drawing a number of data samples from the original data set by sampling with replacement; repeating this process N times yields N new data sets. Several features are then randomly selected from each new data set to form a feature subset, and a corresponding decision tree is built from each feature subset. For each decision tree, part of the feature data is not involved in building the tree; this unused data is called the out-of-bag data of that decision tree.
Second, to compute the importance score of a feature x, all decision trees built with feature x are used as base models, and the corresponding out-of-bag data is used as test data to compute the error value Error1.
Then, random noise is added to feature x in all samples of the out-of-bag data, and the corresponding decision trees are used again to compute the error value Error2 on the noisy out-of-bag data.
Finally, the importance score of feature x is calculated by formula (e),
where N is the number of decision trees in whose construction feature x participates.
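The bootstrap and scoring steps above can be sketched in a few lines. Formula (e) is not reproduced in this text, so the scoring function below assumes the commonly used out-of-bag form, the mean of (Error2 - Error1) over the N trees; that form, the function names and the error values are illustrative assumptions, not taken from the patent.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_split(n_samples: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Sample indices with replacement; indices never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def oob_importance(error1: Sequence[float], error2: Sequence[float]) -> float:
    """Mean increase in out-of-bag error after adding noise to one feature.

    error1[i]: error of tree i on its clean out-of-bag data (Error1).
    error2[i]: error of tree i after noise is added to feature x (Error2).
    Assumed form: (1/N) * sum(Error2 - Error1); not confirmed by the source.
    """
    if len(error1) != len(error2) or not error1:
        raise ValueError("need one Error1/Error2 pair per tree")
    n_trees = len(error1)
    return sum(e2 - e1 for e1, e2 in zip(error1, error2)) / n_trees


in_bag, out_of_bag = bootstrap_split(10, random.Random(0))

# Made-up error values for three trees, for illustration only.
score = oob_importance([0.10, 0.12, 0.08], [0.25, 0.20, 0.18])
```

A feature whose noisy out-of-bag error barely changes gets a score near zero and is a candidate for removal during screening.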
After RF feature importance ranking and screening, the data set retains 18 important features: U2 winding current, U3 winding current, U1 winding voltage, U2 winding voltage, U3 winding voltage, pitch distance, hydraulic oil temperature, generator cooling temperature, generator slip ring temperature, generator rotating speed, impeller rotating speed, gearbox blade temperature, gearbox filter pressure, wind speed 1, wind speed 2, wind direction 1, wind direction 2 and cabinet temperature. The details are shown in Table 1.
Table 1. Fan features after screening by the RF algorithm
Correlation analysis is then performed on the feature-screened data set using the Pearson correlation coefficient method and the MIC (maximal information coefficient) analysis method. After multiple experiments, the thresholds for the Pearson correlation coefficient method and the MIC analysis method were set to 0.6 and 0.7, respectively. To ensure that grouped features are strongly correlated in both the linear and the nonlinear sense, features are placed in the same group only when both the Pearson coefficient and the MIC value between them exceed these thresholds. The features are finally divided into two groups, Group1 and Group2, as shown in Table 2.
Table 2. Grouping of fan features
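The two-threshold grouping rule can be sketched as below. Several points are assumptions made for illustration: the absolute value of the Pearson coefficient is compared against 0.6, MIC values are passed in as a precomputed lookup (the standard library has no MIC routine), a feature lands in Group1 if it has at least one partner passing both thresholds and in Group2 otherwise, and the sample values are invented.

```python
import math
from typing import Dict, FrozenSet, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def split_groups(
    data: Dict[str, Sequence[float]],
    mic: Dict[FrozenSet[str], float],
    pearson_thr: float = 0.6,
    mic_thr: float = 0.7,
) -> Tuple[List[str], List[str]]:
    """Group1: features with a partner above both thresholds; Group2: the rest."""
    names = list(data)
    strong = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = abs(pearson(data[a], data[b]))  # abs() is an assumption
            m = mic[frozenset((a, b))]          # precomputed MIC (hypothetical input)
            if r > pearson_thr and m > mic_thr:
                strong.update((a, b))
    group1 = [n for n in names if n in strong]
    group2 = [n for n in names if n not in strong]
    return group1, group2


# Invented values for three of the screened features.
data = {
    "wind_speed_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "impeller_speed": [2.1, 3.9, 6.2, 8.0, 9.9],
    "cabinet_temp": [5.0, 1.0, 4.0, 2.0, 3.0],
}
mic = {
    frozenset(("wind_speed_1", "impeller_speed")): 0.95,
    frozenset(("wind_speed_1", "cabinet_temp")): 0.20,
    frozenset(("impeller_speed", "cabinet_temp")): 0.20,
}
group1, group2 = split_groups(data, mic)
```

Requiring both thresholds keeps pairs that are only linearly or only nonlinearly related out of the strongly correlated group.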
(2) Based on the idea of GANs, the invention builds a Stack-GANs model to generate minority-class sample data and thereby address the data imbalance problem. The overall model construction is shown in FIG. 2, and the main construction steps are described in the Disclosure of Invention above. More specifically, because fan data is time-series data, the Generators and Discriminators of the GANs use an RNN network structure to mine the temporal features in the data; the specific structure is shown in FIG. 3.
(3) FIGS. 4a and 4b compare the mean and standard deviation of the 18 feature variables in the minority-class data generated by the Stack-GANs model with those in the real minority-class data. On most features, the mean and standard deviation of the generated minority-class sample data are relatively close to those of the real minority-class sample data. This indicates that the minority-class sample data generated by the Stack-GANs method is similar to the real minority-class sample data, and that the proposed method and model can generate high-quality minority-class sample data.
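A per-feature comparison of this kind can be sketched as follows. The function name, the feature names and the numbers are illustrative assumptions; the sketch only shows how the gap in mean and standard deviation between real and generated samples might be tabulated.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple


def compare_moments(
    real: Dict[str, Sequence[float]],
    generated: Dict[str, Sequence[float]],
) -> Dict[str, Tuple[float, float]]:
    """Return, per feature, (|mean gap|, |standard deviation gap|)."""
    report = {}
    for name in real:
        mean_gap = abs(mean(generated[name]) - mean(real[name]))
        std_gap = abs(stdev(generated[name]) - stdev(real[name]))
        report[name] = (mean_gap, std_gap)
    return report


# Invented values: the generated series is the real one shifted by 1.
real = {"wind_speed_1": [1.0, 2.0, 3.0]}
generated = {"wind_speed_1": [2.0, 3.0, 4.0]}
report = compare_moments(real, generated)
```

Small gaps on most features would correspond to the closeness reported for FIGS. 4a and 4b; matching first and second moments alone does not prove that the full distributions agree.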
(4) Furthermore, the Stack-GANs processing method was compared experimentally with other data imbalance processing methods. To eliminate the influence of random factors on each fan fault detection model, 10 experiments were run for each algorithm, and the averages of F1-Score, G-mean and AUC were computed. The experimental results for each data imbalance processing algorithm are shown in Table 3.
Table 3. Comparison of the F1-Score, G-mean and AUC values of the data imbalance processing algorithms (unit: %)
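For reference, F1-Score and G-mean can be computed from confusion-matrix counts and averaged over repeated runs as sketched below. The counts are invented, the fault class is assumed to be the positive class, and AUC is left out because it needs per-sample scores rather than counts.

```python
import math
from statistics import mean
from typing import Tuple


def f1_and_gmean(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """F1-Score and G-mean with the fault class treated as positive."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    g_mean = math.sqrt(recall * specificity)
    return f1, g_mean


# Invented (tp, fp, fn, tn) counts for three repeated runs; the text uses 10.
runs = [(80, 20, 20, 880), (78, 25, 22, 875), (82, 18, 18, 882)]
scores = [f1_and_gmean(*counts) for counts in runs]
avg_f1 = mean(f1 for f1, _ in scores)
avg_gmean = mean(g for _, g in scores)
```

G-mean rewards balanced performance on both classes, which is why it is a common companion to F1 on imbalanced data.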
Analysis of the experimental results in the table leads to the following conclusions: (1) Compared with the unprocessed data set, the data set processed by the Stack-GANs method improves the overall performance of the fan fault detection models, and compared with the other algorithms, the Stack-GANs method achieves the best value of each evaluation index on every fan fault detection model. (2) On every fan fault detection model, the BSMOTE-Sequence algorithm improves the model slightly less than Stack-GANs but more than the other algorithms, probably because BSMOTE-Sequence considers the temporal characteristics of the data when synthesizing minority-class samples but does not consider the correlation between features. (3) Models trained on the data set sampled with TomekLink perform poorly overall, because TomekLink is an undersampling technique: some important information is lost when samples are deleted, which degrades model performance.
These experimental results demonstrate the effectiveness of the Stack-GANs method in handling fan data imbalance. Because the method takes both the correlation among features and the temporal characteristics of the data into account when synthesizing new minority-class samples, the generated data is more realistic and the model learns the distribution of fault features better, which improves the overall performance of the model.
The technical characteristics and beneficial effects of the invention are as follows. For the Stack-GANs model of the invention, first, the important features in the fault data are screened out with the RF algorithm, which reduces the complexity of building the model. Second, following the idea of GANs, the Discriminator and Generator of each stage are built with RNNs to capture the temporal characteristics of the data. Finally, the Stack-GANs model is built progressively, so that both strong and weak correlations among features are taken into account and high-quality fan fault data samples are generated. This addresses the imbalance problem in fan fault data sets and thereby improves the efficiency and accuracy of fan fault detection.

Claims (2)

1. A fan fault detection method based on a Stack-GANs model, in which a fan data set formed from working-condition parameters of a wind turbine generator is processed by the Stack-GANs model and then input into a fan fault detection model to obtain a detection result, characterized in that the construction of the Stack-GANs model comprises the following steps:
(1) Using a random forest method, the features in the fan data set are screened according to the importance of the working-condition parameters of the wind turbine generator to obtain a new feature subset, and a minority-class feature data set $P = \{p_1, p_2, \ldots, p_i, \ldots, p_m\}$ is extracted from the new feature subset;
(2) The minority-class feature data set P is divided using the Pearson correlation coefficient method and the MIC analysis method to obtain a Group1 feature data subset $P_1$ and a Group2 feature data subset $P_2$;
(3) The Group1 and Group2 feature data subsets are used to train GANs1 and GANs2, respectively; during training, random noise is used as input, and a Discriminator and a Generator are trained by alternately using formula (a) and formula (b); wherein,
formula (a) is: [formula image not reproduced in this text]
formula (b) is: [formula image not reproduced in this text]
in the formulas: $p(x)$ denotes the distribution of the feature data subsets $P_1$ and $P_2$; $x_t$ denotes the real sample data at time t in the feature data subsets $P_1$ and $P_2$; $p(z)$ denotes the distribution of the random noise input into the Generator; $z_t$ denotes the random noise data input into the Generator at time t; D denotes the Discriminator; G denotes the Generator; and the two remaining symbols (images in the original, not reproduced in this text) denote the hidden-state values of the RNNs in D and G, respectively, at the previous time step;
(4) After the model training in step (3) is completed, the two Discriminators are removed and the corresponding Generator1 and Generator2 are retained;
(5) Generator1 and Generator2 are used to generate the corresponding group data Group3 and Group4, which are spliced together to form a data group, this data group being the rough sample data;
(6) With the rough sample data as input, a Discriminator and a Generator are trained by alternately using formula (c) and formula (d); wherein,
formula (c) is: [formula image not reproduced in this text]
formula (d) is: [formula image not reproduced in this text]
in the formulas: G1 and G2 denote the Generator1 and Generator2 retained in step (4); $p_A(x)$ denotes the distribution of the minority-class feature data set P; $x_t^A$ denotes the real sample data at time t in the minority-class feature data set P; $p_{G1}(z)$ and $p_{G2}(z)$ denote the distributions of the random noise input into G1 and G2, respectively; $z_t^{G1}$ and $z_t^{G2}$ denote the random noise data input into G1 and G2 at time t; D denotes the Discriminator; G denotes the Generator; and the two remaining symbols (images in the original, not reproduced in this text) denote the hidden-state values of the RNNs in D and G, respectively, at the previous time step;
(7) After the training in step (6) is completed, the Discriminator is removed and the corresponding Generator3 is retained;
(8) The construction of the Stack-GANs model is completed based on the retained Generator1, Generator2 and Generator3.
2. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the Stack-GANs model-based fan fault detection method of claim 1.
CN202210422554.5A 2022-04-21 2022-04-21 Fan fault detection method and storage medium based on Stack-GANs model Active CN114818904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210422554.5A CN114818904B (en) 2022-04-21 2022-04-21 Fan fault detection method and storage medium based on Stack-GANs model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210422554.5A CN114818904B (en) 2022-04-21 2022-04-21 Fan fault detection method and storage medium based on Stack-GANs model

Publications (2)

Publication Number Publication Date
CN114818904A CN114818904A (en) 2022-07-29
CN114818904B true CN114818904B (en) 2024-03-15

Family

ID=82505516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210422554.5A Active CN114818904B (en) 2022-04-21 2022-04-21 Fan fault detection method and storage medium based on Stack-GANs model

Country Status (1)

Country Link
CN (1) CN114818904B (en)

Also Published As

Publication number Publication date
CN114818904A (en) 2022-07-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant