CN117370874A - Hydropower unit small sample fault diagnosis method based on DA-WGAN-SVM - Google Patents

Info

Publication number
CN117370874A
Authority
CN
China
Prior art keywords
data
samples
generator
sample
discriminator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311284791.0A
Other languages
Chinese (zh)
Inventor
邢建模
伍盛金
王卫玉
高金林
赖兴全
欧适
罗立军
魏加达
谭文胜
莫凡
王思嘉
刘禹
马腾飞
康志远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Power Investment Group Chongqing Jiangkou Hydropower Co ltd
Hunan Wuling Power Technology Co Ltd
Original Assignee
State Power Investment Group Chongqing Jiangkou Hydropower Co ltd
Hunan Wuling Power Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Power Investment Group Chongqing Jiangkou Hydropower Co ltd, Hunan Wuling Power Technology Co Ltd
Priority to CN202311284791.0A
Publication of CN117370874A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2131 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on a transform domain processing, e.g. wavelet transform
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

A small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM builds on the generative adversarial network (GAN), the Wasserstein generative adversarial network (WGAN), and differentiable data augmentation (DA), and then uses the resulting data for fault analysis and diagnosis based on a small sample of hydroelectric generating set data. Starting from a small sample of hydroelectric generating set data, the invention first uses a differentiable-augmentation generative adversarial network to augment and expand the existing training data set, and then extracts feature vectors from the data set. Finally, a support vector machine (SVM) classifier classifies the expanded data set to achieve fault classification of the hydroelectric generating set. Compared with most traditional methods based on physical models, the method adapts better to the characteristics of different generating sets, does not require extensive prior knowledge or assumptions, is more practical and operable, and provides strong technical support for effectively identifying faults of the hydroelectric generating set.

Description

Hydropower unit small sample fault diagnosis method based on DA-WGAN-SVM
Technical Field
The invention relates to the technical field of fault analysis methods for hydroelectric generating sets, and in particular to a small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM (differentiable-augmentation Wasserstein generative adversarial network and support vector machine).
Background
In the field of hydroelectric generating sets, the vibration monitoring signals of a generating set contain abundant information about the state of the set. The industrial big data platforms of the power generation groups have accumulated massive amounts of operating data, and fault analysis and diagnosis are carried out through data decomposition, data reconstruction and similar techniques. Although existing condition monitoring technology for hydroelectric generating sets is relatively mature and widely applied, most of the monitoring signals recorded for a given generating set come from normal operation rather than from fault periods. The number of samples labeled with a fault type is therefore still small, which hinders data-driven fault diagnosis of hydroelectric generating sets. In addition, conventional fault diagnosis of hydroelectric generating sets depends mainly on the experience of field experts; it is easily affected by subjective human factors, and its fault judgments are often based on experience and lack a scientific basis.
At present, several relatively more advanced fault diagnosis methods for hydroelectric generating sets have also appeared. The first kind: a tool for measuring signal complexity, the time-shift multi-scale attention entropy, is proposed on the basis of fractal theory; principal component analysis is then used to reduce the dimension of the time-shift multi-scale attention entropy and overcome feature redundancy; finally, the reduced features are input into a random forest model for diagnosis, and noise with different signal-to-noise ratios is added to the vibration signal to explore the noise immunity of the model at different noise intensities. The second kind: a self-attention mechanism is combined with an auxiliary classifier generative adversarial network, so that data features are learned effectively in a multi-class setting and the diagnostic accuracy of the fault model is increased. The third kind: the vibration signal of the hydroelectric generating set is decomposed by variational mode decomposition into several components, a time chart is constructed from the components, and a deep convolutional neural network is then built to extract features from the time chart and identify faults, establishing a mapping between the components and the fault states and thereby diagnosing faults of the hydroelectric generating set.
Although the above methods achieve better data analysis and judgment than conventional techniques, the following problems remain because of technical limitations. First: the operating data of the generating set still have to be labeled as fault or normal in advance, which requires a great deal of labor and prior knowledge. In practice the fault diagnosis model therefore often lacks sufficient fully labeled training data, and the generalization ability and robustness of the trained model are unsatisfactory. Second: the powerful computing capacity of computers and the autonomous mining of latent fault symptoms in fault samples can improve intelligent fault diagnosis of hydroelectric generating sets, but the diagnostic accuracy depends heavily on algorithm selection and training. Because the related technology is immature, instability during training is regarded as an important problem of the generative adversarial network (GAN), so the diagnosis results deviate to some degree from the actual data. Third: traditional data augmentation transforms the training data set to increase sample diversity, using transformations such as rotation, flipping, scaling, cropping, and brightness and contrast adjustment. Its purpose is to simulate the variation of real-world data so that the model generalizes better to unseen samples. However, traditional data augmentation is carried out before model training, the transformations are usually defined manually and are largely independent of the training process, and it cannot meet actual needs when diagnosing the main faults of a hydroelectric generating set.
Disclosure of Invention
In order to overcome the shortcomings of existing fault diagnosis for hydroelectric generating sets caused by the technical limitations described in the Background, the invention provides a small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM. Through the combined action of the related steps and methods, and starting from a small sample of hydroelectric generating set data, the method first uses a differentiable-augmentation generative adversarial network to augment and expand the existing training data set, and then extracts feature vectors from the data set. Finally, a support vector machine (SVM) classifier classifies the expanded data set to achieve fault classification of the hydroelectric generating set. Compared with most traditional methods based on physical models, the method adapts better to the characteristics of different generating sets, does not require extensive prior knowledge or assumptions, is more practical and operable, and provides strong technical support for effectively identifying faults of the hydroelectric generating set.
The technical scheme adopted for solving the technical problems is as follows:
A small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM, characterized in that it builds on a generative adversarial network, a Wasserstein generative adversarial network, and differentiable data augmentation, and then uses the obtained data for fault analysis and diagnosis based on a small sample of hydroelectric generating set data. The method specifically comprises the following steps.
Step A: data processing and model initialization. The original data are normalized and divided into a training set and a test set, and the relevant parameters of the DA-WGAN model are initialized.
Step B: updating the discriminator D. "False sample" data are generated with the generator G, the generated data and the original data are input into the discriminator D, and the discriminator D is trained on its own to update its parameters.
Step C: updating the generator G. The generator G is updated on its own so that the generated "false sample" data can successfully deceive the discriminator D.
Step D: repeating step B and step C until the DA-WGAN model reaches Nash equilibrium, which completes the expansion of the small sample.
Step E: extracting features from the expanded data to construct feature vectors.
Step F: model training. The mixed feature vectors are divided into a training set and a test set, and the training set data are input into an SVM model for training.
Step G: fault diagnosis. The test set data are input into the trained SVM classifier to obtain the recognition accuracy of the fault diagnosis model.
Further, the generative adversarial network is a neural network composed of two networks, a generator G and a discriminator D. The generator aims to generate data that are as realistic as possible, so that the discriminator cannot tell the generated data from the actual data, while the discriminator aims to distinguish the generated data from the actual data as correctly as possible. Specifically, the generator G converts random noise samples z drawn from the distribution $p_z(z)$ into false samples G(z), and the discriminator D tries to distinguish them from real samples, which is realized by the objective function
$$\min_G \max_D V(D, G) = E_{x \sim p_{data}(x)}[\log D(x)] + E_{z \sim p_z(z)}[\log(1 - D(G(z)))].$$
Further, the generative adversarial network algorithm proceeds as follows. 1) Initialize the discriminator parameters $\theta_d$ and the generator parameters $\theta_g$. 2) Draw m samples from the sample set and m vectors from the noise distribution. 3) Input the m vectors into the generator G to obtain m generated samples $\tilde{x}_i = G(z_i)$, and update the discriminator parameters $\theta_d$ to maximize $\tilde{V}$, specifically by
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D(x_i) + \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(\tilde{x}_i)\right), \qquad \theta_d \leftarrow \theta_d + \eta \nabla_{\theta_d} \tilde{V},$$
where $\eta$ is the learning rate. 4) Draw m vectors from the noise distribution and update the generator parameters $\theta_g$ to minimize $\tilde{V}$, specifically by
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right), \qquad \theta_g \leftarrow \theta_g - \eta \nabla_{\theta_g} \tilde{V}.$$
Further, in the Wasserstein generative adversarial network, the Wasserstein distance is used as the loss function, given by
$$W(P_r, P_g) = \inf_{\pi \in \Pi(P_r, P_g)} E_{(x, y) \sim \pi}[d(x, y)].$$
Further, the steps of the differentiable data augmentation are as follows: 1) defining the differentiable data augmentation transforms; 2) augmenting the "false sample" data; 3) augmenting the real sample data; 4) merging the real samples and the generated samples; 5) training the discriminator; 6) training the generator; 7) ending the training.
Further, in the differentiable data augmentation: 1) a set of differentiable data augmentation transforms is selected, such as translation, scaling and rotation; 2) the generator G generates a batch of "false samples" from random noise, and the differentiable transforms are applied to them; 3) the differentiable transforms are applied to a batch of real samples from the training data set; 4) the augmented real samples are merged with the generated samples; 5) the merged samples are input into the discriminator, the loss function is calculated, and the discriminator parameters are updated according to the loss function; 6) the generator generates another batch of "false samples", the loss function is calculated, and the generator parameters are updated according to the loss function, with the gradient of the differentiable transforms taken into account when computing the gradient; 7) steps 2 to 6 are repeated until the stopping condition is met.
Further, the loss functions of the differentiable-augmentation generative adversarial network model are given by:
$$L_D = E_{x \sim p_{data}(x)}[f_D(-D(T(x)))] + E_{z \sim p(z)}[f_D(D(T(G(z))))],$$
$$L_G = E_{z \sim p(z)}[f_G(-D(T(G(z))))].$$
Compared with the prior art, the invention has the following beneficial effects. (1) The invention is based on the massive operating data accumulated by the industrial big data platforms of the power generation groups, is intelligent and digital in character, does not require extensive prior knowledge or assumptions, and is more practical and operable. (2) Using the Wasserstein distance as the loss function addresses the instability and mode collapse of traditional GAN training and further strengthens the generalization and robustness of the fault diagnosis model for the hydroelectric generating set. (3) Differentiable data augmentation ties data augmentation closely to the model training process, so that the model can learn a suitable augmentation strategy during training and adjust it automatically through the loss function and gradient descent to improve model performance. (4) The invention works from small-sample fault data of the hydroelectric generating set: the differentiable-augmentation generative adversarial network first expands the data set, features are then extracted from the expanded data set and the original data set, and finally the feature vectors are input into the fault diagnosis model for training, which improves the accuracy of the model.
Drawings
Fig. 1 is a schematic flow diagram of the present invention.
Fig. 2 is a schematic diagram of the generative adversarial network of the present invention.
FIG. 3 is a detailed information schematic diagram of cavitation signals of the hydroelectric generating set.
Fig. 4 is a schematic diagram of the time domain waveform of the hydraulic turbine signal in different states of the present invention.
Fig. 5 is a schematic diagram of the network architecture of the generator of the present invention.
Fig. 6 is a schematic diagram of a network structure of a discriminator of the invention.
Fig. 7 is a schematic diagram of time-frequency domain waveforms of a real signal and a generated signal under different working conditions of the present invention.
FIG. 8 is a graph of the error between a real sample and a generated sample under different conditions of the present invention.
FIG. 9 is a comparative schematic diagram of the classification effect of the present invention.
FIG. 10 is a graph showing the comparison of the effects of different sample numbers on SVM classification effects according to the present invention.
FIG. 11 is a schematic diagram showing the effect of different generation sample rates on SVM model accuracy.
Detailed Description
In actual operation of a hydroelectric generating set, the local pressure at the flow-passing components of the water turbine can fall below the vapor pressure of water, so that the liquid vaporizes and forms bubbles. When the bubbles flow into a high-pressure region they collapse suddenly, producing shock waves and high-speed water jets that damage the surfaces of the flow-passing components. This cavitation wears the material on the surface of the turbine blades or even causes it to fall off, which reduces the structural integrity of the blades and affects the performance and service life of the water turbine. The collapse of cavitation bubbles in the high-pressure region also produces impact forces and local pressure fluctuations, which cause abnormal noise and vibration in the water turbine and affect the operating stability of the equipment. The invention provides a fault diagnosis method for hydropower units based on small-sample learning, aimed at cavitation faults of the water turbine.
As shown in Fig. 1, the small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM builds on the generative adversarial network (GAN), the Wasserstein generative adversarial network (WGAN) and differentiable data augmentation (DA), and then uses the obtained data for fault analysis and diagnosis based on a small sample of hydroelectric generating set data.
As shown in fig. 1, the specific steps are as follows:
1) Establishing DA-WGAN model
Fig. 2 is a schematic diagram of a generative adversarial network. The generator G converts random noise samples z drawn from the distribution $p_z(z)$ into false samples G(z), and the discriminator D tries to distinguish them from real samples, using the following objective function:
$$\min_G \max_D V(D, G) = E_{x \sim p_{data}(x)}[\log D(x)] + E_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where $p_{data}(x)$ is the distribution of the real samples; E denotes expectation; D(x) is the output of the discriminator D, and a sample is judged to be real when D(x) > 0.5; G(z) is the output of the generator G and has the same dimensions as a real sample. The generative adversarial network algorithm proceeds as follows.
S1: initialize the discriminator parameters $\theta_d$ and the generator parameters $\theta_g$, which provide the basis for the following steps.
S2: draw m samples $\{x_1, x_2, \ldots, x_m\}$ from the sample set and m vectors $\{z_1, z_2, \ldots, z_m\}$ from the noise distribution.
S3: input the m vectors into the generator G to obtain m generated samples $\{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_m\}$, where $\tilde{x}_i = G(z_i)$, and update the discriminator parameters $\theta_d$ to maximize $\tilde{V}$. The formulas applied are as follows, and the GAN provided by the invention is trained accordingly:
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D(x_i) + \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(\tilde{x}_i)\right), \qquad \theta_d \leftarrow \theta_d + \eta \nabla_{\theta_d} \tilde{V}$$
where $\eta$ is the learning rate.
S4: draw m vectors $\{z_1, z_2, \ldots, z_m\}$ from the noise distribution (these need not be the same as those drawn in S2) and update the generator parameters $\theta_g$ to minimize $\tilde{V}$. The formulas applied are as follows:
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right), \qquad \theta_g \leftarrow \theta_g - \eta \nabla_{\theta_g} \tilde{V}$$
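The mini-batch objectives of S3 and S4 can be sketched as follows. This is a minimal illustration, assuming the discriminator outputs are already available as probabilities in (0, 1); the function names and the sample values are hypothetical and not taken from the patent.

```python
import numpy as np

EPS = 1e-12  # small constant that guards against log(0)

def discriminator_objective(d_real, d_fake):
    """V~ maximized in S3: mean log D(x_i) + mean log(1 - D(x~_i))."""
    return np.mean(np.log(d_real + EPS)) + np.mean(np.log(1.0 - d_fake + EPS))

def generator_objective(d_fake):
    """V~ minimized in S4: mean log(1 - D(G(z_i)))."""
    return np.mean(np.log(1.0 - d_fake + EPS))

# Hypothetical discriminator outputs for m = 4 real and 4 generated samples.
d_real = np.array([0.9, 0.8, 0.7, 0.95])
d_fake = np.array([0.1, 0.3, 0.2, 0.05])

v_d = discriminator_objective(d_real, d_fake)
v_g = generator_objective(d_fake)

# A fully confused discriminator outputs 0.5 everywhere, giving 2 * log(0.5).
v_confused = discriminator_objective(np.full(4, 0.5), np.full(4, 0.5))
```

A discriminator that separates the two batches well scores a higher `v_d` than the confused one, which is the direction in which S3 pushes $\theta_d$, while S4 pushes $\theta_g$ so that `v_g` decreases.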
In a Wasserstein generative adversarial network (WGAN), the Wasserstein distance is used as the loss function. It is defined by the following formula:
$$W(P_r, P_g) = \inf_{\pi \in \Pi(P_r, P_g)} E_{(x, y) \sim \pi}[d(x, y)]$$
where d(x, y) is the distance between x and y; $\pi(x, y)$ is a joint distribution over X and Y whose marginal distributions are $P_r$ and $P_g$ respectively; and inf denotes the infimum over all such joint distributions $\pi(x, y)$. Because this expression is very difficult to compute directly, it is converted by the Kantorovich-Rubinstein duality theorem into the following formula:
$$W(P_r, P_g) = \sup_{\|f\|_L \le 1} \left( E_{x \sim P_r}[f(x)] - E_{x \sim P_g}[f(x)] \right)$$
where sup denotes the supremum over all 1-Lipschitz continuous functions f. This dual form turns the calculation of the Wasserstein distance into an optimization problem over the function f.
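For intuition, the two forms of the Wasserstein distance can be compared on a one-dimensional toy case. The sketch below assumes two empirical distributions with equal sample sizes and d(x, y) = |x - y|, in which case the optimal coupling pairs the sorted samples; the data and function name are illustrative only.

```python
import numpy as np

def wasserstein_1d(x, y):
    """Empirical W1 for equal-size 1-D samples: mean gap between sorted values."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y)))

real = np.array([0.0, 1.0, 2.0, 3.0])
generated = real + 0.5                       # same shape, shifted by 0.5

w_same = wasserstein_1d(real, real[::-1])    # identical distributions
w_shift = wasserstein_1d(real, generated)    # primal (coupling) value

# Dual form with the 1-Lipschitz function f(x) = -x:
# E_{P_r}[f] - E_{P_g}[f] attains the same value for a pure shift.
dual_value = float(np.mean(-real) - np.mean(-generated))
```

In a WGAN the critic network plays the role of f, and its training searches for the function that maximizes the dual expression.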
As shown in Fig. 1, the DA-WGAN neural network is established next. The basic steps of differentiable data augmentation (DA) are as follows:
1) Defining the differentiable data augmentation transforms: a set of differentiable transforms is selected, such as translation, scaling and rotation.
2) "False sample" data augmentation: the generator G generates a batch of "false samples" from random noise, and the differentiable transforms are applied to them.
3) Real sample data augmentation: the differentiable transforms are applied to a batch of real samples from the training data set.
4) Merging the real samples and the generated samples: the augmented real samples are merged with the generated samples, which increases the sample size and improves the accuracy of the neural network.
5) Training the discriminator: the merged samples are input into the discriminator, the loss function is calculated, and the discriminator parameters are updated according to the loss function.
6) Training the generator: the generator generates another batch of "false samples", the loss function is calculated, and the generator parameters are updated according to the loss function; the gradient of the differentiable transforms is taken into account when computing the gradient.
7) Ending the training: steps 2 to 6 are repeated until a stopping condition is met (for example, a predetermined number of training iterations is reached or a performance indicator is satisfied).
The loss functions of the differentiable-augmentation generative adversarial network model are as follows:
$$L_D = E_{x \sim p_{data}(x)}[f_D(-D(T(x)))] + E_{z \sim p(z)}[f_D(D(T(G(z))))]$$
$$L_G = E_{z \sim p(z)}[f_G(-D(T(G(z))))]$$
where $L_D$ and $L_G$ are the loss functions of the discriminator and the generator respectively; $p_{data}(x)$ is the distribution of the real samples; E denotes expectation; D(·) is the output of the discriminator D; G(z) is the output of the generator G; $f_D$ and $f_G$ are the loss-specific functions applied in the discriminator and generator losses respectively; and T(·) is the differentiable data augmentation.
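The role of T in both losses can be illustrated with a small numerical sketch. It makes several assumptions that are not in the patent: the choice $f_D(u) = f_G(u) = u$ (a WGAN-style loss), a linear stand-in critic D(x) = w·x, and a transform T made of a circular shift plus amplitude scaling. Because this T is linear in its input, the gradient of $L_G$ with respect to a generated sample can be written in closed form and checked by finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(x, shift, scale):
    """T(x): circular translation plus amplitude scaling (linear, so differentiable)."""
    return scale * np.roll(x, shift, axis=-1)

def critic(x, w):
    """Hypothetical linear critic D(x) = w . x, standing in for the discriminator."""
    return x @ w

def da_losses(real, fake, w, shift, scale):
    """L_D and L_G with f_D(u) = f_G(u) = u; T is applied to real AND generated data."""
    d_real = critic(augment(real, shift, scale), w)
    d_fake = critic(augment(fake, shift, scale), w)
    loss_d = np.mean(-d_real) + np.mean(d_fake)
    loss_g = np.mean(-d_fake)
    return loss_d, loss_g

m, n = 8, 16
real = rng.normal(1.0, 1.0, size=(m, n))
fake = rng.normal(0.0, 1.0, size=(m, n))   # stands in for G(z)
w = rng.normal(size=n)
shift, scale = 3, 1.2

loss_d, loss_g = da_losses(real, fake, w, shift, scale)

# Chain rule through T: d L_G / d fake[i] = -scale * roll(w, -shift) / m
grad_analytic = -scale * np.roll(w, -shift) / m

# Finite-difference check on one coordinate of one generated sample.
eps = 1e-6
bumped = fake.copy()
bumped[0, 5] += eps
grad_numeric = (da_losses(real, bumped, w, shift, scale)[1] - loss_g) / eps
```

The generator gradient carries the factor `scale` and the index shift introduced by T, which is what "taking the gradient of the transform into account" in step 6 amounts to in this toy setting.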
2) Data processing and model initialization (see Fig. 1): the original data are normalized and divided into a training set and a test set, and the relevant parameters of the DA-WGAN model are initialized. The data set contains n fault types in total, and the data of each fault type are divided into a training set and a test set in a ratio of 2:1. A 32×32 two-dimensional random noise array is the input of the generator G; the 32×32 fault sample data generated by G and the corresponding 32×32 real fault sample data are the input of the discriminator D. The DA-WGAN model is trained with the Adam optimization algorithm; the learning rate of the discriminator D is set to 0.0001, the learning rate of the generator G is set to 0.0002, and the number of iterations is 40000.
3) Updating the discriminator D: generating 'false sample' data by using a generator G, inputting the generated data and the original data into a discriminator D, and independently training the discriminator D to update the parameters of the discriminator;
4) Updating the generator G: the generator G is updated on its own so that the generated "false sample" data can successfully deceive the discriminator D;
5) Repeating step 3 and step 4 until the DA-WGAN model reaches Nash equilibrium, which completes the expansion of the small sample;
6) Feature extraction: extracting features of the expanded data to construct feature vectors;
7) Model training: dividing the mixed feature vector into a training set and a test set, and inputting training set data into an SVM model for training;
8) Fault diagnosis: inputting the test set data into a trained SVM classifier, and obtaining the identification accuracy of the fault diagnosis model.
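Step 6 does not list the individual features, so the following sketch only illustrates the general idea of building a time- and frequency-domain feature vector. The chosen features (RMS, peak, kurtosis, crest factor, spectral centroid, dominant frequency) and the function name are assumptions made for illustration, not the feature set of the patent.

```python
import numpy as np

def extract_features(signal, fs):
    """Return a small time- and frequency-domain feature vector for one signal."""
    x = np.asarray(signal, dtype=float)
    centered = x - x.mean()

    # Time-domain features.
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    kurtosis = np.mean(centered ** 4) / x.std() ** 4
    crest_factor = peak / rms

    # Frequency-domain features from the one-sided amplitude spectrum.
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
    dominant = freqs[np.argmax(spectrum)]

    return np.array([rms, peak, kurtosis, crest_factor, centroid, dominant])

fs = 1_000_000                           # 1 MHz sampling frequency
t = np.arange(1024) / fs                 # one 1024-point segment
tone = np.sin(2 * np.pi * 125_000 * t)   # synthetic 125 kHz test tone
features = extract_features(tone, fs)
```

Stacking one such vector per real or generated sample gives the feature matrix that steps 7 and 8 pass to the SVM classifier.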
Following the flow shown in Fig. 1, the method is verified on the following example. Acoustic emission signals acquired in a test by the No. 5 sensor of unit #2 of the domestic BS hydropower station are selected; the frequency response range of the sensor is 50 kHz to 400 kHz, and the sampling frequency is 1 MHz. The test data are acoustic emission signals collected under three working conditions of the water turbine: idling, 30% guide vane opening, and full load. Detailed data information is shown in Fig. 3: there are 12 samples for idling, 15 samples for 30% guide vane opening, and 9 samples for full load, 36 samples in total, and each sample segment contains 1024 sampling points. Fig. 4 shows the time-domain waveforms of the corresponding signals of the water turbine in the 3 different operating states.
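The data preparation of step 2 can be sketched for this data set as follows. The sketch assumes min-max normalization per sample and assumes that each 1024-point segment is reshaped into a 32×32 array to match the network input size; the patent states neither detail explicitly, and random placeholder signals stand in for the measured data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample counts per operating condition, taken from the test description.
counts = {"idling": 12, "guide_vane_30pct": 15, "full_load": 9}
n_points = 1024  # sampling points per sample segment

# Placeholder signals; measured acoustic emission segments would be loaded here.
data = {name: rng.normal(size=(k, n_points)) for name, k in counts.items()}

def min_max_normalize(x):
    """Scale each sample to [0, 1] (the normalization scheme is an assumption)."""
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    return (x - lo) / (hi - lo)

def split_2_to_1(x, rng):
    """Shuffle one class and split it into training and test parts at 2:1."""
    idx = rng.permutation(len(x))
    n_train = (2 * len(x)) // 3
    return x[idx[:n_train]], x[idx[n_train:]]

train, test = {}, {}
for name, x in data.items():
    x = min_max_normalize(x).reshape(-1, 32, 32)  # 1024 points -> 32 x 32
    train[name], test[name] = split_2_to_1(x, rng)

n_train_total = sum(len(v) for v in train.values())
n_test_total = sum(len(v) for v in test.values())
```

With 12, 15 and 9 samples per condition, the per-class 2:1 split yields 8/4, 10/5 and 6/3 samples, that is, 24 training and 12 test samples overall.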
Steps 1 to 8 complete the process that builds on the generative adversarial network, the Wasserstein generative adversarial network and differentiable data augmentation, after which the obtained data are used for fault analysis and diagnosis based on a small sample of hydroelectric generating set data. The generated samples are evaluated with the root mean square error (RMSE), the mean absolute error (MAE) and the correlation coefficient (r), which are calculated as follows:
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$
$$r = \frac{Cov(y, \hat{y})}{\sqrt{VAR(y)\,VAR(\hat{y})}}$$
where N is the dimension of the sample; $y_i$ is the original sample; $\hat{y}_i$ is the generated sample; and VAR and Cov are the variance and covariance respectively. Fig. 7 shows the time-domain and frequency-domain waveforms of the real signals and the generated signals for water turbine idling, 30% guide vane opening and full load respectively. As can be seen from Fig. 7, the fault data generated by the DA-WGAN are highly similar to the actual fault data in their time-domain waveforms. Further, a fast Fourier transform is applied to the actual data and the generated data. As shown in Fig. 7, the spectra of the two are strongly similar, and their amplitudes clearly coincide near 75 kHz and 175 kHz, which indicates that the method effectively captures the distribution characteristics of the actual data. When original sample data are scarce, combining them with generated samples enriches the feature description of the sample data and thereby improves the generalization performance and fault diagnosis capability of the model. In addition, to measure the error between the generated samples and the original samples, the invention quantifies it by calculating the root mean square error and the mean absolute error between the actual samples and the generated samples; the results are shown in Fig. 8. Finally, time-domain and frequency-domain features are extracted from the vibration signals in the 3 fault states, and feature vectors are constructed.
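The three evaluation metrics defined above (RMSE, MAE and r) translate directly into code. The sketch below uses a short made-up pair of vectors rather than measured signals.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error between an original and a generated sample."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """Mean absolute error between an original and a generated sample."""
    return float(np.mean(np.abs(y - y_hat)))

def correlation(y, y_hat):
    """Correlation coefficient r = Cov(y, y_hat) / sqrt(VAR(y) * VAR(y_hat))."""
    cov = np.mean((y - y.mean()) * (y_hat - y_hat.mean()))
    return float(cov / np.sqrt(y.var() * y_hat.var()))

# Illustrative values only: the generated sample is the original plus 0.5.
y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.5, 2.5, 3.5, 4.5])

err_rmse = rmse(y, y_hat)
err_mae = mae(y, y_hat)
r = correlation(y, y_hat)
```

A constant offset leaves r at 1 while RMSE and MAE both equal the offset, which shows why the error metrics and the correlation coefficient are reported together.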
As shown in Fig. 1, the generated data samples and the original samples are mixed, divided into a training set and a test set in a ratio of 2:1, and input into the SVM model. For comparison, the original samples alone are likewise divided into a training set and a test set in a ratio of 2:1 and input into the SVM model. Each test is repeated 100 times, and the results are shown in Fig. 8. As can be seen from Fig. 8, the SVM classifier trained on the original samples alone is unstable because of the small number of training samples: over the 100 tests, the minimum accuracy is only 50% and the maximum accuracy is 100%. After the number of samples is doubled, both the average accuracy and the minimum accuracy of the trained model improve markedly: the average accuracy rises from 81.58% to 92.29%, and the minimum accuracy rises by 25%. The expanded samples generated by the DA-WGAN therefore describe the characteristics of the fault samples more accurately and can be used to expand the fault sample data set of the hydroelectric generating set, which greatly improves the generalization and robustness of the fault diagnosis model.
To study the influence of the number of generated samples on the accuracy of the fault diagnosis model, generated sample data amounting to 1 to 8 times the original samples are added to the original samples in turn. The mixed sample data are divided into a training set and a test set in a ratio of 2:1 and input into the SVM model, the test is repeated 100 times, and the results are shown in Fig. 9. As can be seen from Fig. 9, the classification performance of the SVM model improves as the number of generated samples increases: the minimum accuracy rises gradually and reaches 88.10% after 6 times the generated samples are added. The maximum accuracy of the model drops when 4 times the generated samples are added, and the average accuracy of the model shows no obvious correlation with the multiple of generated samples added; the results are shown in Fig. 10.
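The evaluation protocol described above (a random 2:1 split, repeated 100 times, reporting minimum, average, and maximum accuracy) can be sketched as follows. This is only an illustration: to keep the sketch dependency-free, a nearest-centroid classifier stands in for the SVM, and the function names, the seed, and the toy three-class data are all assumptions rather than anything taken from the patent.

```python
import random
import statistics

def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT an SVM): store the mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def nearest_centroid_predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(features, centroids[label])),
    )

def repeated_split_accuracy(X, y, trials=100, train_fraction=2 / 3, seed=0):
    """Repeat a random 2:1 train/test split and return (min, mean, max) accuracy."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    accuracies = []
    for _ in range(trials):
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train, test = indices[:cut], indices[cut:]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return min(accuracies), statistics.mean(accuracies), max(accuracies)

# Toy feature vectors (assumption): three well-separated fault classes.
data_rng = random.Random(1)
X, y = [], []
for label, centre in enumerate([0.0, 10.0, 20.0]):
    for _ in range(10):
        X.append([centre + data_rng.uniform(-1, 1), centre + data_rng.uniform(-1, 1)])
        y.append(label)

lowest, average, highest = repeated_split_accuracy(X, y, trials=100)
print("min / mean / max accuracy:", lowest, average, highest)
```

To mimic the study of 1 to 8 times the generated samples, the same function could be called once per multiple, with `X` and `y` extended by the corresponding number of generated feature vectors.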
As shown in Figs. 1 and 2, and as the analysis of Figs. 8 and 9 shows, adding 1 times the generated samples to the original samples clearly increases the generalization and robustness of the model, so the method has practical engineering significance. After generated samples of different multiples are added to the original samples, the average accuracy of the model does not change significantly, but the maximum accuracy drops when 4 times the generated samples are added. This is because the features of the sample data produced by the differentiable data augmentation generative adversarial network are unstable to some extent: the features of the generated data deviate from the feature distribution of the original samples, which affects the performance of the classifier. In summary, in practical engineering applications, making reasonable use of generated samples to expand the training data set is an effective strategy for improving the generalization ability and robustness of the model, but attention must also be paid to the quality of the generated samples and to the consistency of their features with those of the original samples. Finally, the test set data are input into the trained SVM classifier to obtain the recognition accuracy of the fault diagnosis model. Herein, DA denotes differentiable data augmentation (the optimization technique), WGAN denotes the Wasserstein generative adversarial network, and SVM denotes the support vector machine.
Through the above technical solution, as shown in Fig. 1, the invention achieves the following objects.
(1): Addressing the problem that the large volume of unit operation data held on the industrial big data platforms of the power generation groups lacks scientific and effective analysis and use, the invention makes effective use of historical data: the algorithm is trained on historical data to mine the information hidden in the data, thereby realizing intelligent fault diagnosis.
(2): The invention adopts a support vector machine algorithm, which reduces the dependence of locating fault causes and fault locations on the staff's skill in analyzing monitoring data and on their practical experience. Used as an auxiliary decision-making tool, it reduces the subjectivity and limitations of the diagnosis process.
(3): The invention analyzes hydroelectric generating set data on the basis of small-sample fault data. Because most unit monitoring data in the prior art are unlabeled, the method is widely applicable. In application, the original vibration signals are augmented and the hydroelectric generating set fault signal data set is expanded; the SVM model is then trained on the expanded data set, which strengthens the generalization ability of the hydroelectric generating set fault diagnosis model.
(4): Taking cavitation erosion faults of the water turbine as the research object, the invention verifies the proposed hydropower unit fault diagnosis method based on small-sample learning. The expanded samples generated by the DA-WGAN accurately describe the characteristics of the fault samples and can be used to expand the fault sample data set of the hydropower unit, greatly improving the generalization and robustness of the fault diagnosis model.
(5): The invention adopts the Wasserstein distance as the loss function to address the instability and mode collapse of the traditional GAN training process. Compared with traditional measures such as the Jensen-Shannon divergence and the Kullback-Leibler divergence, the Wasserstein distance has better gradient properties and helps avoid vanishing gradients, which better ensures the performance of the DA-WGAN-SVM-based small-sample fault diagnosis method for hydropower units.
(6): Applying intelligent algorithms to fault diagnosis in the operation and maintenance management of power station units raises the station's level of management technology, allows the station to serve as a benchmark for others, strengthens its capacity for innovation within the power generation industry, and also raises the quality of its workforce, thereby earning a good public image and social benefits.
(8): The technology enables more intelligent operation and maintenance management of the power station, effectively improves the stability of the unit throughout its operation, prolongs the service life of the unit, and safeguards the economic benefits of the power plant.
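The gradient argument in item (5) can be illustrated with a small numerical sketch. The setup is an assumption chosen for clarity and is not taken from the patent: the real and generated distributions are each a single point mass on a one-dimensional grid, and the helper names are hypothetical. When the two supports do not overlap, the Jensen-Shannon divergence stays at log 2 however far apart they are, so it gives the generator no signal about direction, while the Wasserstein-1 distance grows with the separation.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL sum.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)

def wasserstein_1d(p, q, bin_width=1.0):
    """Wasserstein-1 distance on an evenly spaced grid: the summed |CDF_p - CDF_q|."""
    total = 0.0
    cdf_gap = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap) * bin_width
    return total

def point_mass(position, size=10):
    """Histogram with all probability mass in one bin."""
    return [1.0 if i == position else 0.0 for i in range(size)]

real = point_mass(0)
for shift in (1, 3, 6):
    generated = point_mass(shift)
    print(shift, js_divergence(real, generated), wasserstein_1d(real, generated))
```

In this toy case the Jensen-Shannon column is constant while the Wasserstein column equals the shift, which is the behaviour the item appeals to.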
While the basic principles, main features, and advantages of the present invention have been shown and described, it will be apparent to those skilled in the art that the present invention is not limited to the details of the foregoing exemplary embodiments, and that it can be embodied in other specific forms without departing from the spirit or essential characteristics of the invention. The embodiments are described in this way for the sake of clarity only, and the invention is not confined to a single embodiment. The embodiments are, therefore, to be considered in all respects as illustrative and not restrictive; the scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (7)

1. A small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM, characterized in that a generative adversarial network, a Wasserstein generative adversarial network, and differentiable data augmentation are first carried out, and the obtained data are then used for fault analysis and diagnosis based on small samples of the hydroelectric generating set, the method specifically comprising the following steps:
Step A: data processing and model initialization, specifically, normalizing the original data, dividing the data into a training set and a test set, and initializing the relevant parameters of the DA-WGAN model;
Step B: updating the discriminator D, specifically, generating "false sample" data with the generator G, inputting the generated data and the original data into the discriminator D, and training the discriminator D alone to update its parameters;
Step C: updating the generator G, specifically, updating the generator G alone so that the generated false sample data successfully deceive the discriminator D;
Step D: repeating Step B and Step C until the DA-WGAN model reaches Nash equilibrium, completing the expansion of the small sample;
Step E: extracting features from the expanded data to construct feature vectors;
Step F: model training, specifically, dividing the mixed feature vectors into a training set and a test set, and inputting the training set data into the SVM model for training;
Step G: fault diagnosis, specifically, inputting the test set data into the trained SVM classifier to obtain the recognition accuracy of the fault diagnosis model.
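As an illustration of Step A (normalization) and Step E (feature extraction), the following Python sketch normalizes a vibration signal and builds a small feature vector. It is a sketch under stated assumptions: the claim does not name the normalization scheme or the individual features, so min-max scaling, the five time-domain statistics, and the single frequency-domain feature below are hypothetical choices, as are all function names.

```python
import math

def min_max_normalize(signal):
    """Scale a signal linearly to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in signal]
    return [(v - lo) / span for v in signal]

def time_domain_features(signal):
    """Return [mean, RMS, peak, crest factor, kurtosis] (assumed feature set)."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    var = sum((v - mean) ** 2 for v in signal) / n
    kurt = (sum((v - mean) ** 4 for v in signal) / n) / (var ** 2) if var > 0 else 0.0
    crest = peak / rms if rms > 0 else 0.0
    return [mean, rms, peak, crest, kurt]

def dominant_frequency(signal, sample_rate):
    """Frequency of the largest spectral line, found with a naive O(n^2) DFT."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(signal))
        im = -sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(signal))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sample_rate / n

def feature_vector(signal, sample_rate):
    """Concatenate time-domain and frequency-domain features of a normalized signal."""
    normalized = min_max_normalize(signal)
    return time_domain_features(normalized) + [dominant_frequency(signal, sample_rate)]

# Toy signal (assumption): a 5 Hz tone sampled at 64 Hz for one second.
tone = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
print(feature_vector(tone, sample_rate=64))
```

For signals of realistic length, a fast Fourier transform from a numerical library would replace the naive DFT used here.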
2. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 1, characterized in that the generative adversarial network is a neural network composed of two networks, a generator G and a discriminator D; the goal of the generator is to generate data that are as realistic as possible, so that the discriminator cannot tell the generated data from the actual data, and the goal of the discriminator is to distinguish the generated data from the actual data as correctly as possible; specifically, the generator G converts random noise samples z drawn from the distribution $p_z(z)$ into false samples G(z), and the discriminator D tries to distinguish them from the real samples, the distinction being realized by the objective function of the generative adversarial network.
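For orientation, the objective function of a standard generative adversarial network is usually written as the minimax problem below. This is the textbook form, given here as an assumption; the symbol $p_{\mathrm{data}}(x)$ for the real-sample distribution is not taken from this text, and the claim's own formula may use different notation.

```latex
\min_{G}\max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D\left(G(z)\right)\right)\right]
```

The discriminator D maximizes V by scoring real samples high and generated samples low, while the generator G minimizes V by producing samples that D scores as real.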
3. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 1, characterized in that the specific flow of the generative adversarial network algorithm is as follows:
1): initializing the parameters $\theta_d$ of the discriminator D and the parameters $\theta_g$ of the generator G;
2): drawing m samples from the sample set and m vectors from the noise distribution;
3): inputting the m vectors into the generator G to obtain m generated samples, and updating the parameters $\theta_d$ of the discriminator D so as to maximize the discriminator objective;
4): drawing m vectors from the noise distribution and updating the parameters $\theta_g$ of the generator G according to the generator objective.
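The two parameter updates in this flow are commonly written as minibatch gradient steps, as sketched below. This is the standard formulation and is offered as an assumption: the symbols $\tilde{V}_D$, $\tilde{V}_G$, the learning rate $\eta$, and the sample superscripts are hypothetical notation, not taken from this text.

```latex
% Step 3): gradient ascent on the discriminator objective
\tilde{V}_D = \frac{1}{m}\sum_{i=1}^{m}\log D\left(x^{(i)}\right)
            + \frac{1}{m}\sum_{i=1}^{m}\log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right),
\qquad
\theta_d \leftarrow \theta_d + \eta\,\nabla_{\theta_d}\tilde{V}_D

% Step 4): gradient descent on the generator objective
\tilde{V}_G = \frac{1}{m}\sum_{i=1}^{m}\log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right),
\qquad
\theta_g \leftarrow \theta_g - \eta\,\nabla_{\theta_g}\tilde{V}_G
```

Here $x^{(i)}$ are the m real samples and $z^{(i)}$ the m noise vectors drawn in step 2) or step 4).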
4. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 1, characterized in that, in the Wasserstein generative adversarial network, the Wasserstein distance is adopted as the loss function.
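For reference, the Wasserstein-1 distance between the real distribution and the generated distribution is usually defined as below, together with the dual form that WGAN training optimizes in practice. This is the standard definition, stated as an assumption; the symbols $P_r$, $P_g$, $\gamma$, and $f$ are conventional notation and are not taken from this text.

```latex
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right]

% Kantorovich-Rubinstein dual, with f ranging over 1-Lipschitz functions
W(P_r, P_g) = \sup_{\lVert f \rVert_{L} \le 1}
  \mathbb{E}_{x \sim P_r}\left[f(x)\right] - \mathbb{E}_{x \sim P_g}\left[f(x)\right]
```

$\Pi(P_r, P_g)$ denotes the set of joint distributions whose marginals are $P_r$ and $P_g$; in the dual form the discriminator plays the role of $f$.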
5. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 1, characterized in that the steps of the differentiable data augmentation are as follows: 1): defining the differentiable data augmentation transformations; 2): augmenting the "false sample" data; 3): augmenting the real sample data; 4): merging the real samples and the generated samples; 5): training the discriminator; 6): training the generator; 7): finishing training.
6. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 5, characterized in that:
in step 1), a group of differentiable data augmentation transformations, such as translation, scaling, and rotation, is selected;
in step 2), the generator G is used to generate a batch of false samples from random noise, and the differentiable data augmentation transformations are applied to them;
in step 3), the differentiable data augmentation transformations are applied to a batch of real samples from the training data set;
in step 4), the augmented real samples are combined with the augmented generated samples;
in step 5), the combined samples are input into the discriminator, the loss function is calculated, and the parameters of the discriminator are updated according to the loss function;
in step 6), another batch of false samples is generated with the generator, the loss function is calculated, and the parameters of the generator are updated according to the loss function, the gradient of the differentiable data augmentation transformation being taken into account when the gradient is calculated;
in step 7), steps 2) to 6) are repeated until the stopping condition is met.
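The requirement in step 6) that the gradient pass through the augmentation can be illustrated with a small Python sketch. Everything here is a simplifying assumption made for illustration: the transformation T is a circular translation followed by amplitude scaling, the discriminator is a fixed linear critic, the loss is the negated critic score, and all names are hypothetical. The point shown is only that the generator's gradient is obtained by back-propagating through T.

```python
def augment(signal, shift, scale):
    """Transformation T: circular translation by `shift` samples, then scaling."""
    n = len(signal)
    return [scale * signal[(i - shift) % n] for i in range(n)]

def augment_backward(grad_output, shift, scale):
    """Back-propagate through T: send each output gradient to the input it came from."""
    n = len(grad_output)
    grad_input = [0.0] * n
    for i in range(n):
        grad_input[(i - shift) % n] += scale * grad_output[i]
    return grad_input

def critic(signal, weights):
    """Toy linear discriminator D(x) = w . x (assumption, for illustration only)."""
    return sum(w * v for w, v in zip(weights, signal))

def generator_loss(fake, weights, shift, scale):
    """Generator loss -D(T(G(z))) evaluated on a generated sample `fake`."""
    return -critic(augment(fake, shift, scale), weights)

def generator_loss_grad(fake, weights, shift, scale):
    """Gradient of the generator loss with respect to `fake`, via the chain rule.

    For the linear critic, the gradient with respect to T(fake) is -weights,
    and it is then routed back through the augmentation.
    """
    return augment_backward([-w for w in weights], shift, scale)

fake = [0.5, -1.0, 2.0, 0.25]
weights = [1.0, -2.0, 0.5, 3.0]
print(generator_loss_grad(fake, weights, shift=1, scale=1.5))
```

If T were not differentiable, this backward routing would not exist and the generator would receive no gradient from the augmented samples, which is why the claim restricts the choice to differentiable transformations.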
7. The small-sample fault diagnosis method for a hydroelectric generating set based on DA-WGAN-SVM according to claim 1, characterized in that the loss function of the differentiable data augmentation generative adversarial network model is represented by the following formula:
$$L_G = \mathbb{E}_{z \sim p(z)}\left[f_G\left(-D\left(T\left(G(z)\right)\right)\right)\right]$$
CN202311284791.0A 2023-10-07 2023-10-07 Hydropower unit small sample fault diagnosis method based on DA-WGAN-SVM Pending CN117370874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311284791.0A CN117370874A (en) 2023-10-07 2023-10-07 Hydropower unit small sample fault diagnosis method based on DA-WGAN-SVM

Publications (1)

Publication Number Publication Date
CN117370874A true CN117370874A (en) 2024-01-09

Family

ID=89392082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311284791.0A Pending CN117370874A (en) 2023-10-07 2023-10-07 Hydropower unit small sample fault diagnosis method based on DA-WGAN-SVM

Country Status (1)

Country Link
CN (1) CN117370874A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination