CN117290685A - Power plant power equipment expert diagnosis system and method based on historical data - Google Patents
- Publication number
- CN117290685A (application number CN202311421206.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- time
- model
- analysis
- trend
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/02—Preprocessing
- G06F2218/04—Denoising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a power plant power equipment expert diagnosis system and method based on historical data, comprising the following steps: step 1: collecting historical data from the power equipment, performing quality inspection on the data, processing missing values, abnormal values and noise, adding a time stamp, and ensuring time sequence properties of the data; step 2: performing time series analysis to identify and model seasonal, trending and periodic components in the data; step 3: extracting features based on results of the time series data analysis; step 4: a diagnostic model is built to process the time series data, the model is trained using historical data, including marked fault samples and normal operation samples, and cross-validation and performance assessment of the model is performed to ensure accuracy and generalization capability of the model. The invention provides a powerful and flexible tool when solving the problem of time sequence data analysis, can improve the diagnosis and maintenance flow of the power equipment and improve the reliability and performance of the equipment.
Description
Technical Field
The invention relates to the technical field of equipment diagnosis, in particular to a power plant power equipment expert diagnosis system and method based on historical data.
Background
Historical data of power equipment can be used to diagnose that equipment; such techniques are commonly referred to as power equipment fault diagnosis or power equipment health monitoring. However, power equipment data are usually time series data, so the diagnosis must handle time series characteristics such as seasonality, trend and periodicity, and must also cope with non-stationarity and sudden events in the data.
Disclosure of Invention
In order to solve the problems, the invention provides a power plant power equipment expert diagnosis system and a power plant power equipment expert diagnosis method based on historical data.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
In one aspect, the invention discloses a power plant power equipment expert diagnosis method based on historical data, which comprises the following steps:
step 1: collecting historical data from the power equipment, performing quality inspection on the data, processing missing values, abnormal values and noise, adding a time stamp, and ensuring time sequence properties of the data;
step 2: performing a time series analysis including seasonal, trend and periodicity analysis, identifying and modeling seasonal, trend and periodicity components in the data;
step 3: extracting features based on results of the time series data analysis;
step 4: a diagnostic model is built to process the time series data, the model is trained using historical data, including marked fault samples and normal operation samples, and cross-validation and performance assessment of the model is performed to ensure accuracy and generalization capability of the model.
Further: the step 1 comprises the following steps:
data collection:
deploying sensors and data acquisition equipment on the power equipment to record indicators in real time; setting a data acquisition period to determine the sampling frequency of the data points;
performing quality inspection on the acquired data: identifying and processing missing values, detecting abnormal values, and reducing noise to smooth the data;
adding time stamp information to the data to ensure that the data points are arranged in time order so as to construct a time series;
classifying and storing the preprocessed data by equipment and indicator to facilitate subsequent analysis;
recording all data quality inspection and preprocessing steps, and storing the original data and the processed data for verification and audit.
Further: the step 2 comprises the following steps:
seasonal analysis: for identifying periodic fluctuations in data, comprising the steps of:
performing a stationarity test on the time series data to check that the data are stationary over time;
calculating an autocorrelation function and a partial autocorrelation function graph of the data;
determining the periodicity of the seasonal component according to the pattern of the autocorrelation function and the partial autocorrelation function;
trend analysis: trend analysis is used to identify long-term trends in data, and whether the data exhibits increasing or decreasing trends, comprising the steps of:
smoothing the data using a moving average method to reduce noise;
calculating a linear regression trend line of the smoothed data, wherein the slope of the regression line represents the direction and speed of the trend;
statistical testing of the trend is performed to determine if the trend is significant;
periodic analysis: a periodicity analysis for identifying a periodic component in data, comprising the steps of:
converting the time series data to the frequency domain using the Fourier transform;
identifying frequency components having significant amplitudes in the frequency domain;
quantifying the contribution of each frequency component by calculating Fourier coefficients.
Further: the step 3 comprises the following steps:
sliding window feature:
selecting a window size, expressed in terms of number of data points or time units;
sliding the window in turn, moving one time step at a time, starting from the starting point of the time series data;
calculating statistical features within each window;
for periodic data, computing frequency domain features from the Fourier transform to capture periodic components;
the formula is as follows:
$$X_i = \{x_t, x_{t-1}, \ldots, x_{t-w+1}\}$$
where $x_t$ represents the time series data at time step $t$, $w$ represents the window size, and $X_i$ is the window of the $w$ most recent observations ending at $t$;
time-lag features:
expressed in the following manner:
for each observation x(t), lag terms x(t-1), x(t-2), ..., x(t-k) are introduced, where k represents the number of lagged time steps.
Further: the step 4 comprises the following steps:
dividing the feature-engineered historical data into a training set, a validation set and a test set;
establishing an LSTM-based power equipment diagnosis model;
training the power equipment diagnosis model by using a training set, and defining a loss function during training;
forward and backward propagation is performed using training data, and model parameters are updated to minimize the loss function.
In another aspect, the invention discloses a historical data-based power plant power equipment expert diagnostic system, comprising:
data collection and preprocessing module: collecting historical data from the power equipment, performing quality inspection on the data, processing missing values, abnormal values and noise, adding a time stamp, and ensuring time sequence properties of the data;
a time sequence data analysis module: performing a time series analysis including seasonal, trend and periodicity analysis, identifying and modeling seasonal, trend and periodicity components in the data;
the feature engineering extraction module: extracting features based on results of the time series data analysis;
the diagnostic model building module: a diagnostic model is built to process the time series data, the model is trained using historical data, including marked fault samples and normal operation samples, and cross-validation and performance assessment of the model is performed to ensure accuracy and generalization capability of the model.
Compared with the prior art, the invention has the following technical progress:
In conventional methods, time series data analysis usually requires manually selecting a suitable model and parameters, whereas the deep learning-based method can automatically capture temporal patterns in the data without manual model adjustment. Power plant data often contain complex seasonal, trend and periodic patterns; deep learning-based methods can better capture these patterns and can even accommodate irregular periodicity. The deep learning model has stronger generalization capability when handling non-stationarity and sudden events and can diagnose previously unseen data, whereas the generalization capability of conventional methods is generally poor. Conventional methods typically require complex feature engineering to extract information from time series data, whereas deep learning-based methods can simplify feature engineering by learning to extract useful features from the raw data. The deep learning-based method can process new data rapidly and achieve a faster response time in diagnosis, which helps in responding to sudden events in time. The deep learning model can be customized and extended according to the data characteristics of different power equipment and is therefore suitable for various types of power equipment. The invention thus provides a more powerful and flexible tool for time series data analysis, can improve the diagnosis and maintenance workflow of power equipment, and can improve the reliability and performance of the equipment.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
In the drawings:
FIG. 1 is a flow chart of the present invention.
Detailed Description
The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
Example 1
As shown in FIG. 1, the invention discloses a power plant power equipment expert diagnosis method based on historical data, which comprises the following steps:
step 1: data collection and preprocessing
Collecting historical data from the power equipment, including various indicators such as current, voltage, temperature and vibration; performing quality inspection on the data, and processing missing values, abnormal values and noise; and adding a time stamp to ensure the time series nature of the data.
Step 2: time series data analysis
Performing basic time series analysis, including seasonal analysis, trend analysis and periodicity analysis; statistical methods or signal processing techniques such as the Fourier transform may be used; seasonal, trend and periodic components in the data are identified and modeled to understand the fundamental patterns of the data.
Step 3: feature engineering
Extracting useful features such as rolling statistical features, periodic features and rates of change based on the results of the time series data analysis; time-lag features are considered to capture delayed effects of changes; the sliding window technique is considered to generate statistics within each window.
Step 4: establishing a diagnostic model
Selecting a diagnostic model to process the time series data; training a model using historical data, including a marked failure sample and a normal operation sample; the models are cross-validated and performance evaluated to ensure accuracy and generalization ability of the models.
The method combines technical means such as data preprocessing, time sequence data analysis, characteristic engineering, machine learning and the like, so that the historical data of the power equipment can be used for diagnosis, the problem of time sequence data analysis is solved, and the method can play an important role in monitoring the state of the power equipment in real time and discovering potential problems in advance, thereby improving the reliability and efficiency of the power equipment.
Specifically, step 1 includes:
Data collection: sensors and data acquisition devices are deployed on the power equipment to record various indicators, such as current, voltage, temperature and vibration, in real time. The data acquisition period is set to determine the sampling frequency of the data points, typically in seconds or minutes. The collected data are stored in a database or data warehouse for further processing and analysis.
Checking data quality: the acquired data is subjected to quality inspection, missing values are identified and processed, and interpolation methods can be used to fill in the missing data. Outlier detection is performed, and outliers are identified and processed using statistical methods or outlier detection algorithms. Filtering techniques are applied to reduce noise, such as average filtering or median filtering, to smooth the data.
Adding a time stamp: adding time stamp information to the data ensures that the data points are arranged in time order to construct a time series. The time stamp may be marked with the system time or network time of the device, ensuring the time-series nature of the data.
Data storage structure: the preprocessed data are stored classified by equipment and indicator for subsequent analysis. A database or time series database is used to support efficient data retrieval and querying.
Data cleaning documentation: all data quality inspection and preprocessing steps are recorded, including the methods used to handle missing values, outliers and noise. The raw data and the processed data are saved for future verification and auditing.
Through the steps, an effective data collection and preprocessing flow can be established to ensure the quality and time series nature of the historical data, which provides a reliable data basis for subsequent time series data analysis and feature engineering.
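The quality-inspection operations described above (time ordering, interpolation of missing values, outlier handling and median filtering) can be sketched for a single signal as follows. This is a minimal illustration rather than the method fixed by the invention: the function name `clean_series`, the robust z-score rule based on the median absolute deviation, the threshold of 3.5 and the kernel size of 3 are all assumptions chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np

def clean_series(timestamps, values, z_thresh=3.5, kernel=3):
    """Sort by time, drop outliers, fill gaps, and median-filter one signal.

    `kernel` is assumed to be an odd window length for the median filter.
    """
    t = np.asarray(timestamps, dtype=float)
    x = np.asarray(values, dtype=float)

    # Arrange the data points in time order so they form a proper time series.
    order = np.argsort(t, kind="stable")
    t, x = t[order], x[order]

    # Outlier detection with a robust z-score (median absolute deviation).
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))
    if mad > 0:
        robust_z = 0.6745 * (x - med) / mad
        x = np.where(np.abs(robust_z) > z_thresh, np.nan, x)

    # Missing values (and removed outliers) are filled by linear interpolation.
    missing = np.isnan(x)
    x[missing] = np.interp(t[missing], t[~missing], x[~missing])

    # Median filtering to reduce noise; the edges are padded by repetition.
    half = kernel // 2
    padded = np.pad(x, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel)
    return t, np.median(windows, axis=1)
```

Both the raw input and the returned arrays would be stored, in line with the record-keeping requirement above.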
Specifically, step 2 includes:
seasonal analysis:
seasonal analysis is used to identify periodic fluctuations in the data, typically caused by seasonal factors such as daily, weekly or yearly seasonality, and includes the steps of:
firstly, a stationarity test is carried out on the time series data to check that the data are stationary over time. One common method is the ADF (Augmented Dickey-Fuller) test;
then, the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the data are calculated. These plots can be used to determine the lag values of the seasonal components;
from the patterns of the ACF and PACF, the period of the seasonal component is determined, e.g., seasonal fluctuations occur every 7 days.
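As a rough illustration of reading the seasonal period off the ACF, the sketch below computes the sample autocorrelation with NumPy and returns the lag of its strongest local peak. The helper names `autocorrelation` and `dominant_seasonal_lag` are hypothetical, the peak-picking rule is an assumption made for this example, and the stationarity test itself is not shown (in practice a library routine such as `adfuller` from statsmodels could be used for the ADF test).

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / denom for k in range(max_lag + 1)])

def dominant_seasonal_lag(x, max_lag):
    """Return the lag (>= 2) whose ACF value is the largest local peak, or None."""
    acf = autocorrelation(x, max_lag)
    peaks = [k for k in range(2, max_lag)
             if acf[k] > acf[k - 1] and acf[k] >= acf[k + 1]]
    return max(peaks, key=lambda k: acf[k]) if peaks else None
```

For daily data with a weekly pattern, the strongest peak would be expected near lag 7, matching the example given above.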
Trend analysis:
trend analysis is used to identify long-term trends in data, and whether the data exhibits increasing or decreasing trends, and includes the steps of:
smoothing the data using a moving average method or a weighted moving average method to reduce noise;
calculating a linear regression trend line of the smoothed data using the least squares method, wherein the slope of the regression line represents the direction and speed of the trend;
statistical tests of the trend, such as a significance test of the slope, are performed to determine if the trend is significant.
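The three trend-analysis steps can be sketched as follows, assuming NumPy. The function names are hypothetical, and the significance check shown is the ordinary least-squares t-statistic of the slope, which is only one possible choice of test; because smoothing makes neighbouring residuals correlated, this statistic tends to overstate significance and should be read as indicative.

```python
import numpy as np

def moving_average(x, window):
    """Simple moving average; the output is shorter than the input by window - 1."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def linear_trend(y):
    """Fit y = a + b*t by least squares; return slope b and its t-statistic."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    b, a = np.polyfit(t, y, 1)           # coefficients, highest degree first
    resid = y - (a + b * t)
    s2 = resid @ resid / (len(y) - 2)    # residual variance
    se_b = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))
    return b, b / se_b
```

A slope whose absolute t-statistic is well above about 2 would usually be treated as a significant increasing or decreasing trend.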
Periodic analysis:
the periodicity analysis is used to identify periodic components in the data, which components, unlike seasonal, may have irregular periods, the periodicity analysis comprising the steps of:
converting the time series data into a frequency domain by using a Fourier transform, wherein the Fourier transform can decompose the data into fluctuation components with different frequencies;
in the frequency domain, frequency components with significant amplitude are identified, which frequencies correspond to the periodicity of the data;
the contribution of each frequency component can be quantified by calculating Fourier coefficients or power spectral densities.
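A minimal sketch of the periodicity analysis, assuming NumPy and evenly spaced samples: the real FFT gives the Fourier coefficients, and the components with the largest amplitudes are reported as candidate periods. The function name `dominant_periods` and the `top_k` parameter are illustrative assumptions.

```python
import numpy as np

def dominant_periods(x, sample_spacing=1.0, top_k=3):
    """Return (period, amplitude) pairs for the strongest frequency components."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                  # remove the constant offset
    coeffs = np.fft.rfft(x)                           # Fourier coefficients
    freqs = np.fft.rfftfreq(len(x), d=sample_spacing)
    amplitude = 2.0 * np.abs(coeffs) / len(x)         # amplitude of each sinusoid
    order = np.argsort(amplitude[1:])[::-1][:top_k] + 1   # skip the zero frequency
    return [(1.0 / freqs[i], amplitude[i]) for i in order]
```

The squared amplitudes could be used in the same way if a power-spectrum view is preferred.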
By the steps, seasonal, trend and periodicity analysis can be carried out on time series data to understand the basic mode of the data, and the analysis results are helpful for further feature engineering and establishment of a diagnosis model.
Specifically, step 3 includes:
sliding window feature:
a sliding window technique is used that allows us to slide a fixed size window over the data and calculate statistics over each window, helping to capture the local patterns and changes of the data.
First, a window size (window length), typically expressed in number of data points or time units, is selected;
then, starting from the starting point of the time series data, sliding the window in turn, and moving one time step at a time;
within each window, various statistical features may be calculated, such as mean, standard deviation, maximum, minimum, etc., which may reflect the distribution and trend of the data within the window;
for periodic data, frequency domain features from the Fourier transform can also be computed to capture periodic components;
the sliding window is expressed by the following formula:
$$X_i = \{x_t, x_{t-1}, \ldots, x_{t-w+1}\}$$
where $x_t$ represents the time series data at time step $t$, $w$ represents the window size, and $X_i$ is the window of the $w$ most recent observations ending at $t$.
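The window definition above can be sketched with NumPy as follows. The function name is hypothetical and the four statistics are the examples named in the text; row j of the result describes the window ending at time step t = j + w - 1.

```python
import numpy as np

def sliding_window_features(x, w):
    """Mean, standard deviation, maximum and minimum of every length-w window."""
    x = np.asarray(x, dtype=float)
    # Shape (len(x) - w + 1, w); the window advances one time step per row.
    windows = np.lib.stride_tricks.sliding_window_view(x, w)
    return np.column_stack([
        windows.mean(axis=1),
        windows.std(axis=1),
        windows.max(axis=1),
        windows.min(axis=1),
    ])
```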
Time lag characteristics:
a time-lag feature is a lagged observation introduced on the time series data, which helps capture delayed effects of changes, and can be expressed in the following way:
for each observation x(t), lag terms x(t-1), x(t-2), ..., x(t-k), where k represents the number of lagged time steps, may be introduced.
These time-lag terms can be used to construct new features such as difference features (x(t) - x(t-1)) or ratio features (x(t) / x(t-1)), and the like.
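A short sketch of the lag, difference and ratio features, assuming NumPy and k >= 1. The function name is hypothetical; returning NaN for a ratio whose denominator is zero is a choice made for this example, not something the text specifies.

```python
import numpy as np

def lag_features(x, k):
    """Build x(t), x(t-1), ..., x(t-k) plus difference and ratio features."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column j holds x(t-j); rows start at t = k so that every lag exists.
    lags = np.column_stack([x[k - j:n - j] for j in range(k + 1)])
    diff = lags[:, 0] - lags[:, 1]                      # x(t) - x(t-1)
    ratio = np.divide(lags[:, 0], lags[:, 1],
                      out=np.full(n - k, np.nan),
                      where=lags[:, 1] != 0)            # x(t) / x(t-1)
    return lags, diff, ratio
```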
Through the sliding window technique and time-lag features, a rich feature set can be generated for subsequent modeling and diagnosis. These features may reflect local patterns, trends and variations in the time series data, helping to improve the performance of the diagnostic model.
Specifically, step 4 includes:
data preparation:
using the feature-engineered historical data, including time series features, sliding window features and time-lag features; the data set is divided into a training set, a validation set and a test set, typically using 70-80% of the data as the training set, 10-15% as the validation set, and the remainder as the test set.
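The split can be sketched as below. The text does not say how samples are assigned to the three sets; this example assumes a chronological split (earliest data for training, latest for testing), which is a common way to avoid leaking future information in time series work. The function name and the default fractions of 70% and 15% are illustrative.

```python
def chronological_split(n_samples, train_frac=0.7, val_frac=0.15):
    """Return slices for training, validation and test sets in time order."""
    train_end = round(n_samples * train_frac)
    val_end = train_end + round(n_samples * val_frac)
    return (slice(0, train_end),
            slice(train_end, val_end),
            slice(val_end, n_samples))
```

The returned slices can be applied directly to time-ordered feature and label arrays, for example `X[train]` and `y[train]`.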
Establishing an LSTM-based power equipment diagnosis model:
importing necessary libraries:
prior to starting modeling, Python libraries are imported, including a deep learning framework (e.g., TensorFlow or PyTorch) and data processing and model evaluation libraries.
Data preparation:
preparing the feature-engineered historical data, which include time series features, sliding window features and time-lag features, and dividing the data set into a training set, a validation set and a test set.
Establishing an LSTM model:
importing an LSTM layer and other necessary deep learning modules; constructing a sequential model (Sequential Model); adding an LSTM layer and setting hyperparameters such as the number of LSTM units and the activation function. The number of LSTM units generally reflects the complexity of the model, and multiple LSTM layers can be stacked to increase the complexity and expressive power of the model; adding a fully connected output layer for outputting diagnosis results; the number of units of the output layer should match the number of diagnostic categories, and the softmax activation function is used.
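To make the architecture concrete without assuming a particular framework, the following sketch writes out the forward pass of a single LSTM layer followed by a fully connected softmax output layer in plain NumPy. It is an illustration of the computation only: the weights are random and untrained, and the names `init_params` and `lstm_classify` and the parameter layout are assumptions made for this example. In practice the equivalent layers of TensorFlow or PyTorch would be used.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))   # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def init_params(n_features, n_units, n_classes, seed=0):
    """Random (untrained) weights; the four gates are stacked along the last axis."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (n_features + n_units, 4 * n_units)),
        "b": np.zeros(4 * n_units),
        "W_out": rng.normal(0.0, 0.1, (n_units, n_classes)),
        "b_out": np.zeros(n_classes),
    }

def lstm_classify(X, p):
    """X has shape (batch, time, features); returns class probabilities."""
    batch, steps, _ = X.shape
    n_units = p["W_out"].shape[0]
    h = np.zeros((batch, n_units))   # hidden state
    c = np.zeros((batch, n_units))   # cell state
    for t in range(steps):
        z = np.concatenate([X[:, t, :], h], axis=1) @ p["W"] + p["b"]
        i, f, o, g = np.split(z, 4, axis=1)           # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
    # Fully connected output layer with softmax, one unit per diagnostic category.
    return softmax(h @ p["W_out"] + p["b_out"])
```

The number of LSTM units (`n_units`) and the number of output units (`n_classes`) correspond to the hyperparameters discussed above.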
Model compilation:
compiling a model, and designating a loss function, an optimizer and an evaluation index; the loss function is typically the cross entropy loss of the classification problem; the optimizer selects an Adam optimizer; the evaluation index comprises accuracy, precision, recall rate and the like.
Model training:
importing necessary libraries:
before training begins, a deep learning framework (e.g., TensorFlow or PyTorch) and the necessary libraries are imported.
Defining a loss function and an optimizer:
a cross-entropy loss function (Cross-Entropy Loss) for the classification problem is defined, which is used to measure the difference between the prediction result of the model and the real label, and its formula is as follows:
$$L = -\sum_{i} y_i \log(\hat{y}_i)$$
wherein $y_i$ represents the i-th element of the real label, and $\hat{y}_i$ represents the i-th element of the predicted output of the model.
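As a small worked example of this loss, the sketch below evaluates the cross-entropy for one-hot labels with NumPy. The function name, the averaging over the batch and the clipping constant `eps` (added to avoid taking the logarithm of zero) are assumptions made for the illustration.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over samples of -sum_i y_i * log(y_hat_i) for one-hot labels."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))
```

For a true class predicted with probability 0.5, the loss is log 2, about 0.693; a perfect prediction gives a loss of 0.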
The Adam optimizer is selected as the optimization algorithm for training; Adam combines the advantages of AdaGrad and RMSProp and is suitable for most deep learning tasks.
The update rules of Adam optimizer are as follows:
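$$m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t$$
$$v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2$$
$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}$$
$$\theta_t = \theta_{t-1} - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$
wherein $g_t$ is the gradient of the loss with respect to the model parameters $\theta$, $m_t$ and $v_t$ represent the first and second moment estimates of the gradient respectively, $\hat{m}_t$ and $\hat{v}_t$ represent the bias-corrected estimates, $\beta_1$ and $\beta_2$ are the decay coefficients, $\alpha$ is the learning rate, $t$ is the number of iterations, and $\epsilon$ is a small constant that prevents the denominator from being zero.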
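The update rule can be written as a single step function. This is a sketch in NumPy with hypothetical names; the default values for the learning rate, decay coefficients and epsilon are the commonly used ones and are not specified in the text.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based iteration count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

A toy usage on the quadratic f(θ) = (θ - 3)², whose gradient is 2(θ - 3):

```python
theta, m, v = np.array([0.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.01)
```

After these iterations `theta` should lie close to the minimizer 3.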
Model training cycle:
in each training cycle (epoch), the training data are traversed batch by batch.
For each batch, the following operations are performed:
forward propagation is performed using the model to generate a prediction result.
The cross entropy loss of the batch is calculated.
Back propagation is performed and gradients of the loss with respect to the model parameters are calculated.
Model parameters are updated using the Adam optimizer.
The above steps are repeated until a predetermined number of training cycles is reached or the loss reaches a satisfactory level.
Model evaluation:
the model is evaluated using the validation set to select the best model and hyperparameters.
Loss on the validation set and the evaluation index can be monitored to determine the performance of the model.
Performance improvement:
hyperparameter tuning is performed on the model according to the validation results to improve performance.
Techniques such as adding regularization, batch normalization, etc. can be considered to improve the generalization ability of the model.
Model test:
finally, the performance of the final model is evaluated using the test set to verify its generalization performance on unseen data.
The test results are interpreted and visualized to generate a diagnostic report.
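The evaluation indices named earlier (accuracy, precision and recall) can be computed from a confusion matrix, as in the following NumPy sketch. The function name and the dictionary layout are assumptions, class labels are assumed to be integers from 0 to n_classes - 1, and a class that is never predicted or never present is given a score of 0 by convention in this example.

```python
import numpy as np

def classification_report(y_true, y_pred, n_classes):
    """Accuracy plus per-class precision and recall from integer class labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)        # rows: true class, columns: predicted
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)                # how often each class was predicted
    actual = cm.sum(axis=1)                   # how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros(n_classes), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros(n_classes), where=actual > 0)
    return {"accuracy": tp.sum() / cm.sum(), "precision": precision,
            "recall": recall, "confusion": cm}
```

The confusion matrix itself is a natural starting point for the visualized diagnostic report mentioned above.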
The LSTM-based model can use time series data to diagnose the power equipment: the LSTM layer captures long-term dependencies in the time series data, and the fully connected output layer outputs the diagnosis results. By monitoring the loss and the evaluation indices, the model can be continuously optimized to improve its performance.
Example 2
The invention discloses a power plant power equipment expert diagnosis system based on historical data, which comprises:
data collection and preprocessing module: collecting historical data from the power equipment, performing quality inspection on the data, processing missing values, abnormal values and noise, adding a time stamp, and ensuring time sequence properties of the data;
a time sequence data analysis module: performing a time series analysis including seasonal, trend and periodicity analysis, identifying and modeling seasonal, trend and periodicity components in the data;
the feature engineering extraction module: extracting features based on results of the time series data analysis;
the diagnostic model building module: a diagnostic model is built to process the time series data, the model is trained using historical data, including marked fault samples and normal operation samples, and cross-validation and performance assessment of the model is performed to ensure accuracy and generalization capability of the model.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention and does not limit the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of the claims of the present invention.
Claims (6)
1. A power plant power equipment expert diagnosis method based on historical data, characterized by comprising the following steps:
step 1: collecting historical data from the power equipment, performing quality inspection on the data, handling missing values, outliers and noise, and adding time stamps to preserve the time-series nature of the data;
step 2: performing time-series analysis, including seasonal, trend and periodicity analysis, to identify and model the seasonal, trend and periodic components in the data;
step 3: extracting features based on the results of the time-series data analysis;
step 4: building a diagnostic model to process the time-series data, training the model using historical data that includes labeled fault samples and normal-operation samples, and performing cross-validation and performance evaluation to ensure the accuracy and generalization capability of the model.
2. The power plant power equipment expert diagnosis method based on historical data according to claim 1, wherein step 1 comprises:
data collection:
deploying sensors and data acquisition equipment on the power equipment to record indicators in real time; setting a data acquisition period to determine the sampling frequency of the data points;
performing quality inspection on the acquired data: identifying and handling missing values, detecting outliers, and reducing noise to smooth the data;
adding time stamp information to the data so that the data points are arranged in chronological order and form a time series;
classifying and storing the preprocessed data by equipment and indicator to facilitate subsequent analysis;
recording all data quality inspection and preprocessing steps, and storing both the original data and the processed data for verification and audit.
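The quality-inspection steps of claim 2 can be illustrated with a short sketch. The claim does not name specific techniques, so the choices here are assumptions: outliers are flagged with the modified z-score based on the median absolute deviation, gaps are forward-filled, and noise is reduced with a centered moving average. The function name `preprocess` and its parameters are hypothetical.

```python
from statistics import median


def preprocess(records, window=3, threshold=3.5):
    """records: list of (timestamp, value or None); returns a cleaned (timestamp, value) list."""
    # Arrange the data points in chronological order
    records = sorted(records, key=lambda r: r[0])
    times = [t for t, _ in records]
    values = [v for _, v in records]

    # Outlier detection: modified z-score using the median absolute deviation (MAD)
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)

    def is_outlier(v):
        return mad > 0 and abs(0.6745 * (v - med) / mad) > threshold

    # Treat outliers as missing so that they are filled in the next step
    values = [None if (v is not None and is_outlier(v)) else v for v in values]

    # Missing values: forward fill (leading gaps take the first valid value)
    last = next(v for v in values if v is not None)
    filled = []
    for v in values:
        if v is not None:
            last = v
        filled.append(last)

    # Noise reduction: centered moving average, truncated at the edges
    half = window // 2
    smoothed = []
    for k in range(len(filled)):
        chunk = filled[max(0, k - half): k + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return list(zip(times, smoothed))


raw = [(3, 10.2), (1, 10.0), (2, None), (4, 500.0), (5, 10.4)]
clean = preprocess(raw)
```

Keeping `raw` unchanged alongside `clean` mirrors the requirement to store both the original and the processed data for audit.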
3. The power plant power equipment expert diagnosis method based on historical data according to claim 2, wherein step 2 comprises:
seasonal analysis: used to identify periodic fluctuations in the data, comprising the steps of:
performing a stationarity test on the time-series data to ensure that the data are stationary over time;
calculating the autocorrelation function and partial autocorrelation function plots of the data;
determining the period of the seasonal component from the patterns of the autocorrelation function and the partial autocorrelation function;
trend analysis: used to identify long-term trends in the data and whether the data exhibit an increasing or decreasing trend, comprising the steps of:
smoothing the data using a moving average to reduce noise;
calculating a linear regression trend line for the smoothed data, wherein the slope of the regression line represents the direction and rate of the trend;
performing a statistical test on the trend to determine whether it is significant;
periodicity analysis: used to identify periodic components in the data, comprising the steps of:
converting the time-series data to the frequency domain using the Fourier transform;
identifying frequency components having significant amplitudes in the frequency domain;
quantifying the contribution of each frequency component by calculating Fourier coefficients.
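The three analyses of claim 3 can be sketched on a synthetic series. This is an illustration under stated assumptions, not the patented procedure: the seasonal period is taken as the lag with the largest sample autocorrelation, the trend is the least-squares slope of a moving average, and the periodic component is found with a direct discrete Fourier transform. The stationarity test, the partial autocorrelation function and the significance test on the trend are omitted, and all function names are hypothetical.

```python
import cmath
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def seasonal_period(x, max_lag):
    """Lag (>= 2) with the largest autocorrelation, a rough proxy for the seasonal period."""
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(x, lag))


def moving_average(x, window):
    return [sum(x[k:k + window]) / window for k in range(len(x) - window + 1)]


def trend_slope(x):
    """Least-squares slope of x against its index: sign gives direction, size gives rate."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    num = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def dominant_frequency(x):
    """Index k >= 1 of the largest-magnitude Fourier coefficient, and that magnitude."""
    n = len(x)
    mean = sum(x) / n
    best_k, best_amp = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return best_k, best_amp


# Synthetic indicator: linear drift plus a cycle of 12 samples
series = [0.05 * t + math.sin(2 * math.pi * t / 12) for t in range(120)]
slope = trend_slope(moving_average(series, 12))           # trend analysis
detrended = [v - slope * t for t, v in enumerate(series)]  # remove drift before the ACF
period = seasonal_period(detrended, max_lag=20)           # seasonal analysis
k, amplitude = dominant_frequency(detrended)              # periodicity analysis
```

The series is detrended before the autocorrelation is computed because a strong trend inflates the autocorrelation at short lags and can hide the seasonal peak, which is one reason the claim places the stationarity check first. Here `len(series) / k` recovers the same 12-sample cycle that the autocorrelation finds.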
4. The power plant power equipment expert diagnosis method based on historical data according to claim 3, wherein step 3 comprises:
sliding window features:
selecting a window size, expressed as a number of data points or in time units;
sliding the window from the start of the time-series data, advancing one time step at a time;
calculating statistical features within each window;
for periodic data, computing frequency-domain features via the Fourier transform to capture periodic components;
the formula is as follows:
$$X_i = \{x_t, x_{t-1}, \ldots, x_{t-w+1}\}$$
where x_t = x(t) denotes the value of the time series at time step t, w denotes the window size, and t denotes the time step;
time lag features:
expressed in the following manner:
for each observation x(t), the lag terms x(t-1), x(t-2), ..., x(t-k) are introduced, where k denotes the number of lagged time steps.
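The window and lag constructions of claim 4 translate directly into code. In the sketch below the function names and the particular statistics (mean, population standard deviation, minimum, maximum) are illustrative assumptions; the claim only requires "statistical features" per window.

```python
from statistics import mean, pstdev


def sliding_window_features(x, w):
    """Statistics of the window {x_t, x_{t-1}, ..., x_{t-w+1}} for each t >= w - 1."""
    rows = []
    for t in range(w - 1, len(x)):  # the window advances one time step at a time
        window = x[t - w + 1: t + 1]
        rows.append({
            "t": t,
            "mean": mean(window),
            "std": pstdev(window),
            "min": min(window),
            "max": max(window),
        })
    return rows


def lag_features(x, k):
    """For each t >= k, pair x(t) with its lag terms [x(t-1), ..., x(t-k)]."""
    return [(x[t], [x[t - j] for j in range(1, k + 1)]) for t in range(k, len(x))]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
window_rows = sliding_window_features(x, w=3)
lagged = lag_features(x, k=2)
```

The first `w - 1` points have no complete window and the first `k` points have no complete lag vector, so both feature sets are shorter than the input series.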
5. The power plant power equipment expert diagnosis method based on historical data according to claim 4, wherein step 4 comprises:
dividing the historical data processed by feature engineering into a training set, a validation set and a test set;
establishing an LSTM-based power equipment diagnosis model;
training the power equipment diagnosis model using the training set, with a loss function defined for training;
performing forward and backward propagation on the training data, and updating the model parameters to minimize the loss function.
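The training procedure of claim 5 can be sketched as a data split followed by a gradient-descent loop. To keep the example short, a single sigmoid output unit stands in for the full LSTM model, so the backward pass shown is that of the output layer only. The 70/15/15 split, the chronological ordering, the binary cross-entropy loss, the learning rate and all names are assumptions rather than details from the patent.

```python
import math


def chronological_split(samples, train_pct=70, val_pct=15):
    """Split time-ordered samples without shuffling, so later data never leaks into training."""
    n = len(samples)
    a = n * train_pct // 100
    b = n * (train_pct + val_pct) // 100
    return samples[:a], samples[a:b], samples[b:]


def predict(weights, bias, features):
    """Forward pass: weighted sum followed by a sigmoid."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(weights, bias, data):
    """Mean binary cross-entropy over (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for features, label in data:
        p = predict(weights, bias, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(data)


def train_step(weights, bias, data, lr=0.5):
    """One forward and backward pass over the data, then a parameter update."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for features, label in data:
        err = predict(weights, bias, features) - label  # gradient of the loss w.r.t. the logit
        for j, f in enumerate(features):
            grad_w[j] += err * f / len(data)
        grad_b += err / len(data)
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b


# Synthetic labeled samples in time order: label 1 = fault, label 0 = normal operation
samples = [([math.sin(t), float(t % 2)], t % 2) for t in range(20)]
train_set, val_set, test_set = chronological_split(samples)

weights, bias = [0.0, 0.0], 0.0
initial_loss = bce_loss(weights, bias, train_set)
for _ in range(200):
    weights, bias = train_step(weights, bias, train_set)
final_loss = bce_loss(weights, bias, train_set)
val_loss = bce_loss(weights, bias, val_set)  # monitored to detect overfitting
```

The validation loss is used for model selection during training, while the test set is held back for the final performance evaluation.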
6. A power plant power equipment expert diagnosis system based on historical data, comprising:
a data collection and preprocessing module: collects historical data from the power equipment, performs quality inspection on the data, handles missing values, outliers and noise, and adds time stamps to preserve the time-series nature of the data;
a time-series data analysis module: performs time-series analysis, including seasonal, trend and periodicity analysis, to identify and model the seasonal, trend and periodic components in the data;
a feature engineering extraction module: extracts features based on the results of the time-series data analysis;
a diagnostic model building module: builds a diagnostic model to process the time-series data, trains the model using historical data that includes labeled fault samples and normal-operation samples, and performs cross-validation and performance evaluation to ensure the accuracy and generalization capability of the model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311421206.7A CN117290685A (en) | 2023-10-30 | 2023-10-30 | Power plant power equipment expert diagnosis system and method based on historical data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311421206.7A CN117290685A (en) | 2023-10-30 | 2023-10-30 | Power plant power equipment expert diagnosis system and method based on historical data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117290685A (en) | 2023-12-26 |
Family
ID=89244517
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311421206.7A Pending CN117290685A (en) | 2023-10-30 | 2023-10-30 | Power plant power equipment expert diagnosis system and method based on historical data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117290685A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117708550A (en) * | 2024-02-05 | 2024-03-15 | 国网山东省电力公司电力科学研究院 | Automatic data analysis and model construction method for electric power big data |
- 2023-10-30 CN CN202311421206.7A patent/CN117290685A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110610035B (en) | Rolling bearing residual life prediction method based on GRU neural network | |
JP6740247B2 (en) | Anomaly detection system, anomaly detection method, anomaly detection program and learned model generation method | |
Soualhi et al. | Prognosis of bearing failures using hidden Markov models and the adaptive neuro-fuzzy inference system | |
CN116757534B (en) | Intelligent refrigerator reliability analysis method based on neural training network | |
CN104504296A (en) | Gaussian mixture hidden Markov model and regression analysis remaining life prediction method | |
CN105593864B (en) | Analytical device degradation for maintenance device | |
CN109376401B (en) | Self-adaptive multi-source information optimization and fusion mechanical residual life prediction method | |
US20040199362A1 (en) | Intelligent modelling of process and tool health | |
CN111310981B (en) | Reservoir water level trend prediction method based on time series | |
KR20140041766A (en) | Method of sequential kernel regression modeling for forecasting and prognostics | |
JP2014525096A (en) | Monitoring method using kernel regression modeling with pattern sequence | |
JP2014525097A (en) | A system of sequential kernel regression modeling for forecasting and forecasting | |
CN114297036B (en) | Data processing method, device, electronic equipment and readable storage medium | |
CN111337244B (en) | Method and device for monitoring and diagnosing faults of input shaft of fan gearbox | |
CN117290685A (en) | Power plant power equipment expert diagnosis system and method based on historical data | |
CN115293326A (en) | Training method and device of power load prediction model and power load prediction method | |
CN116380445B (en) | Equipment state diagnosis method and related device based on vibration waveform | |
CN113947017A (en) | Method for predicting residual service life of rolling bearing | |
CN116597939A (en) | Medicine quality control management analysis system and method based on big data | |
Melendez et al. | Self-supervised Multi-stage Estimation of Remaining Useful Life for Electric Drive Units | |
Matania et al. | Transfer across different machines by transfer function estimation | |
CN110532698B (en) | Industrial equipment vibration characteristic value trend prediction method based on data model | |
Wang et al. | Degradation pattern identification and remaining useful life prediction for mechanical equipment using SKF-EN | |
Farahat et al. | Similarity-based feature extraction from vibration data for prognostics | |
CN112307918A (en) | Diagnosis method for transformer direct-current magnetic biasing based on fuzzy neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||