CN113987033A - Main transformer online monitoring data group deviation identification and calibration method - Google Patents

Main transformer online monitoring data group deviation identification and calibration method

Publication number
CN113987033A
Authority
CN
China
Prior art keywords
data
deviation
association
node
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111615216.5A
Other languages
Chinese (zh)
Other versions
CN113987033B (en)
Inventor
童超
辛建波
康琛
周梦垚
许勇
朱自伟
刘凤龙
李阳林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
Nanchang University
Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Original Assignee
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
Nanchang University
Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Jiangxi Electric Power Co ltd, State Grid Corp of China SGCC, Nanchang University, Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Priority to CN202111615216.5A
Publication of CN113987033A
Application granted
Publication of CN113987033B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Strategic Management (AREA)
  • Fuzzy Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Biology (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Physiology (AREA)
  • Game Theory and Decision Science (AREA)

Abstract

The invention discloses a method for identifying and calibrating group deviation in main transformer online monitoring data. The method comprises: collecting data; performing piecewise linearization on online monitoring data that has been checked against offline data, and extracting line segment curve features and data group features; constructing a segmented association mining model, symbolizing the set of line segments characterized by curve features, mining the associations among different indexes of the online monitoring data with the Apriori algorithm, and finding abnormal values; obtaining the change in support while taking time-interval characteristics into account, and identifying data deviation; constructing a multi-index prediction model from the strongly associated index sequences to complete deviation calibration; and recalculating the association among the indexes with the calibrated data to verify the reliability of the group deviation calibration. The method identifies data group deviation by mining association rules among different index sequences, and calibrates it with a multi-index prediction model built on a BP neural network optimized by an improved genetic algorithm.

Description

Main transformer online monitoring data group deviation identification and calibration method
Technical Field
The invention relates to a method for identifying and calibrating group deviation in main transformer online monitoring data, and belongs to the field of data cleaning for online monitoring of power equipment.
Background
The transformer is a pivotal device for electric energy conversion and transmission, and its safe and stable operation is an important guarantee of power supply quality for users. Dissolved gas analysis (DGA) online monitoring data provides real-time monitoring of the insulation performance of the equipment and is an important basis for transformer state evaluation. Monitoring data over a period of time, including its overall trend, its extreme points and jump points, and its statistical characteristics, can reflect possible abnormal conditions inside the transformer from multiple aspects. At the same time, the online monitoring device is itself part of the transformer; its health state is tied to the overall health of the transformer and affects the validity and rationality of the online monitoring data.
As running time accumulates, DGA online monitoring devices expose more and more problems in operation. There are two main causes. On the one hand, product quality problems from different manufacturers or management problems at the power supply office frequently cause software and hardware faults, which show up mainly as individual abnormal data points or disconnected signals. On the other hand, as the sampling sensors of the device age with running time, the data read by the sensor for a given index carries a persistent deviation, so that the index data as a whole deviates from the actual value. Many cleaning methods for abnormal points in online monitoring data exist at home and abroad, but an effective means of analyzing data group deviation is lacking. To avoid misjudgment caused by an overall offset of the data, power supply companies currently rely mainly on operation and maintenance units to manually calibrate each section of data on site using standard transformer oil, or to delete the offset data. This approach is inefficient, time-consuming, labor-intensive and expensive, and it greatly reduces the real-time performance of the online monitoring device (OMDS) of the power equipment. In many areas online monitoring data is often misreported because operation and maintenance is not timely; to avoid false alarms, many units simply disconnect the signal or turn off early warning, so the transformer online monitoring device cannot perform its monitoring function and early warning of latent transformer faults is neither timely nor usable.
The present method constructs a box plot of the group features of a data section, analyzes the overall features of the sequence data, and mines segmented association rules to obtain how the association between data changes over time; the overall deviation of the data can then be judged effectively from this change in association.
After long-term operation of the power equipment, large-scale online monitoring data is stored in the power database. The group samples mined here are all data that has been checked offline: association analysis is performed on the offline-checked index data to mine the association relations within the group, the differently deviating data is analyzed on the basis of these relations, and the deviations are calibrated.
Disclosure of Invention
The invention aims to provide a method for identifying and calibrating group deviation in main transformer online monitoring data, so as to solve the problems described in the Background.
The invention is realized by the following technical scheme. The method for identifying and calibrating group deviation in main transformer online monitoring data comprises the following specific steps:
S1, data collection: online monitoring data plays an important role in representing the real-time state of transformer equipment and is an important basis for its state evaluation and fault identification; the online monitoring data of the equipment and the related transformer fault information are collected to form a complete data system;
S2, performing piecewise linearization on the online monitoring data and extracting the line segment curve features;
S3, constructing a segmented association mining model, symbolizing the set of line segments characterized by curve features, mining the associations among different DGA indexes with the Apriori algorithm, and finding abnormal values: box plot features of the line segment group features are extracted and the overall features of the sequence data are analyzed through the box plots; a line segment group similarity model is constructed; the line segments are classified with an improved K-means clustering algorithm based on the maximum and minimum distances, with the number of classes set to 6 according to the overall classification of curve features; line segments of the same class are given the same symbol, which completes the symbolization of the online monitoring data; following the idea of the Apriori algorithm, a minimum confidence and a minimum support are set, the frequent item sets existing among different sequences are mined, the association among different sequences is quantified, and the index-level data support and confidence are obtained;
S4, obtaining the change in support while taking time-interval characteristics into account, and identifying data deviation: abnormal deviation at the end of the data's time span is judged from the change in association strength of the online monitoring data over time, which completes the tracking of the data deviation;
S5, constructing, from the strongly associated index data, a multi-index prediction model of the BP neural network algorithm optimized by the genetic simulated annealing algorithm, and completing the calibration of the deviation data;
S6, mining association rules again for the two sequence data segments using the calibrated data, judging whether the association degree of the calibrated deviation data segment meets the association degree threshold, and checking the reliability of the deviation calibration.
Specifically, in step S2, consecutive data points that can be fitted linearly are treated, in time order, as one set. A fitting error threshold is set for the data points, and points stop being added to the set once the error reaches the threshold. A fitting error threshold of 0.8 is selected, and each fitted line segment is characterized by its slope, constant term, span, fitting error, the change in mean value before and after the segment, and similar features.
Specifically, in step S4, association rule mining is performed on the data segments respectively, so as to obtain the association degree variation of two sequence data in time sequence, thereby implementing data association degree tracking; and identifying the data deviation according to the change of the correlation degree of the two sequence data.
Specifically, the main process of step S4 is:
S41, based on the segmented association mining model in step S3, the frequent item sets between the two types of data are obtained as association rules; the index-level data association degree is calculated comprehensively from the support and confidence of the association rules;
S42, the data sequence is divided into a plurality of data segments and association rules are mined for each segment separately, giving the change in association degree of the two data sequences over time and thereby tracking the data association degree; an index-level minimum support threshold [Figure 100002_DEST_PATH_IMAGE001] and minimum confidence threshold [Figure 100002_DEST_PATH_IMAGE002] are defined, and when the data deviates from the minimum confidence threshold [Figure 787071DEST_PATH_IMAGE002], the data is considered to have lost its original association.
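The tracking idea in S42 can be illustrated with a minimal Python sketch. It assumes that the two index sequences have already been symbolized and aligned segment by segment, and it takes the association degree of a window to be the fraction of symbol pairs that match a reference rule set. The source does not give the exact formula for the association degree, so that definition, the function name `track_association` and the parameters `window` and `min_degree` are all illustrative assumptions.

```python
def track_association(seq_a, seq_b, reference_rules, window, min_degree):
    """Return per-window association degrees and the start indexes of flagged windows.

    Assumption: the association degree of a window is the share of aligned
    symbol pairs (a, b) that appear in reference_rules.
    """
    degrees, flagged = [], []
    for start in range(0, len(seq_a) - window + 1, window):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        degree = sum(pair in reference_rules for pair in pairs) / window
        degrees.append(degree)
        if degree < min_degree:
            flagged.append(start)  # association has dropped: candidate deviation
    return degrees, flagged
```

A window whose degree falls below `min_degree` is only a candidate for group deviation under these assumptions; the threshold itself would come from the offline-checked reference data.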
Specifically, in step S5, a multi-index prediction model of the BP neural network algorithm optimized by the genetic simulated annealing algorithm is constructed from the sequence data covered by strong association rules, and the deviation data is calibrated with the values predicted by this model. The specific process of step S5 is as follows:
S51, based on the segmented association mining result of step S3, the multi-index prediction model of the BP neural network algorithm optimized by the genetic simulated annealing algorithm is constructed from the strong association rule sequence data; the sequence data strongly associated with the deviation data sequence is taken as the input layer of the neural network and the deviation data as the output layer;
S52, the network node connection weights are corrected by the genetic simulated annealing algorithm and the optimal weights are substituted into the network; normal data of the standard oil sample is obtained, the neural network is trained and tested, and the deviation data is predicted, which completes the calibration of the deviation data.
Specifically, the BP neural network algorithm optimized by the genetic simulated annealing algorithm proceeds as follows. The numbers of nodes in the input layer, hidden layer and output layer of the neural network are determined, and the relevant parameters are initialized to obtain the network node codes. Sample data is input and normalized, and then, together with the network node codes, undergoes the selection, crossover, mutation, annealing and new-individual fitness calculation operations of the genetic simulated annealing algorithm. Whether the fitness requirement is met is then judged: if not, the procedure returns to the crossover step; if so, the optimal node weights are obtained. The sample data is then trained with the BP neural network algorithm and the mean square error of the training result is calculated. If the error is outside the allowable range, the sample data is retrained; if it is within the range, the optimal prediction network is determined.
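The flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the network has one hidden tanh layer and a single output, the genetic operators (binary tournament selection, convex-combination crossover, Gaussian mutation), the Metropolis acceptance rule, the cooling schedule and every parameter value are choices made here for illustration, and the names `gsa_search` and `bp_train` are hypothetical.

```python
import numpy as np

def unpack(w, n_in, n_hidden):
    """Split the flat weight vector into layer weights and biases."""
    k = n_in * n_hidden
    return w[:k].reshape(n_in, n_hidden), w[k:k + n_hidden], w[k + n_hidden:-1], w[-1]

def mse(w, x, y, n_hidden):
    """Mean square error of the one-hidden-layer network on (x, y)."""
    w1, b1, w2, b2 = unpack(w, x.shape[1], n_hidden)
    return float(np.mean((np.tanh(x @ w1 + b1) @ w2 + b2 - y) ** 2))

def gsa_search(x, y, n_hidden=4, pop_size=20, generations=40, t0=1.0, cooling=0.9, seed=0):
    """Genetic search with simulated-annealing acceptance over the network weights."""
    rng = np.random.default_rng(seed)
    dim = x.shape[1] * n_hidden + 2 * n_hidden + 1
    pop = rng.normal(0.0, 1.0, size=(pop_size, dim))
    cost = np.array([mse(w, x, y, n_hidden) for w in pop])
    history, temperature = [float(cost.min())], t0
    for _ in range(generations):
        elite, elite_cost = pop[np.argmin(cost)].copy(), float(cost.min())
        for i in range(pop_size):
            # Selection: two binary tournaments pick the parents.
            a, b = rng.integers(pop_size, size=2), rng.integers(pop_size, size=2)
            p1, p2 = pop[a[np.argmin(cost[a])]], pop[b[np.argmin(cost[b])]]
            # Crossover: random convex combination of the parents.
            alpha = rng.random()
            child = alpha * p1 + (1.0 - alpha) * p2
            # Mutation: perturb roughly 20% of the genes.
            child = child + rng.normal(0.0, 0.1, dim) * (rng.random(dim) < 0.2)
            child_cost = mse(child, x, y, n_hidden)
            # Annealing: the Metropolis rule decides whether the child replaces individual i.
            delta = child_cost - cost[i]
            if delta < 0 or rng.random() < np.exp(-delta / temperature):
                pop[i], cost[i] = child, child_cost
        # Elitism: restore the previous best individual if it was lost.
        if elite_cost < cost.min():
            worst = np.argmax(cost)
            pop[worst], cost[worst] = elite, elite_cost
        temperature *= cooling
        history.append(float(cost.min()))
    return pop[np.argmin(cost)].copy(), history

def bp_train(w, x, y, n_hidden, lr=0.05, epochs=200):
    """Gradient-descent (BP) fine-tuning; returns the best weights seen during training."""
    w = w.copy()
    best_w, best_cost = w.copy(), mse(w, x, y, n_hidden)
    for _ in range(epochs):
        w1, b1, w2, b2 = unpack(w, x.shape[1], n_hidden)
        h = np.tanh(x @ w1 + b1)
        err = h @ w2 + b2 - y
        grad_pre = np.outer(err, w2) * (1.0 - h ** 2)  # gradient at hidden pre-activations
        grad = np.concatenate([(x.T @ grad_pre).ravel(), grad_pre.sum(axis=0), h.T @ err, [err.sum()]])
        w = w - lr * 2.0 * grad / len(y)
        cost = mse(w, x, y, n_hidden)
        if cost < best_cost:
            best_w, best_cost = w.copy(), cost
    return best_w
```

In this sketch the strongly associated index sequences would form the columns of `x` (normalized beforehand) and the index to be calibrated would be `y`; `gsa_search` supplies the starting weights and `bp_train` refines them.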
Specifically, the process of step S6 is:
S61, the segmented association rules of the two sequences that have strong association rules are mined again using the calibrated data, giving the association rules of the two calibrated data sequences;
S62, the association degree of the deviation parts of the two calibrated data sequences is compared with an association degree threshold to check the reliability of the deviation calibration;
S63, the association degree threshold is set as follows: the overall association rules of the two normal data sequences are mined to obtain their overall association degree, and the association degree threshold is set as limit_rev.
The method collects the online monitoring data of transformer equipment and detects data deviation by mining and analyzing the linear coupling relation between the data through their association degree. First, the data is piecewise linearized and the curve features are analyzed as indexes; the overall features of the sequence data are analyzed by constructing a box plot of the group features of a data section. The segmented online monitoring data is then symbolized with improved K-means clustering, and the association between different indexes is mined with the Apriori algorithm to discover abnormal data. Finally, a BP neural network multi-index prediction model improved by the genetic simulated annealing algorithm is constructed to predict the deviation data and thereby calibrate it.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 shows the online data of the hydrogen index.
FIG. 3 shows the online data of the methane index.
FIG. 4 is a comparison of the hydrogen index before and after fitting.
FIG. 5 is a comparison of the methane index before and after fitting.
FIG. 6 is a flowchart of the BP neural network algorithm optimized by the genetic simulated annealing algorithm.
FIG. 7 is a graph comparing the deviation data with the raw data.
FIG. 8 is a graph of the deviation data prediction.
FIG. 9 is a histogram of the group features of the hydrogen data with deviation.
FIG. 10 is a box plot of the group features of the normal hydrogen data.
FIG. 11 is a box plot of the group features of the normal methane data.
In the figures, IQR denotes the interquartile range.
Detailed Description
The invention is explained in more detail below with reference to the figures and examples.
Referring to FIG. 1, a method for identifying and calibrating group deviation in main transformer online monitoring data comprises the following specific steps:
S1, data collection: online monitoring data plays an important role in representing the real-time state of transformer equipment and is an important basis for its state evaluation and fault identification; the online monitoring data of the equipment and the related transformer fault information are collected to form a complete data system;
S2, performing piecewise linearization on the online monitoring data and extracting the line segment curve features: consecutive data points that can be fitted linearly are treated, in time order, as one set; a fitting error threshold is set for the data points, and points stop being added once the error reaches the threshold; a fitting error threshold of 0.8 is selected, and each fitted line segment is characterized by its slope, constant term, span, fitting error, the change in mean value before and after the segment, and similar features;
S3, constructing a segmented association mining model, symbolizing the set of line segments characterized by curve features, mining the associations among different DGA indexes with the Apriori algorithm, and finding abnormal values: box plot features of the line segment group features are extracted and the overall features of the sequence data are analyzed through the box plots; a line segment group similarity model is constructed; the line segments are classified with an improved K-means clustering algorithm based on the maximum and minimum distances, with the number of classes set to 6 according to the overall classification of curve features; line segments of the same class are given the same symbol, which completes the symbolization of the online monitoring data; following the idea of the Apriori algorithm, a minimum confidence and a minimum support are set, the frequent item sets existing among different sequences are mined, the association among different sequences is quantified, and the index-level data support and confidence are obtained; the line segment group similarity model is insensitive to single deviating data points and sensitive to an overall offset of the data;
S4, obtaining the change in support while taking time-interval characteristics into account, and identifying data deviation: abnormal deviation at the end of the data's time span is judged from the change in association strength of the online monitoring data over time, which completes the tracking of the data deviation; data deviation is identified from the change in the association degree of the two data sequences;
S5, constructing, from the strongly associated index data (the sequence data covered by strong association rules), a multi-index prediction model of the BP neural network algorithm optimized by the genetic simulated annealing algorithm; the deviation data is predicted with this model and the prediction is used to calibrate it;
S6, mining association rules again for the two sequence data segments using the calibrated data, judging whether the association degree of the calibrated deviation data segment meets the association degree threshold, and checking the reliability of the deviation calibration.
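The support and confidence mining named in step S3 can be illustrated with a short Python sketch restricted to rules between one symbol of each sequence. It assumes the two index sequences have been symbolized and aligned so that position i in both refers to the same time window; the function name `mine_pair_rules` and the default thresholds are hypothetical. Following the Apriori idea, a pair is only considered when both of its single symbols are already frequent.

```python
from collections import Counter

def mine_pair_rules(seq_a, seq_b, min_support=0.2, min_confidence=0.6):
    """Return {(a, b): (support, confidence)} for rules a -> b between two symbol sequences."""
    n = len(seq_a)
    count_a, count_b = Counter(seq_a), Counter(seq_b)
    pair_count = Counter(zip(seq_a, seq_b))
    # Apriori pruning: only symbols that are frequent on their own can form frequent pairs.
    frequent_a = {s for s, c in count_a.items() if c / n >= min_support}
    frequent_b = {s for s, c in count_b.items() if c / n >= min_support}
    rules = {}
    for (a, b), c in pair_count.items():
        if a not in frequent_a or b not in frequent_b:
            continue
        support = c / n               # share of windows where a and b co-occur
        confidence = c / count_a[a]   # share of windows with a that also show b
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules
```

For example, with hydrogen symbols `AABBAC` and methane symbols `XXYYXZ`, the rules A -> X and B -> Y pass the default thresholds, while C -> Z is pruned because C is not frequent.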
In step S1, with reference to the Chinese power industry standard guide for state evaluation of oil-immersed transformers (reactors), and combining actual production experience of enterprises, existing fault and defect records and research results from the literature, the DGA online monitoring indexes selected are hydrogen, methane, ethane, ethylene, acetylene and micro-water.
The specific steps for the piecewise linearization of the online monitoring data in step S2 are:
S21. Online monitoring data for device indexes such as DGA is essentially a set of state index values acquired one by one at a fixed time interval. The data therefore has a strong temporal character and can be treated as time series data.
S22. For a time series [Figure 100002_DEST_PATH_IMAGE003], a window of length [Figure 100002_DEST_PATH_IMAGE004] is used, and the data points intercepted by the window are fitted piecewise linearly, following the sliding-window idea.
S23. The first data point in the window is taken as the fitting start point of the initial line segment and denoted [Figure 100002_DEST_PATH_IMAGE005]. Assuming that the fitting end point of the initial line segment is [Figure 100002_DEST_PATH_IMAGE006], these [Figure 100002_DEST_PATH_IMAGE007] data points are fitted to a line segment.
S24. Such a line segment can be expressed by the following equations:
[Figure 100002_DEST_PATH_IMAGE008] (1)
[Figure 100002_DEST_PATH_IMAGE009] (2)
where [Figure 100002_DEST_PATH_IMAGE010] represents the sample value at time [Figure 100002_DEST_PATH_IMAGE011] in the time series, [Figure 100002_DEST_PATH_IMAGE012] represents the number of numerical points contained in the fitted line segment, and [Figure 100002_DEST_PATH_IMAGE013] represents the dependent variable of the fitted line segment equation.
The distance from the actual data points to the fitted line segment is used as the fitting error, which improves the accuracy with which the fitted segment follows the actual values. The distances [Figure 100002_DEST_PATH_IMAGE014] from all actual data points within the step length of the fitted line segment to the segment are calculated, and the sum of these error values is taken as the overall fitting error [Figure 100002_DEST_PATH_IMAGE015] of the segment:
[Figure 100002_DEST_PATH_IMAGE016] (3)
[Figure 100002_DEST_PATH_IMAGE017] (4)
where $t_n$ represents a time step. A fitting error threshold [Figure 100002_DEST_PATH_IMAGE018] is set. If [Figure 100002_DEST_PATH_IMAGE019], the segment is still short and fitting points can continue to be added: let [Figure 100002_DEST_PATH_IMAGE020] and repeat the above steps. If [Figure 100002_DEST_PATH_IMAGE021], let [Figure 100002_DEST_PATH_IMAGE022] and take that point as the fitting end point to generate a line segment. If [Figure 100002_DEST_PATH_IMAGE023], the points are judged unable to be fitted into the segment; the fitting end point of the current segment is saved as [Figure 100002_DEST_PATH_IMAGE024], the data sampling time is recorded, and the procedure returns to step S23 to reset the parameter [Figure 100002_DEST_PATH_IMAGE025]. The current fitting end point is then taken as the fitting start point of the next line segment and the next part of the data is fitted, until all data points in the sequence have been fitted.
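Steps S22 to S24 can be sketched in Python as follows. The formula images for equations (1) to (4) are not available here, so this sketch makes its own assumptions: the line is fitted by ordinary least squares, the fitting error is the sum of perpendicular point-to-line distances, and a segment simply stops growing when adding the next point would push the error above the threshold (the three-way comparison in the text is collapsed into one test). The names `fit_error` and `piecewise_linearize` are hypothetical; the default threshold of 0.8 follows the value quoted in step S2.

```python
import numpy as np

def fit_error(t, y, k, b):
    """Sum of perpendicular distances from the points (t, y) to the line y = k*t + b."""
    return float(np.sum(np.abs(k * t - y + b) / np.hypot(k, 1.0)))

def piecewise_linearize(values, threshold=0.8):
    """Split a series into line segments; returns (start, end, slope, intercept) tuples."""
    y = np.asarray(values, dtype=float)
    t = np.arange(len(y), dtype=float)
    segments, start = [], 0
    while start < len(y) - 1:
        end = start + 1  # a segment needs at least two points
        k, b = np.polyfit(t[start:end + 1], y[start:end + 1], 1)
        while end + 1 < len(y):
            ts, ys = t[start:end + 2], y[start:end + 2]
            k_new, b_new = np.polyfit(ts, ys, 1)
            if fit_error(ts, ys, k_new, b_new) > threshold:
                break  # the next point would exceed the fitting error threshold
            end, k, b = end + 1, k_new, b_new
        segments.append((start, end, float(k), float(b)))
        start = end  # the fitting end point becomes the next fitting start point
    return segments
```

On the series `[0, 1, 2, 3, 4, 3, 2, 1, 0]` this sketch yields two segments, one rising over indexes 0 to 4 and one falling over indexes 4 to 8.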
Let the slope of the fitted line segment be [Figure 100002_DEST_PATH_IMAGE026] and the number of fitted numerical points in the segment be [Figure 100002_DEST_PATH_IMAGE027]; the actual growth rate of the data fitted by the segment can then be expressed as:
[Figure 100002_DEST_PATH_IMAGE028]
The slope growth rate of each line segment relative to the previous line segment can be expressed as:
[Figure DEST_PATH_IMAGE029]
The first quartile of the box plot of the m numerical points of each line segment is denoted [Figure DEST_PATH_IMAGE030], and the third quartile is denoted [Figure DEST_PATH_IMAGE031]. With [Figure DEST_PATH_IMAGE032] [Figure DEST_PATH_IMAGE033] as the six elements of a line segment, the six-element tuple [Figure DEST_PATH_IMAGE034] represents a fitted line segment.
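A six-element description of one fitted segment can be sketched in Python as below. The source formulas for the growth rate and the slope growth rate are not available here, so two stand-ins are assumed and should be read as illustrative only: the growth rate is taken as slope times span, and the slope change rate as the relative change from the previous segment's slope. The function name `segment_features` is hypothetical.

```python
import numpy as np

def segment_features(t, y, prev_slope=None):
    """Return (slope, span, growth, q1, q3, slope_change) for one segment."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, _intercept = np.polyfit(t, y, 1)
    span = t[-1] - t[0]
    growth = slope * span                   # assumed form of the growth rate
    q1, q3 = np.percentile(y, [25, 75])     # box plot quartiles of the segment's points
    if prev_slope is None or prev_slope == 0:
        slope_change = 0.0                  # no previous segment to compare against
    else:
        slope_change = (slope - prev_slope) / abs(prev_slope)  # assumed form
    return (float(slope), float(span), float(growth), float(q1), float(q3), float(slope_change))
```

The quartiles use NumPy's default linear interpolation; a different quartile convention would shift `q1` and `q3` slightly for short segments.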
The main steps of constructing the segmented association mining model in step S3 are as follows:
S31. To remove order-of-magnitude differences between the indexes in the online monitoring data, the attributes of all line segments in the same sequence are first normalized in the form [Figure DEST_PATH_IMAGE035], which maps the values to the (0, 1) interval.
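The normalization formula itself is not available here. A common choice that maps one attribute into the unit interval is min-max scaling, sketched below in Python as an assumption rather than as the form used in the source; the function name `min_max_normalize` is hypothetical. Note that min-max scaling sends the smallest and largest values to exactly 0 and 1.

```python
import numpy as np

def min_max_normalize(values):
    """Scale one attribute (for example all segment slopes of a sequence) into [0, 1]."""
    v = np.asarray(values, dtype=float)
    lo, hi = v.min(), v.max()
    if hi == lo:
        return np.zeros_like(v)  # constant attribute: no spread to rescale
    return (v - lo) / (hi - lo)
```

Each of the six segment attributes would be normalized separately within a sequence, so that no attribute dominates the similarity measure through its units alone.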
S32, in clustering analysis, a standard for measuring the line segment category characteristics needs to be established; the online monitoring data reflects real-time state information quantity of equipment, and the change trend and the form of parameters can reflect the change of the running state of the equipment most, so that when the characteristic of measuring line segment types is established, different considerations are needed for different attributes of line segments. The rate of change of slope describes the degree of fluctuation of the sequence data; meanwhile, the online monitoring data of the transformer has more noise data, the first quantile, the third quantile and the median of the box line graph can roughly reflect the dispersion degree of the data, and the influence of abnormal noise data is well eliminated.
The established line segment group similarity model is shown as the following formula:
Figure DEST_PATH_IMAGE036
(5)
where
Figure DEST_PATH_IMAGE037
denotes the similarity of the line segment groups, and
Figure DEST_PATH_IMAGE038
respectively denote the weights of the slope, span, growth rate, first quartile, third quartile, and slope change rate in the line segment group similarity model.
S33. Based on the line segment group similarity model, the line segment set is clustered using an improved K-means algorithm based on the maximum and minimum distances, so that similar line segments are placed in the same category.
S34. The maximum and minimum distance method is also based on Euclidean distance; it differs from the standard K-means algorithm in that objects as far apart as possible are taken as cluster centers. For a sample set, a scaling factor is given
Figure DEST_PATH_IMAGE039
Arbitrarily take a sample set
Figure DEST_PATH_IMAGE040
as the initial cluster center, denoted
Figure DEST_PATH_IMAGE041
Among the remaining
Figure DEST_PATH_IMAGE042
samples, the one farthest from
Figure 459752DEST_PATH_IMAGE041
is taken as the second cluster center, denoted
Figure DEST_PATH_IMAGE043
Calculate the distances between the remaining
Figure DEST_PATH_IMAGE044
samples and
Figure 41912DEST_PATH_IMAGE041
and
Figure 387443DEST_PATH_IMAGE043
and find the minimum of the two, namely:
Figure DEST_PATH_IMAGE045
(6)
Figure DEST_PATH_IMAGE046
(7)
where D_ij denotes the Euclidean distance between samples i and j, and D_i denotes the smaller of the distances from sample i to the cluster centers
Figure 147589DEST_PATH_IMAGE041
and
Figure 70414DEST_PATH_IMAGE043
; x_i denotes sample i, and z_j denotes the j-th cluster center.
If
Figure DEST_PATH_IMAGE047
(8)
where z_i denotes the i-th cluster center,
Figure DEST_PATH_IMAGE048
and the scaling factor is
Figure DEST_PATH_IMAGE049
then the corresponding sample
Figure DEST_PATH_IMAGE050
is selected as the third cluster center,
Figure DEST_PATH_IMAGE051
Suppose there are
Figure DEST_PATH_IMAGE052
cluster centers; calculate the distances from the remaining
Figure DEST_PATH_IMAGE053
samples to the cluster centers,
Figure DEST_PATH_IMAGE054
and if the following holds:
Figure DEST_PATH_IMAGE055
(9)
where
Figure DEST_PATH_IMAGE056
denotes the Euclidean distance between samples i and k, then the corresponding sample
Figure DEST_PATH_IMAGE057
is taken as the
Figure DEST_PATH_IMAGE058
-th cluster center, denoted
Figure DEST_PATH_IMAGE059
This process is repeated until no new cluster centers appear.
When no new cluster center is present, the samples are assigned to each class according to the minimum distance principle. The improved K-means clustering algorithm based on the maximum and minimum distances has the advantages that the clustering centers are consistent during each clustering analysis, the randomness of selecting the clustering centers by the traditional K-means algorithm is eliminated, and the accuracy and the speed of the clustering analysis can be effectively improved.
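The center selection and minimum-distance assignment described in S34 can be sketched as follows. This is an illustrative reading of the procedure, not the patented implementation: the function names, the default scaling factor `theta=0.5`, and the choice of the first sample as the initial center are assumptions, and plain Euclidean distance is used in place of the weighted similarity model of formula (5).

```python
import math


def max_min_centers(samples, theta=0.5, first=0):
    """Return indices of cluster centers chosen by the max-min distance rule."""
    centers = [first]
    # nearest[i] is the distance from sample i to its closest chosen center.
    nearest = [math.dist(s, samples[first]) for s in samples]
    ref = None  # distance between the first two centers
    while True:
        cand = max(range(len(samples)), key=nearest.__getitem__)
        if ref is None:
            if nearest[cand] == 0.0:  # all samples coincide, one center only
                break
            ref = nearest[cand]
        elif nearest[cand] <= theta * ref:
            break  # no sample is far enough away to become a new center
        centers.append(cand)
        nearest = [min(d, math.dist(s, samples[cand]))
                   for d, s in zip(nearest, samples)]
    return centers


def assign(samples, centers):
    """Assign every sample to its closest center (minimum distance principle)."""
    return [min(range(len(centers)),
                key=lambda c: math.dist(s, samples[centers[c]]))
            for s in samples]


samples = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0), (5.0, 8.0)]
centers = max_min_centers(samples, theta=0.5)
labels = assign(samples, centers)
```

Because the centers depend only on the data, the first sample, and the scaling factor, repeated runs give the same centers, which is the property the text contrasts with randomly initialized K-means.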
S35. Set the minimum support and minimum confidence parameters. The confidence and support thresholds are the basis for judging sequence association and frequent item sets, and appropriate threshold parameters help strengthen the reliability of the association relation. The minimum support thresholds of frequent item set 1 and frequent item set 2 are denoted
Figure DEST_PATH_IMAGE060
and
Figure DEST_PATH_IMAGE061
and the minimum confidence threshold in sequence association mining is
Figure DEST_PATH_IMAGE062
S36. Generation of frequent item sets: the two symbolized sequences are combined into a transaction set, denoted
Figure DEST_PATH_IMAGE063
where
Figure DEST_PATH_IMAGE064
Here nA and nB are the numbers of line segments into which sequence A and sequence B are divided, respectively;
Figure DEST_PATH_IMAGE065
is a symbol element of the symbolized sequence
Figure DEST_PATH_IMAGE066
and
Figure DEST_PATH_IMAGE067
is a symbol element of the other symbolized sequence
Figure DEST_PATH_IMAGE068
All symbol categories corresponding to the two sequences are:
Figure DEST_PATH_IMAGE069
and
Figure DEST_PATH_IMAGE070
where mA and mB are the numbers of symbol types in the two symbol sets A and B, respectively. The frequent item sets of the sequences are obtained by scanning the transaction set in two passes, based on the Apriori algorithm. The confidence for each symbol in the sequence is calculated according to equation (10):
Figure DEST_PATH_IMAGE071
(10)
where
Figure DEST_PATH_IMAGE072
and
Figure DEST_PATH_IMAGE073
denote the two index objects whose association rules are to be mined, and
Figure DEST_PATH_IMAGE074
denotes the number of transactions, i.e., the number of elements in the sequence. The support represents the proportion of the transaction set occupied by an item. When frequent item sets are explored, items whose support is greater than the frequent item set 1 threshold
Figure DEST_PATH_IMAGE075
are placed into the collection of frequent item set 1.
In association mining, the frequent item set 1 collections of the two sequences are respectively
Figure DEST_PATH_IMAGE076
Figure DEST_PATH_IMAGE077
Items from the two collections are paired across the index parameters to form candidates of the form
Figure DEST_PATH_IMAGE078
for frequent item set 2. The support of each candidate is calculated, and candidates whose support is greater than
Figure DEST_PATH_IMAGE079
are placed into frequent item set 2, denoted
Figure DEST_PATH_IMAGE080
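The two-pass generation of frequent item set 1 and frequent item set 2 in S36 can be illustrated with a minimal sketch. It assumes that the two symbolized sequences have already been aligned position by position into transactions (how the patent aligns sequences with different segment counts nA and nB is not recoverable from the text), and the names `frequent_item_sets`, `min_sup1`, and `min_sup2` are hypothetical.

```python
from collections import Counter


def frequent_item_sets(seq_a, seq_b, min_sup1, min_sup2):
    """Apriori-style two-pass scan over paired symbols of two sequences."""
    n = min(len(seq_a), len(seq_b))  # assumption: position-wise alignment
    transactions = list(zip(seq_a[:n], seq_b[:n]))

    # Pass 1: support of single symbols in each sequence.
    sup_a = {s: c / n for s, c in Counter(a for a, _ in transactions).items()}
    sup_b = {s: c / n for s, c in Counter(b for _, b in transactions).items()}
    l1_a = {s: v for s, v in sup_a.items() if v > min_sup1}
    l1_b = {s: v for s, v in sup_b.items() if v > min_sup1}

    # Pass 2: support of symbol pairs built only from frequent single symbols.
    l2 = {}
    for (a, b), count in Counter(transactions).items():
        if a in l1_a and b in l1_b and count / n > min_sup2:
            l2[(a, b)] = count / n
    return l1_a, l1_b, l2


seq_a = list("aabbab")
seq_b = list("xxyyxy")
l1_a, l1_b, l2 = frequent_item_sets(seq_a, seq_b, min_sup1=0.3, min_sup2=0.4)
```

Restricting the second pass to pairs of already frequent symbols is the Apriori pruning step: a pair cannot be frequent if either of its members is not.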
S37. Mining sequence association: all sequences are combined pairwise, and for each pair the support of the items in frequent item set 2 and the confidence between the corresponding sequences are counted.
First, the supports of all items in frequent item set 2 between two index parameters are accumulated according to formula (11); the sum is used as the support count of the two parameter sequences among all multivariate sequences.
Figure DEST_PATH_IMAGE081
(11)
Figure DEST_PATH_IMAGE082
(12)
Figure DEST_PATH_IMAGE083
(13)
where
Figure DEST_PATH_IMAGE084
is the total number of line segment categories obtained from the cluster analysis of the two sequences. The minimum support threshold at the index sequence level is
Figure DEST_PATH_IMAGE085
If the support at the parameter index level is greater than the set threshold, the confidence of the symbol item set combination in the two sequences,
Figure DEST_PATH_IMAGE086
is calculated as shown in formula (14):
Figure DEST_PATH_IMAGE087
(14)
When the confidence is greater than the set minimum confidence threshold, the association rule
Figure DEST_PATH_IMAGE088
is retained; the confidence describes the strength of the association between the two indexes, and the two indexes are judged to be strongly associated.
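As a rough illustration of S37, the sketch below accumulates the supports of frequent item set 2 into an index-level support and keeps the rules whose confidence exceeds the threshold. Formulas (11) to (14) are only available as images, so the textbook definition confidence(a → b) = support(a, b) / support(a) is assumed here, and the function name `strong_rules` is hypothetical.

```python
def strong_rules(sup_a, sup_pairs, min_conf):
    """Accumulate pair supports and keep rules above the confidence threshold.

    sup_a     -- support of each frequent symbol of sequence A
    sup_pairs -- support of each item (a, b) in frequent item set 2
    """
    # Index-level support: sum of the supports of all frequent pairs.
    index_support = sum(sup_pairs.values())
    rules = {}
    for (a, b), sup_ab in sup_pairs.items():
        confidence = sup_ab / sup_a[a]  # assumed standard definition
        if confidence > min_conf:
            rules[(a, b)] = confidence
    return index_support, rules


index_support, rules = strong_rules(
    sup_a={"a": 0.5, "b": 0.5},
    sup_pairs={("a", "x"): 0.5, ("b", "y"): 0.25},
    min_conf=0.6,
)
```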
The main process of step S4 of the present invention is:
S41. Based on the segmented association mining model of step S3, frequent item set 2 between the two types of data is obtained as the association rules; the index-level data association degree is then calculated from the support and confidence of the association rules.
S42. The data sequence is divided into several data segments, and association rules are mined for each segment, yielding the change over time in the association degree between the two data sequences and thereby tracking the data association. Define the index-level minimum support threshold
Figure DEST_PATH_IMAGE089
and minimum confidence threshold
Figure DEST_PATH_IMAGE090
When the association degree of a data segment falls below the minimum confidence threshold
Figure 531964DEST_PATH_IMAGE090
the data is considered to have deviated from its original association.
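The segment-by-segment tracking in S42 can be sketched as below. This is one possible reading, stated as an assumption: the rules mined from normal data are held fixed, and for each window the fraction of paired symbols that match any of those rules is used as the window's support. The names `windowed_support` and `deviating_windows` are hypothetical.

```python
def windowed_support(seq_a, seq_b, rules, window):
    """Per-window fraction of symbol pairs that match a known frequent pair."""
    n = min(len(seq_a), len(seq_b))
    supports = []
    for start in range(0, n, window):
        pairs = list(zip(seq_a[start:start + window],
                         seq_b[start:start + window]))
        hits = sum(1 for p in pairs if p in rules)
        supports.append(hits / len(pairs))
    return supports


def deviating_windows(supports, min_sup):
    """Indices of windows whose support falls below the threshold."""
    return [i for i, s in enumerate(supports) if s < min_sup]


seq_a = list("ababababcccc")   # the last window changes its symbol pattern
seq_b = list("xyxyxyxyxyxy")
rules = {("a", "x"), ("b", "y")}
supports = windowed_support(seq_a, seq_b, rules, window=4)
flagged = deviating_windows(supports, min_sup=0.5)
```

A drop confined to one window localizes the deviation to that time span, which is how the tracking is used later in the worked example.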
The specific idea of step S5 of the present invention is:
1) The BP neural network is a multilayer feedforward neural network trained by error back-propagation and is currently the most widely applied neural network. Its structure generally comprises an input layer, a hidden layer, and an output layer. The input and the output are connected. Error back-propagation is the core of the BP neural network algorithm: the error decreases as the number of iterations increases, and training stops when a preset number of iterations or the minimum error is reached, which determines the optimal connection weights and thresholds between neurons.
2) The simulated annealing algorithm is inspired by the way the energy of a solid changes with temperature as it is heated and then cooled. It has strong local search capability. An increase in the objective function may be accepted according to the probabilistic rule of the Metropolis criterion. This criterion depends on a control parameter which, by analogy with physical annealing, is called the system temperature. By sampling repeatedly while the temperature is continuously lowered, the simulated annealing algorithm obtains the global optimal solution of the problem.
3) The genetic simulated annealing algorithm is an improved genetic algorithm. The genetic algorithm is derived from the process of natural evolution and has become a prominent biologically inspired random search technique. It starts from an initial set of random solutions, encoding each object as a chromosome that represents a solution to the problem. Over successive iterations the chromosomes evolve, and a fitness criterion is used to judge whether a chromosome is the optimal solution. Offspring are formed by merging two chromosomes of the current generation with a crossover operator or by modifying a chromosome with a mutation operator.
4) The genetic simulated annealing algorithm combines the two algorithms to improve efficiency. It retains the global search advantage of the genetic algorithm while incorporating the local search advantage of the simulated annealing algorithm. In essence, it starts from a set of randomly generated initial solutions and searches for the optimal solution. A new group of individuals is first generated through a series of genetic operations; an annealing process, similar to stochastic gradient descent, is then introduced for correction; finally, the annealing result is taken as the new individuals.
5) The principle of the BP neural network optimized by the genetic simulated annealing algorithm is as follows: the algorithm first determines the neural network structure, then optimizes it with the genetic simulated annealing algorithm, and finally predicts with the BP neural network. Suppose the input layer of the BP neural network has n nodes, the hidden layer has q nodes, and the output layer has m nodes; the weight between input layer node i and hidden layer node j is v_ij, and the weight between hidden layer node j and output layer node k is
Figure DEST_PATH_IMAGE091
In the forward propagation stage, the output o_j of the j-th hidden layer node and the output o_k of the k-th output layer node are respectively:
Figure DEST_PATH_IMAGE092
(15)
where x_i is the input layer sample data;
Figure DEST_PATH_IMAGE093
Figure DEST_PATH_IMAGE094
are the net inputs of the j-th hidden layer node and the k-th output layer node, respectively;
Figure DEST_PATH_IMAGE095
Figure DEST_PATH_IMAGE096
are the thresholds of the j-th hidden layer node and the k-th output layer node, respectively; f(net) is the sigmoid function, i.e.
Figure DEST_PATH_IMAGE097
(16)
where e is the base of the natural logarithm and net denotes the net input;
In the error back-propagation stage, the error signal is:
Figure DEST_PATH_IMAGE098
(17)
where E is the total error; d_k is the desired output of the sample;
Figure DEST_PATH_IMAGE099
Figure DEST_PATH_IMAGE100
are the derivatives of the sigmoid function when the net input is
Figure DEST_PATH_IMAGE101
Figure DEST_PATH_IMAGE102
respectively;
The weight correction is:
Figure DEST_PATH_IMAGE103
(18)
where
Figure DEST_PATH_IMAGE104
is the learning rate; o_i is the output of input layer node i;
Figure DEST_PATH_IMAGE105
is the weight correction between input layer node i and hidden layer node j; and
Figure DEST_PATH_IMAGE106
is the weight correction between hidden layer node j and output layer node k.
According to formula (18), because the weight update is deterministic, the neural network algorithm tends to settle on an extreme value prematurely; its local search capability is weak, although its global search capability is strong. In contrast, the genetic simulated annealing algorithm has some local search capability, but it reaches the optimization stage only after repeated annealing operations, so its operating efficiency is low. Therefore, correcting the weights of the neural network nodes through the genetic simulated annealing algorithm (GSA) greatly improves both operating efficiency and accuracy. Formula (18) is adjusted as follows:
Figure DEST_PATH_IMAGE107
(19)
where
Figure DEST_PATH_IMAGE108
is the weight between hidden layer node i and output layer node j before the genetic simulated annealing algorithm is introduced;
Figure DEST_PATH_IMAGE109
is the weight correction between hidden layer node i and output layer node j after the genetic simulated annealing algorithm is introduced;
Figure DEST_PATH_IMAGE110
is the weight correction between hidden layer node i and output layer node j;
Figure DEST_PATH_IMAGE111
is the weight between hidden layer node i and output layer node j; and
Figure DEST_PATH_IMAGE112
is a correction coefficient;
Figure DEST_PATH_IMAGE113
(20)
where
Figure DEST_PATH_IMAGE114
is a random number and T is the annealing temperature, given by
Figure DEST_PATH_IMAGE115
(21)
where T_0 is the initial temperature of simulated annealing and t is the number of annealing iterations. The fitness F_i is then calculated as:
Figure DEST_PATH_IMAGE116
(22)
where e is the base of the natural logarithm.
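The forward propagation stage of formulas (15) and (16) can be illustrated with a short sketch. The sigmoid is the standard f(net) = 1 / (1 + e^(-net)); the convention that the threshold is subtracted from the weighted sum to form the net input is an assumption, since formula (15) is only available as an image, and the argument names (`v`, `w`, `theta_h`, `theta_o`) are illustrative.

```python
import math


def sigmoid(net):
    """Sigmoid activation, written to avoid overflow for large |net|."""
    if net >= 0:
        return 1.0 / (1.0 + math.exp(-net))
    e = math.exp(net)
    return e / (1.0 + e)


def forward(x, v, theta_h, w, theta_o):
    """One forward pass through an n-q-m network.

    v[i][j]  -- weight from input node i to hidden node j
    w[j][k]  -- weight from hidden node j to output node k
    theta_*  -- node thresholds (assumed to be subtracted from the net sum)
    """
    hidden = [
        sigmoid(sum(v[i][j] * x[i] for i in range(len(x))) - theta_h[j])
        for j in range(len(theta_h))
    ]
    output = [
        sigmoid(sum(w[j][k] * hidden[j] for j in range(len(hidden))) - theta_o[k])
        for k in range(len(theta_o))
    ]
    return hidden, output


# A 2-3-1 network with all weights and thresholds set to zero.
hidden, output = forward(
    x=[1.0, 2.0],
    v=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    theta_h=[0.0, 0.0, 0.0],
    w=[[0.0], [0.0], [0.0]],
    theta_o=[0.0],
)
```

The derivative needed in the back-propagation stage follows directly from the sigmoid: f'(net) = f(net)(1 - f(net)).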
The BP neural network algorithm optimized by the genetic simulated annealing algorithm has the advantages that:
The network node connection weights are corrected by the genetic simulated annealing algorithm, which prevents the network from falling into a local minimum and addresses the overfitting problem of current deviation data prediction models. The method also avoids complex parameter setting and has broad prospects for development and application.
Transformer online monitoring equipment is widely distributed, has a complex network structure, operates in harsh environments, and is subject to large fluctuations and limited device accuracy. As a result, a large amount of abnormal data, especially strongly fluctuating noise data, arises during actual operation. Such data easily affects how a data sequence is segmented during segmented association rule mining and therefore affects the association rules of the sequence. The BP neural network algorithm optimized by the genetic simulated annealing algorithm copes well with noise data and improves prediction accuracy.
The step S5 of the present invention specifically includes:
S51. Based on the segmented association mining result of step S3, a multi-index prediction model using the BP neural network algorithm optimized by the genetic simulated annealing algorithm is constructed from the strong association rule sequence data: sequence data strongly associated with the deviation data sequence is taken as the input layer of the neural network, and the deviation data is taken as the output layer. The flow of the BP neural network algorithm optimized by the genetic simulated annealing algorithm is shown in FIG. 6. The numbers of nodes in the input, hidden, and output layers are determined, and the relevant parameters are initialized to obtain the network node codes. Sample data is input and normalized, and together with the network node codes undergoes the selection, crossover, mutation, annealing, and new-individual fitness calculation operations of the genetic simulated annealing algorithm (GSA). Whether the fitness requirement is met is then judged: if not, the flow returns to the crossover step; if so, the optimal node weights are obtained. The sample data is then trained with the BP neural network algorithm (BP), the mean square error of the training result is calculated, and whether it falls within the allowable error range is judged: if not, the sample data is retrained; if so, the optimal prediction network is determined.
S52. The network node connection weights are corrected by the genetic simulated annealing algorithm and the optimal weights are substituted in; normal data of the standard oil sample is obtained, the neural network is trained and tested, and the deviation data is predicted, completing the calibration of the deviation data.
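The selection, crossover, mutation, and annealing loop described in S51 can be sketched on a generic weight vector. This is a toy illustration under stated assumptions, not the patented procedure: the tournament selection, single-point crossover, Gaussian mutation, geometric cooling schedule, and all parameter values are hypothetical choices, and the `loss` callable stands in for whatever fitness the network codes are scored with (formulas (20) to (22) are not recoverable from the text).

```python
import math
import random


def gsa_minimize(loss, dim, pop_size=20, generations=50,
                 t0=1.0, cooling=0.95, seed=0):
    """Toy genetic simulated annealing search over real-valued vectors."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=loss)
    temp = t0
    for _ in range(generations):
        # Selection: binary tournament (assumed operator).
        parents = [min(rng.sample(pop, 2), key=loss) for _ in range(pop_size)]
        # Crossover: single cut point between neighbouring parents.
        children = []
        for i in range(0, pop_size, 2):
            p, q = parents[i], parents[(i + 1) % pop_size]
            cut = rng.randrange(1, dim) if dim > 1 else 0
            children.append(p[:cut] + q[cut:])
            children.append(q[:cut] + p[cut:])
        children = children[:pop_size]
        new_pop = []
        for parent, child in zip(parents, children):
            # Mutation: small Gaussian perturbation of some genes.
            child = [g + rng.gauss(0, 0.1) if rng.random() < 0.1 else g
                     for g in child]
            # Annealing: Metropolis acceptance of the child over its parent.
            delta = loss(child) - loss(parent)
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                new_pop.append(child)
            else:
                new_pop.append(parent)
        pop = new_pop
        best = min(pop + [best], key=loss)  # keep the best solution seen so far
        temp *= cooling  # assumed geometric cooling schedule
    return best


def sphere(weights):
    """Stand-in loss: sum of squared weights."""
    return sum(w * w for w in weights)


best_weights = gsa_minimize(sphere, dim=3, seed=1)
```

In the flow of FIG. 6 the vector returned by such a search would seed the BP training stage; here the best-so-far vector is tracked explicitly so that the result is never worse than the best initial individual.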
The specific process of step S6 of the present invention is:
S61. Using the calibrated data, mine the segmented association rules of the two strongly associated sequences again to obtain the association rules of the two calibrated data sequences;
S62. Compare the association degree of the deviation part of the two calibrated data sequences with the association degree threshold to check the reliability of the deviation calibration;
S63. Set the association degree threshold: mine the overall association rules of the two normal data sequences to obtain their overall association degree, and set the association degree threshold to limit_rev.
One practical application of the present invention is as follows:
S1. Data collection.
The hydrogen and methane indexes in the DGA historical online monitoring data of a main transformer are taken as the research objects. Oil chromatography online monitoring data is generally sampled once per day, i.e., one sample per index per day. The invention therefore uses close to two years of sampling points (700 points) as the data window length; the data curves are shown in fig. 2 and 3. The visualized gas indexes show that the online data of every index can be treated as a curve fluctuating with the sampling points, so piecewise fitting of the index sequences is feasible once a reasonable fitting error threshold is set.
S2. Piecewise linearize the online monitoring data and extract the line segment curve characteristics.
The method proposed by the invention is used to perform piecewise linear fitting on the continuous data. Note that because the index data differ in magnitude, an appropriate fitting error threshold should be selected for each index when performing the piecewise linear fitting. The fitting results for each index are shown in fig. 4 and 5. As fig. 4 and 5 show, the online DGA index data is fitted successfully: a line segment connecting two end points represents all data points within its span. Taking hydrogen as an example, the attributes of some of the fitted line segments are shown in table 1.
Figure DEST_PATH_IMAGE117
Table 5 further demonstrates the feasibility of the online data piecewise linearization algorithm provided by the invention: the fitting error of every line segment is smaller than the set fitting error threshold, and the fitted line segments reflect the trend of the online data points within each fitting interval well, which verifies the validity of the algorithm.
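The piecewise linearization of step S2 can be sketched as a greedy procedure: each segment connects its two end points and keeps growing while the fitting error stays within the threshold. The error measure used here (maximum absolute deviation from the end-point line) and the function names are assumptions; the text only states that points stop being added once the error reaches the threshold.

```python
def fit_error(y, i, j):
    """Maximum absolute deviation of y[i..j] from the line through its end points."""
    slope = (y[j] - y[i]) / (j - i)
    return max(abs(y[k] - (y[i] + slope * (k - i))) for k in range(i, j + 1))


def piecewise_linear(y, max_error):
    """Greedy segmentation; returns (start, end) index pairs of the segments."""
    segments = []
    start, n = 0, len(y)
    while start < n - 1:
        end = start + 1
        # Extend the segment while the next point keeps the error acceptable.
        while end + 1 < n and fit_error(y, start, end + 1) <= max_error:
            end += 1
        segments.append((start, end))
        start = end  # the next segment begins at this segment's end point
    return segments


series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
segments = piecewise_linear(series, max_error=0.1)
```

Attributes such as the slope and span of each segment can then be read off the end points, for example `(y[end] - y[start]) / (end - start)` for the slope.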
S3, constructing a segmented association mining model, symbolizing a line segment set characterized by curve characteristics, mining the association of different indexes of the DGA by using an Apriori algorithm, and finding an abnormal numerical value.
And extracting the box line graph characteristics of the line segment group characteristics, wherein the box line graph characteristics of the hydrogen with deviation data are shown in fig. 9, the box line graph characteristics of the hydrogen normal data are shown in fig. 10, and the box line graph characteristics of the methane normal data are shown in fig. 11.
Mining the association relation of continuous data: after the corresponding frequent item sets are obtained, the association between the two indexes is analyzed with the method provided by the invention, which expresses the strength of the association by the support and confidence. The support and confidence obtained for
Figure DEST_PATH_IMAGE118
are 0.5076 and 0.5641, respectively; both are greater than the corresponding set minimum thresholds, so the rule is a strong association rule, indicating a strong association between the hydrogen and methane indexes.
According to current research results, hydrogen is the DGA gas most prone to deviation, and the deviation is usually positive, i.e., the deviated value is larger than the original value. A continuously rising error was added to the last 200 data points of the hydrogen data to form the deviation data, which is shown in fig. 7.
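The construction of the deviation data can be illustrated with a small sketch that adds a rising positive error to the tail of a series. The linear ramp shape and the `max_error` parameter are assumptions made for illustration; the text only states that the error rises continuously over the last 200 points and does not give its magnitude.

```python
def add_rising_error(data, n_last, max_error):
    """Add a linearly rising positive error to the last n_last points."""
    head = list(data[:len(data) - n_last])
    tail = data[len(data) - n_last:]
    # The error grows from max_error / n_last up to max_error (assumed ramp).
    biased = [v + max_error * (k + 1) / n_last for k, v in enumerate(tail)]
    return head + biased


clean = [1.0] * 10
deviated = add_rising_error(clean, n_last=4, max_error=2.0)
```

For the case described in the text, the call would use `n_last=200` on the 700-point hydrogen series.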
Association mining is then performed on the deviation data; the resulting support and confidence are 0.3939 and 0.6420. The support drops markedly: the class combinations corresponding to the line segment curve characteristics change, frequent item sets shrink or disappear, and the overall support falls, so segment-by-segment association mining is needed to locate the deviation.
Segmented data association mining analysis: association mining is performed on the data in groups of 100 points; the number of line segments, the support, and the confidence of each group are shown in table 2:
Figure DEST_PATH_IMAGE119
As the table above shows, the deviation affects the association degree of the data segment in which it lies but has no effect on the remaining segments; with a reasonable grouping of data points, the change in data association can be obtained and the position of the deviation found. The support threshold set by the method is 0.5 and the confidence threshold is 0.6, so the newly added deviation leaves the affected data without a strong association.
S4. Taking the time interval characteristic into account, the change in support is obtained and the change in data association is analyzed. The data deviation causes the association degree of the sequence data to decrease continuously, and no strong association exists among the last 100 data points of the sequence.
S5. A multi-index prediction model using the BP neural network algorithm optimized by the genetic simulated annealing algorithm is constructed from the strongly associated index data to complete the calibration of the deviation data: the segmented association mining of step S3 yields the sequences strongly associated with hydrogen, from which the multi-index prediction model is constructed and used to predict the deviation data, completing the calibration of the data deviation.
The association rule mining of step S3 gives the association between hydrogen and the other sequence data shown in table 3:
Figure DEST_PATH_IMAGE120
As the table above shows, hydrogen is strongly associated only with methane.
A multi-index prediction model using the BP neural network algorithm optimized by the genetic simulated annealing algorithm is constructed: the neural network prediction model takes methane as input and hydrogen as output, and the number of hidden layers is 8. The deviation data is predicted by the neural network prediction model; the prediction result is shown in fig. 8.
And S6, mining association rules again for the two sequence data segments by using the calibrated data, judging whether the association degree of the calibrated deviation data segment meets the association degree threshold value, and checking the reliability of deviation calibration.
Figure DEST_PATH_IMAGE121
As can be seen from the above table, after the data deviation is calibrated, the support degree and the confidence degree of the two sequences both satisfy the corresponding threshold values, and the reliability of the data calibration is verified.

Claims (7)

1. A main transformer on-line monitoring data group deviation identification and calibration method is characterized by comprising the following steps:
S1, data collection: online monitoring data plays an important role in representing the real-time state of transformer equipment and is an important basis for state evaluation and fault identification of the transformer equipment; the online monitoring data of the equipment and the related transformer fault information are collected to form a complete data system;
S2, piecewise linearizing the online monitoring data and extracting line segment curve characteristics;
S3, constructing a segmented association mining model, symbolizing the line segment set characterized by curve characteristics, mining the association between different DGA indexes using the Apriori algorithm, and finding abnormal values: extracting the box plot characteristics of the line segment group, analyzing the overall characteristics of the sequence data through box plots, constructing a line segment group similarity model, classifying the line segments using an improved K-means clustering algorithm based on the maximum and minimum distances, setting the number of classes to 6 according to the overall classification of curve characteristics, and assigning the same symbol to line segments of the same class to complete the symbolization of the online monitoring data; based on the idea of the Apriori algorithm, setting the minimum confidence and support, mining the frequent item sets existing between different sequences, quantifying the association between different sequences, and obtaining the index-level data support and confidence;
S4, taking the time interval characteristic into account, obtaining the change in support, and identifying the data deviation: judging the abnormal deviation existing at the end of the data according to the change in association strength of the online monitoring data over time, and completing the tracking of the data deviation;
S5, constructing a multi-index prediction model using the BP neural network algorithm improved by the genetic simulated annealing algorithm from the strongly associated index data, and completing the calibration of the deviation data;
S6, using the calibrated data, mining association rules again for the two sequence data segments, judging whether the association rules of the calibrated deviation data segment meet the association degree threshold, and checking the reliability of the deviation calibration.
2. The method for identifying and calibrating the deviation of the online monitoring data population of the main transformer according to claim 1, wherein the step S5 comprises the following steps:
S51, based on the segmented association mining result of step S3, constructing a multi-index prediction model using the BP neural network algorithm optimized by the genetic simulated annealing algorithm from the strong association rule sequence data, taking sequence data strongly associated with the deviation data sequence as the input layer of the neural network and the deviation data as the output layer to construct the neural network;
S52, correcting the network node connection weights through the genetic simulated annealing algorithm, substituting in the optimal weights, obtaining normal data of the standard oil sample, completing the training and testing of the neural network, predicting the deviation data, and completing the calibration of the deviation data.
3. The method for identifying and calibrating the deviation of the main transformer online monitoring data population according to claim 2, wherein the BP neural network algorithm optimized based on the genetic simulated annealing algorithm comprises the following steps: determining the node numbers of a neural network input layer, a hidden layer and an output layer, initializing relevant parameters to obtain network node codes, inputting sample data, carrying out normalization processing, carrying out selection, crossing, mutation, annealing and new individual fitness calculation operation of a genetic simulation annealing algorithm together with the network node codes, judging whether the fitness requirements are met or not, if not, returning to the crossing step, if so, obtaining the optimal weight of the node, then training the sample data through a BP neural network algorithm, calculating the mean square error of the training result, judging whether the error allowable range is met or not, if not, retraining the sample data, and if so, determining the optimal prediction network.
4. The method for identifying and calibrating the deviation of the main transformer online monitoring data population according to claim 1, wherein the input layer of the BP neural network has n nodes, the hidden layer has q nodes, and the output layer has m nodes; the weight between input layer node i and hidden layer node j is v_ij, and the weight between hidden layer node j and output layer node k is
Figure DEST_PATH_IMAGE001
in the forward propagation stage, the output o_j of the j-th hidden layer node and the output o_k of the k-th output layer node are respectively:
Figure DEST_PATH_IMAGE002
(15)
where x_i is the input layer sample data;
Figure DEST_PATH_IMAGE003
Figure DEST_PATH_IMAGE004
are the net inputs of the j-th hidden layer node and the k-th output layer node, respectively;
Figure DEST_PATH_IMAGE005
Figure DEST_PATH_IMAGE006
are the thresholds of the j-th hidden layer node and the k-th output layer node, respectively; f(net) is the sigmoid function, i.e.
Figure DEST_PATH_IMAGE007
(16)
where e is the base of the natural logarithm and net denotes the net input;
in the error back-propagation stage, the error signals are:
Figure DEST_PATH_IMAGE008 (17)
where $E$ is the total error; $d_k$ is the desired output of the sample; and $f'(net_j)$ and $f'(net_k)$ are the derivatives of the sigmoid function when the net input is $net_j$ and $net_k$, respectively;
the weight corrections are:
Figure DEST_PATH_IMAGE013 (18)
where $\eta$ is the learning rate; $o_i$ is the output of input-layer node $i$; $\Delta v_{ij}$ is the weight correction between input-layer node $i$ and hidden-layer node $j$; and $\Delta w_{jk}$ is the weight correction between hidden-layer node $j$ and output-layer node $k$;
equation (18) is adjusted as follows:
Figure DEST_PATH_IMAGE017 (19)
where the symbols shown in Figures DEST_PATH_IMAGE018 to DEST_PATH_IMAGE022 denote, in order: the weight between hidden-layer node $i$ and output-layer node $j$ before the genetic simulated annealing algorithm is introduced; the weight correction between hidden-layer node $i$ and output-layer node $j$ after the genetic simulated annealing algorithm is introduced; the weight correction between hidden-layer node $i$ and output-layer node $j$; the weight between hidden-layer node $i$ and output-layer node $j$; and the correction coefficient;
Figure DEST_PATH_IMAGE023 (20)
where the symbol shown in Figure DEST_PATH_IMAGE024 is a random number and $T$ is the annealing temperature, given by
Figure DEST_PATH_IMAGE025 (21)
where $T_0$ is the initial temperature of the simulated annealing and $t$ is the number of annealing iterations; the fitness $F_i$ is then calculated as:
Figure DEST_PATH_IMAGE026 (22)
where $e$ is the base of the natural logarithm.
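Because the formula images of claim 4 are not reproduced in this text, the following is only a sketch of the conventional textbook form of a three-layer BP network with sigmoid activations, not a transcription of equations (15) to (18). The sign convention for the thresholds (subtracted from the weighted sum), the squared-error total $E = \frac{1}{2}\sum_k (d_k - o_k)^2$, and the names `forward`, `train_step` and `init_network` are assumptions. The genetic simulated annealing adjustment of equations (19) to (22) is not modeled.

```python
import math
import random


def sigmoid(net):
    """Sigmoid activation f(net) = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))


def forward(x, v, w, theta_h, theta_o):
    """Forward pass; v[i][j] links input i to hidden j, w[j][k] links hidden j to output k."""
    hidden = [
        sigmoid(sum(x[i] * v[i][j] for i in range(len(x))) - theta_h[j])
        for j in range(len(theta_h))
    ]
    out = [
        sigmoid(sum(hidden[j] * w[j][k] for j in range(len(hidden))) - theta_o[k])
        for k in range(len(theta_o))
    ]
    return hidden, out


def train_step(x, d, v, w, theta_h, theta_o, eta=0.5):
    """One gradient-descent update on a single sample; returns the error before the update."""
    hidden, out = forward(x, v, w, theta_h, theta_o)
    # Error signals; the sigmoid derivative is f'(net) = o * (1 - o).
    delta_o = [(d[k] - out[k]) * out[k] * (1.0 - out[k]) for k in range(len(out))]
    delta_h = [
        hidden[j] * (1.0 - hidden[j]) * sum(delta_o[k] * w[j][k] for k in range(len(out)))
        for j in range(len(hidden))
    ]
    # Weight corrections: learning rate times error signal times the feeding output.
    for j in range(len(hidden)):
        for k in range(len(out)):
            w[j][k] += eta * delta_o[k] * hidden[j]
    for i in range(len(x)):
        for j in range(len(hidden)):
            v[i][j] += eta * delta_h[j] * x[i]
    # Thresholds enter the net input with a minus sign, so they move the opposite way.
    for k in range(len(out)):
        theta_o[k] -= eta * delta_o[k]
    for j in range(len(hidden)):
        theta_h[j] -= eta * delta_h[j]
    return 0.5 * sum((d[k] - out[k]) ** 2 for k in range(len(out)))


def init_network(n, q, m, seed=0):
    """Random small weights for an n-q-m network, with zero thresholds."""
    rng = random.Random(seed)
    v = [[rng.uniform(-0.5, 0.5) for _ in range(q)] for _ in range(n)]
    w = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(q)]
    return v, w, [0.0] * q, [0.0] * m
```

Calling `train_step` repeatedly on the normalized samples and stopping once the returned error falls within the allowable range mirrors the training loop described in claim 3.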
5. The method for identifying and calibrating the deviation of the online monitoring data population of the main transformer as claimed in claim 1, wherein in step S2, consecutive data points that can be fitted linearly are treated, in time order, as one set; a fitting-error threshold is determined for the data points, and points stop being added to the set when the error approaches the threshold; a line-segment fitting error of 0.8 is selected, and each fitted line segment is described by features including its slope, slope change rate, constant term, segment span, segment fitting error, and the change in mean value before and after the segment.
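The segmentation in claim 5 can be sketched as a greedy piecewise-linear fit. The claim does not define how the fitting error is measured, so this sketch assumes the maximum absolute residual of a least-squares line and stops growing a segment when adding the next point would push that error above the threshold. The names `fit_line` and `segment_series` are hypothetical, and only some of the listed features (slope, constant term, span, fitting error) are computed.

```python
def fit_line(ts, ys):
    """Least-squares line through (ts, ys); returns slope, intercept, max |residual|."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_t
    error = max(abs(y - (slope * t + intercept)) for t, y in zip(ts, ys))
    return slope, intercept, error


def segment_series(ts, ys, max_error=0.8):
    """Split a time-ordered series into line segments whose fit error stays within max_error."""
    segments = []
    start, n = 0, len(ts)
    while start < n - 1:
        end = start + 2  # exclusive end index; a segment holds at least two points
        # Grow the segment while the fit that includes the next point stays within the threshold.
        while end < n and fit_line(ts[start:end + 1], ys[start:end + 1])[2] <= max_error:
            end += 1
        slope, intercept, error = fit_line(ts[start:end], ys[start:end])
        segments.append({
            "start_index": start,
            "end_index": end - 1,
            "slope": slope,
            "intercept": intercept,          # the constant term of the fitted line
            "span": ts[end - 1] - ts[start],
            "error": error,
        })
        start = end - 1  # the breakpoint is shared with the next segment
    return segments
```

The slope change rate and the change in mean value before and after a segment could be derived afterwards by comparing neighbouring entries of the returned list.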
6. The method for identifying and calibrating the deviation of the online monitoring data population of the main transformer according to claim 1, wherein the specific process of the step S4 is as follows:
S41, based on the segmented association mining model of step S3, obtaining the frequent item sets between the two types of data as association rules, and comprehensively calculating the index-level data association degree from the support and confidence of the association rules;
S42, dividing the data sequence into a plurality of data segments and mining the association rules of each segment separately, so as to obtain the change over time in the association degree of the two data sequences and thereby track the data association degree; defining an index-level minimum support threshold (Figure DEST_PATH_IMAGE027) and a minimum confidence threshold (Figure DEST_PATH_IMAGE028); when the confidence deviates from the minimum confidence threshold, the data are considered to have deviated from the original association.
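The segment-wise tracking in steps S41 and S42 can be illustrated with a small sketch that computes the support and confidence of one rule in each data segment and compares them with the minimum thresholds. The symbolic labels (`"up"`, `"down"`), the fixed segment length, the threshold values and the function names are all hypothetical. The claim does not state how support and confidence are combined into a single association degree, so the sketch reports them separately.

```python
def rule_metrics(a_items, b_items, antecedent, consequent):
    """Support and confidence of the rule (A == antecedent) -> (B == consequent)."""
    n = len(a_items)
    both = sum(1 for a, b in zip(a_items, b_items)
               if a == antecedent and b == consequent)
    ante = sum(1 for a in a_items if a == antecedent)
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence


def track_association(a_items, b_items, antecedent, consequent,
                      segment_len, min_sup, min_conf):
    """Evaluate the rule per segment and flag whether it still holds there."""
    results = []
    for start in range(0, len(a_items), segment_len):
        sup, conf = rule_metrics(a_items[start:start + segment_len],
                                 b_items[start:start + segment_len],
                                 antecedent, consequent)
        results.append({
            "start": start,
            "support": sup,
            "confidence": conf,
            # False marks a segment where the rule falls below the minimum thresholds.
            "holds": sup >= min_sup and conf >= min_conf,
        })
    return results
```

For example, with two symbolized sequences that move together in the first segment and diverge in the second, the second segment is reported with `"holds": False`, which is the kind of change in association that step S42 is meant to detect.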
7. The method for identifying and calibrating the deviation of the online monitoring data population of the main transformer according to claim 1, wherein the specific process of the step S6 is as follows:
S61, using the calibrated data, mining the segmented association rules again for the two sequences that have a strong association rule, to obtain the association rules of the two calibrated data sequences;
S62, comparing the association degree of the deviating portions of the two calibrated data sequences with an association degree threshold, to check the reliability of the deviation calibration;
S63, setting the association degree threshold: mining the overall association rules of the two normal data sequences to obtain their overall association degree, and setting the association degree threshold to limit_rev.
CN202111615216.5A 2021-12-28 2021-12-28 Main transformer online monitoring data group deviation identification and calibration method Active CN113987033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111615216.5A CN113987033B (en) 2021-12-28 2021-12-28 Main transformer online monitoring data group deviation identification and calibration method

Publications (2)

Publication Number Publication Date
CN113987033A (en) 2022-01-28
CN113987033B (en) 2022-04-12

Family

ID=79734595

Country Status (1)

Country Link
CN (1) CN113987033B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015176565A1 (en) * 2014-05-22 2015-11-26 袁志贤 Method for predicting faults in electrical equipment based on multi-dimension time series
CN112684379A (en) * 2020-11-25 2021-04-20 江苏科技大学 Transformer fault diagnosis system and method based on digital twinning
CN112711904A (en) * 2020-12-17 2021-04-27 玉溪矿业有限公司 Blasting vibration characteristic parameter prediction method based on SA-GA-BP
CN112800686A (en) * 2021-03-29 2021-05-14 国网江西省电力有限公司电力科学研究院 Transformer DGA online monitoring data abnormal mode judgment method
CN113792754A (en) * 2021-08-12 2021-12-14 国网江西省电力有限公司电力科学研究院 Method for processing DGA (dissolved gas analysis) online monitoring data of converter transformer by first removing outliers and then repairing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JINGFENG HUANG et al.: "Monitoring of the Cross-Calibration Biases Between the S-NPP and NOAA-20 VIIRS Sensor Data Records Using Goes Advanced Baseline Imager as a Transfer", 《IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM》 *
CHEN JINHUI et al.: "Application of an Improved BP Neural Network in Fault Diagnosis", 《Journal of Hebei University of Science and Technology》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216469A (en) * 2023-09-03 2023-12-12 国网江苏省电力有限公司信息通信分公司 Big data processing method and system for real-time monitoring and prediction of power system
CN117216469B (en) * 2023-09-03 2024-03-15 国网江苏省电力有限公司信息通信分公司 Big data processing method and system for real-time monitoring and prediction of power system
CN117112514A (en) * 2023-10-23 2023-11-24 山东同利新材料有限公司 Recording and storing method based on p-chloromethyl styrene production data
CN117112514B (en) * 2023-10-23 2024-01-09 山东同利新材料有限公司 Recording and storing method based on p-chloromethyl styrene production data
CN117235511A (en) * 2023-11-13 2023-12-15 北京市计量检测科学研究院 Secondary instrument calibration method
CN117235511B (en) * 2023-11-13 2024-02-20 北京市计量检测科学研究院 Secondary instrument calibration method
CN117249922A (en) * 2023-11-17 2023-12-19 山东盈动智能科技有限公司 Temperature calibration method and system for temperature tester
CN117249922B (en) * 2023-11-17 2024-03-08 山东盈动智能科技有限公司 Temperature calibration method and system for temperature tester
CN117476136A (en) * 2023-12-28 2024-01-30 山东松盛新材料有限公司 High-purity carboxylate synthesis process parameter optimization method and system
CN117476136B (en) * 2023-12-28 2024-03-15 山东松盛新材料有限公司 High-purity carboxylate synthesis process parameter optimization method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant