CN113792754A - Method for processing DGA (dissolved gas analysis) online monitoring data of a converter transformer by first removing anomalies and then repairing - Google Patents

Publication number
CN113792754A
Authority
CN
China
Prior art keywords
data
line segment
fitting
algorithm
particle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110922307.7A
Other languages
Chinese (zh)
Other versions
CN113792754B (en)
Inventor
童超
朱自伟
张益宁
童军心
李唐兵
王华云
张宇
王鹏
万华
刘玉婷
徐碧川
童涛
曾磊磊
周友武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
Nanchang University
Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Original Assignee
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
Nanchang University
Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Jiangxi Electric Power Co ltd, State Grid Corp of China SGCC, Nanchang University, Electric Power Research Institute of State Grid Jiangxi Electric Power Co Ltd
Priority to CN202110922307.7A
Publication of CN113792754A
Application granted
Publication of CN113792754B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Protection Of Transformers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for processing DGA (dissolved gas analysis) online monitoring data of a converter transformer by first removing anomalies and then repairing them. The first stage introduces the idea of a sliding window algorithm, uses a piecewise linearization algorithm to divide the sequence data into line segments characterized by slope and span, symbolizes the online monitoring data using K-means clustering improved with the maximum-minimum distance, and finally uses the Apriori algorithm to mine the associations among the different DGA indexes and to find the abnormal values present in the data. The second stage proposes an improved particle swarm optimization support vector regression algorithm for the screened abnormal sampling points: the distance between particle solution sets is defined, particles are divided into types using fuzzy inference rules, and a different update formula is defined for each type, which preserves both the solving speed and the solution diversity of the algorithm; the key parameters of the support vector regression algorithm are then optimized to repair the sampling points, completing the processing of the online DGA monitoring data.

Description

Method for processing DGA (dissolved gas analysis) online monitoring data of a converter transformer by first removing anomalies and then repairing
Technical Field
The invention relates to a method for processing DGA (dissolved gas analysis) online monitoring data of a converter transformer. It belongs to the field of power equipment data cleaning and claims priority to prior application CN202110330366.5, filed on March 25, 2021.
Background Art
The converter transformer is a hub device for converting and transmitting electric energy, and its safe and stable operation is an important guarantee of the quality of power supplied to users. Online DGA index data of the converter transformer are used to monitor the insulation performance of the equipment in real time, and the real-time state of the converter transformer can be obtained quickly by analyzing the oil chromatography data. In addition, DGA data contain many index dimensions; mining the association relations among these indexes makes it possible to distinguish data of different abnormal modes in the online data, which strengthens the reliability of the comprehensive state evaluation of the equipment.
Because of the operating environment of the equipment and the electromagnetic interference of the converter transformer, the online monitoring device is prone to producing randomly distributed abnormal values during data acquisition and transmission, and in severe cases data drift and transmission interruption occur. The background system can quickly identify obvious data anomalies such as data drift and data interruption and raise an alarm. Abnormal values randomly distributed within otherwise normal online data, however, seriously interfere with the real-time representation of the equipment state indexes and affect state evaluation based on those indexes; they easily lead to false alarms and missed alarms about abnormal equipment states and waste operation and maintenance resources.
The converter transformer is important equipment for ensuring the stable operation of the transmission and distribution network, and its iron core grounding current monitoring data are an important basis for evaluating its state. Monitoring data over a period of time, including the overall trend, the extreme points and jump points within it, and the statistical characteristics of the data, can reflect possible abnormal conditions of the converter transformer from several aspects.
After long-term operation of power equipment, the large volume of index data stored in the power database inevitably contains index data of different abnormal modes. Performing association analysis on the existing index data, mining the association relations, analyzing the data of different abnormal modes on that basis, and repairing the data effectively help to improve the comprehensive state evaluation system for power equipment, detect abnormal equipment states in advance, improve maintenance efficiency, and reduce operation and maintenance costs.
CN112800686A, previously filed by the applicant, discloses a method for judging the abnormal mode of DGA (dissolved gas analysis) online monitoring data of a converter transformer. The method imports the DGA online monitoring data, sets the length and step of a sliding window, and traverses the online data set with the sliding window at a given step; each intercepted data window is fitted using a least-squares-based piecewise linearization algorithm for sliding data; each fitted line segment is represented by its slope, the actual growth rate of the data it contains, and its span; a model describing line segment similarity is constructed, and the line segment set is clustered using the K-means algorithm; the line segment set is symbolized, and the number of elements in each symbolized sequence is tallied; based on the idea of the Apriori algorithm, frequent item sets among different sequences are mined and the association among the sequences is quantified; and data of different abnormal modes are separated according to the strength of the association among the sequences and the types of abnormal values judged to be present in the data.
However, online data from power transformation equipment usually have to be collected, converted and transmitted before being stored in the system database, where they are used for real-time monitoring and display of the equipment state. Owing to human misoperation, harsh operating environments, strong electromagnetic interference and other factors, the online data collected by the system often contain many problems, which greatly reduces how reliably they represent the equipment state.
Disclosure of Invention
The technical problem solved by the invention is to provide a method for processing DGA online monitoring data of a converter transformer by first removing anomalies and then repairing them. The first stage uses a piecewise linearization algorithm, K-means clustering improved with the maximum-minimum distance, and the Apriori algorithm to find the abnormal values present in the data; the second stage repairs the sampling points using an improved particle swarm optimization support vector regression algorithm, thereby completing the processing of the online DGA monitoring data of the converter transformer.
The invention is realized by the following technical scheme. The method for processing DGA online monitoring data of a converter transformer by first removing anomalies and then repairing comprises the following steps:
S1, import the DGA online monitoring data, and set the length and the sliding step of the sliding window;
S2, piecewise linearization of the sequence data: using a piecewise linearization algorithm for sequence data, combine a variable number of points in the online data according to a model to form multiple groups of data points; the criterion for grouping data points is that the error between the line segment fitted to all points in the group and the actual data points is less than a threshold; each fitted line segment is represented by its slope and its span;
S3, construct a model describing the similarity of different line segments: build a similarity model based on the slope and span of the line segments, classify the line segments using a K-means clustering algorithm improved with the maximum-minimum distance, and assign one symbol to the line segments of each class, completing the symbolization of the sequence data;
S4, mine the association among different sequences: based on the Apriori algorithm, set a minimum confidence and a minimum support, mine the frequent item sets existing among different sequences, and quantify the association among the sequences;
S5, extract and screen the abnormal values present in the DGA online monitoring data: separate the data of different abnormal modes according to the strength of the association among the sequences and the types of abnormal values judged to be present in the data;
S6, optimize the key parameters of support vector regression with an improved particle swarm algorithm and repair the screened abnormal value points: define the distance between particle solution sets, calculate the density of each particle based on that distance, and introduce an improved fuzzy inference rule that defines different particle update modes according to density, so as to improve the diversity and speed of the particle swarm solution; optimize the key parameters of support vector regression using the improved particle swarm algorithm to improve the regression accuracy, repair the screened abnormal value points, and complete the processing of the DGA online monitoring data.
Further preferably, in step S1, the DGA online monitoring data are imported, the length of the sliding window is set to L, and the sliding step is set to l. The online data set is traversed with the sliding window at that step: the window is moved across the whole online monitoring data set by the sliding step l until all data have been traversed. Let the length of the online monitoring data set be L_1; after traversal, the number of data windows obtained is
Figure BDA0003207887480000031
The data in all windows are exported to form the data sets to be analyzed, DS_i, i ∈ {1, …, n}.
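The traversal in step S1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `sliding_windows` is hypothetical, and because the window-count formula is only available as an image, the sketch simply assumes that an incomplete trailing window is dropped.

```python
from typing import List, Sequence


def sliding_windows(data: Sequence[float], length: int, step: int) -> List[List[float]]:
    """Cut `data` into windows of `length` samples, advancing by `step` each time."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    windows = []
    start = 0
    # Assumption: stop once a full window no longer fits (no partial tail window).
    while start + length <= len(data):
        windows.append(list(data[start:start + length]))
        start += step
    return windows


# Ten samples, window length L = 4, sliding step l = 2.
series = [0.1 * i for i in range(10)]
windows = sliding_windows(series, length=4, step=2)
```

With a step smaller than the window length, consecutive windows overlap, so every sample near a window boundary is also seen in the interior of a neighbouring window.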
Further preferably, the piecewise linearization algorithm for sequence data in step S2 comprises the following steps:
S2.1, for the monitoring data X_K = {x_1, x_2, …, x_k}, intercept data points with a window of length L (L < k) and, based on the idea of a sliding window, perform piecewise linear fitting on the data points contained in the intercepted window;
S2.2, take the first data point in the window as the fitting start point of the initial line segment and denote it x_i; assuming that the fitting end point of the initial line segment is x_{i+m} (m > 1), fit these m + 1 data points to a line segment;
Unlike conventional least-squares fitting, the distance from the actual data points to the fitted line segment is used as the fitting error, which improves how accurately the fitted segment follows the actual values. Let d_n be the straight-line distance from the data point X_n with index n to the fitted line segment; the distances from all actual data points within the span of the segment to the segment are calculated, and their sum is taken as the overall fitting error ER of the segment:
Figure BDA0003207887480000032
Figure BDA0003207887480000033
where X_i is the sampled value at time i in the time sequence, m is the number of data points contained in the fitted line segment, and t_n is the time step;
S2.3, set the fitting error threshold to ER_r. If ER < ER_r, the segment can still take more points: let m = m + 1 and repeat the above steps. If ER = ER_r, take the current point as the fitting end point and generate the line segment. If ER > ER_r, the current point cannot be fitted to the segment: store the fitting end point of the current segment as X_end = X_{i+m-1}, record its sampling time, return to step S2.2, reset the parameter m, and fit the next part of the data using the current fitting end point as the start point of the next segment, until all data points in the sequence have been fitted.
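The greedy growth of a segment in S2.2 and S2.3 can be sketched as below. The two error formulas are only available as images, so the sketch makes its own assumptions: the line is obtained by ordinary least squares, d_n is the perpendicular point-to-line distance, and time is the sample index. The names `fit_line`, `total_error` and `segment` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fit_line(t: Sequence[float], x: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line x = k*t + b (assumed fitting method)."""
    n = len(t)
    mt, mx = sum(t) / n, sum(x) / n
    stt = sum((ti - mt) ** 2 for ti in t)
    k = sum((ti - mt) * (xi - mx) for ti, xi in zip(t, x)) / stt
    return k, mx - k * mt


def total_error(t: Sequence[float], x: Sequence[float], k: float, b: float) -> float:
    """Sum of perpendicular point-to-line distances, playing the role of ER."""
    norm = math.sqrt(k * k + 1.0)
    return sum(abs(k * ti - xi + b) / norm for ti, xi in zip(t, x))


def segment(x: Sequence[float], er_max: float) -> List[Tuple[int, int, float]]:
    """Greedy segmentation; returns (start index, end index, slope) per segment."""
    t = list(range(len(x)))
    segments = []
    start = 0
    while start < len(x) - 1:
        end = start + 1  # a segment always holds at least two points
        k, _ = fit_line(t[start:end + 1], x[start:end + 1])
        while end + 1 < len(x):
            k_try, b_try = fit_line(t[start:end + 2], x[start:end + 2])
            if total_error(t[start:end + 2], x[start:end + 2], k_try, b_try) > er_max:
                break  # the next point would push ER over the threshold
            end, k = end + 1, k_try
        segments.append((start, end, k))
        start = end  # the end point becomes the next segment's start point
    return segments


# A rising then falling ramp splits into two segments at the peak.
segs = segment([0, 1, 2, 3, 4, 3, 2, 1, 0], er_max=0.1)
```

Each returned tuple carries the slope directly, and the span follows as `end - start`, which are the two attributes the method uses to characterize a segment.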
Preferably, in step S3, because the different indexes in the DGA online monitoring data differ by orders of magnitude, all line segment triples in the same sequence need to undergo a standardization operation of the form
Figure BDA0003207887480000041
During cluster analysis, a standard for measuring the similarity of line segments is established. The similarity between line segments is described using the Euclidean distance, and the degree to which each attribute of a line segment is taken into account is expressed as a weight. The resulting line segment similarity model is shown in the following formula:
Figure BDA0003207887480000042
In the formula, ds_ij represents the line segment similarity, and ω_k, ω_m and ω_r represent the weights of the slope, the span and the growth rate, respectively, in the line segment similarity model.
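As an illustration of a weighted Euclidean similarity of this kind, the sketch below compares two already standardized segment triples. The exact formula is only available as an image, so the placement of the weights inside the square root, the default weight values, and the name `segment_distance` are all assumptions.

```python
import math
from typing import Tuple

# (slope, span, growth rate), assumed to be standardized beforehand
Segment = Tuple[float, float, float]


def segment_distance(a: Segment, b: Segment,
                     w_k: float = 0.4, w_m: float = 0.3, w_r: float = 0.3) -> float:
    """Weighted Euclidean distance; a smaller value means more similar segments."""
    return math.sqrt(
        w_k * (a[0] - b[0]) ** 2    # slope term
        + w_m * (a[1] - b[1]) ** 2  # span term
        + w_r * (a[2] - b[2]) ** 2  # growth-rate term
    )


d = segment_distance((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), w_k=0.25)
```

Raising one weight makes differences in that attribute dominate the clustering, which is how the model expresses how much each attribute matters.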
Further preferably, in step S3 of the present invention, the K-means algorithm improved with the maximum-minimum distance comprises the following main steps:
The maximum-minimum distance method is also based on the Euclidean distance; it differs from the K-means algorithm in that the objects that are farthest apart are taken as cluster centers. For the sample set s_n, a proportion coefficient θ (0 < θ < 1) is given, and an arbitrary sample is taken as the initial cluster center, denoted z_1.
Among the remaining n - 1 samples, the sample farthest from z_1 is taken as the second cluster center, denoted z_2.
The distances from the remaining n - 2 samples to z_1 and z_2 are calculated, and the smaller of the two is found for each sample, namely:
D_ij = ||x_i - z_j||, j = 1, 2 (6)
D_i = min(D_i1, D_i2), i = 1, 2, …, n (7)
If
max{D_i} > θ × ||z_1 - z_2|| (8)
then the sample x_i that attains this maximum is selected as the third cluster center z_3.
Assuming that K cluster centers have been found, the distances from the remaining n - K samples to the cluster centers are calculated; if
D_r = max{min(D_i1, D_i2, …, D_iK)} > θ × ||z_1 - z_2|| (9)
then the corresponding sample x_r is the (K+1)-th cluster center, denoted z_{K+1}. The process is repeated until no new cluster center appears.
When no new cluster center appears, the samples are assigned to the classes according to the minimum distance principle. The advantage of the K-means clustering algorithm improved with the maximum-minimum distance is that the cluster centers are the same in every run of the cluster analysis, which eliminates the randomness with which the traditional K-means algorithm selects its cluster centers and can effectively improve the accuracy and speed of the cluster analysis.
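The center selection of formulas (6) to (9) and the final minimum-distance assignment can be sketched as follows. The function names `maxmin_centers` and `assign` are hypothetical, the first center is fixed to sample 0 so that the run is reproducible, and the sketch covers only center selection and assignment, not any subsequent K-means refinement.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def maxmin_centers(samples: List[Point], theta: float, first: int = 0) -> List[int]:
    """Return the indices of the cluster centers chosen by the max-min distance rule."""
    # Second center: the sample farthest from the first one.
    second = max(range(len(samples)), key=lambda i: math.dist(samples[i], samples[first]))
    centers = [first, second]
    base = math.dist(samples[first], samples[second])  # ||z_1 - z_2||
    while True:
        best_i, best_d = -1, -1.0
        for i in range(len(samples)):
            if i in centers:
                continue
            # D_i: distance from sample i to its nearest existing center, as in (7).
            d = min(math.dist(samples[i], samples[c]) for c in centers)
            if d > best_d:
                best_i, best_d = i, d
        # Stop when the largest D_i no longer exceeds theta * ||z_1 - z_2||, as in (9).
        if best_i < 0 or best_d <= theta * base:
            return centers
        centers.append(best_i)


def assign(samples: List[Point], centers: List[int]) -> List[int]:
    """Label every sample with the position of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(s, samples[centers[j]]))
            for s in samples]


points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
centers = maxmin_centers(points, theta=0.5)
labels = assign(points, centers)
```

Because every choice is a deterministic maximum or minimum, repeated runs on the same data with the same starting sample yield the same centers, and the number of clusters follows from θ instead of being fixed in advance.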
Further preferably, in step S4 of the present invention, the process of mining the association among different sequences is as follows:
S4.1, set the minimum support and minimum confidence parameters. The confidence and support thresholds are the basis for judging sequence association and frequent item sets, and suitable threshold parameters help to strengthen the reliability of the association relation. The minimum support thresholds of the frequent-1 and frequent-2 item sets are denoted min_sup_1 and min_sup_2, and the minimum confidence threshold in sequence association mining is min_con;
S4.2, generate the frequent item sets. The combination of the two symbolized sequences is used as the transaction set, denoted
Figure BDA0003207887480000051
where
Figure BDA0003207887480000052
All symbol categories corresponding to the two sequences are {A_1, A_2, …, A_CA} and {B_1, B_2, …, B_CB}. Based on the basic idea of the Apriori algorithm, the frequent item sets of the sequences are obtained by scanning the transaction set in two stages. The support of each symbol in the sequence is calculated according to formula (10):
Figure BDA0003207887480000053
where X and Y represent the two index objects whose association rules are to be mined, and N_t represents the number of transactions, namely the number of elements in the sequence. The support represents the proportion of an item in the transaction set; when the frequent-1 item sets are explored, the items whose support is greater than min_sup_1 are placed in the collection of frequent-1 item sets;
The collections of frequent-1 item sets of the two sequences in the association mining are denoted P_A and P_B. The items in the two collections are paired according to the index parameters to form 2-item sets of the form (P_Ai, P_Bi); the support of each item in the 2-item sets is calculated, and the items whose support is greater than min_sup_2 are placed in the frequent-2 item set, denoted {P_A, P_B}_freq.
S4.3, mining of sequence association: combine all the sequences pairwise, and for each pair count the support of the items in the frequent-2 item set and the confidence between the corresponding sequences;
First, the supports of all frequent-2 item sets between the two index parameters are accumulated according to formula (11), and the accumulated value is taken as the support of the two parameter sequences among all multivariate sequences.
Figure BDA0003207887480000061
σ(X_A) = sum(σ(P_A)) (12)
σ(X_B) = sum(σ(P_B)) (13)
where m = CA + CB; CA and CB are the total numbers of line segment categories obtained from the cluster analysis of the two sequences, and m is the number of line segment categories after the two sequences are combined. The minimum support threshold at the index sequence level is min_sup_3; if the support at the parameter index level is greater than this threshold, the confidence con(X_A→X_B) of the combination of the symbol item sets in the two sequences is calculated, as shown in formula (14):
Figure BDA0003207887480000062
When the confidence is greater than the set minimum confidence threshold, the association rule X_A→X_B is retained, the confidence is used to describe the strength of the association between the two indexes, and the two indexes are judged to be strongly associated.
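The two-stage scan and the sequence-level confidence can be sketched as below for two aligned symbol sequences. Formulas (10), (11) and (14) are only available as images, so the sketch relies on the standard definitions: support is the share of transactions containing an item, and confidence is the support of the pair divided by the support of the antecedent. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def support(seq: Sequence[Hashable], min_sup: float) -> Dict[Hashable, float]:
    """Support of each symbol, kept only if it exceeds min_sup (frequent-1 items)."""
    n = len(seq)
    return {s: c / n for s, c in Counter(seq).items() if c / n > min_sup}


def frequent_pairs(seq_a: Sequence[Hashable], seq_b: Sequence[Hashable],
                   min_sup1: float, min_sup2: float) -> Dict[Tuple, float]:
    """Frequent-2 item sets built only from symbols that are themselves frequent."""
    n = len(seq_a)
    freq_a, freq_b = support(seq_a, min_sup1), support(seq_b, min_sup1)
    pairs = Counter(zip(seq_a, seq_b))  # one transaction per aligned position
    return {p: c / n for p, c in pairs.items()
            if p[0] in freq_a and p[1] in freq_b and c / n > min_sup2}


def rule_confidence(seq_a: Sequence[Hashable], seq_b: Sequence[Hashable],
                    min_sup1: float, min_sup2: float) -> float:
    """Assumed form of con(X_A -> X_B): summed pair support over sigma(X_A)."""
    sigma_ab = sum(frequent_pairs(seq_a, seq_b, min_sup1, min_sup2).values())
    sigma_a = sum(support(seq_a, min_sup1).values())
    return sigma_ab / sigma_a if sigma_a else 0.0


a = ["a", "a", "b", "b", "a", "c"]
b = ["x", "x", "y", "y", "x", "y"]
pairs = frequent_pairs(a, b, min_sup1=0.2, min_sup2=0.2)
conf = rule_confidence(a, b, min_sup1=0.2, min_sup2=0.2)
```

In this toy input the rare symbol "c" is filtered out in the first scan, so its pairing with "y" never enters the frequent-2 item set; positions that fall outside every frequent pair are the natural candidates for abnormal values.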
The improved particle swarm algorithm in step S6 mainly comprises the following steps:
S6.1, define the number of variables m and generate N m-dimensional particles in the feasible solution space; S_t is the t-th generation of particles in the iteration, where
Figure BDA0003207887480000063
and the elements are expressed as
Figure BDA0003207887480000064
S6.2, determine the inertia weight: the adaptive weight method finds a better balance between global and local search. When the target values of all particles tend to be consistent, the inertia weight is increased appropriately; when the target values of the particles are relatively dispersed, the inertia weight is reduced appropriately. The specific expression is:
Figure BDA0003207887480000071
where w_a and w_z represent the maximum and minimum values of the inertia weight, and f, f_z and f_pj represent the fitness value of the particle, the minimum fitness value of all particles and the average fitness value of all particles, respectively;
S6.3, define the input variables of the fuzzy inference rule: the population density of the particles is expressed using the Euclidean distance:
Figure BDA0003207887480000072
which gives the calculation formula of the particle density:
Figure BDA0003207887480000073
where n_i is the number of particles in the population of particle i, N is the number of particles generated in the solution set, and c_i denotes the density of the particle. The density and the current iteration number k are normalized and used as the two input variables of the fuzzy rule, and their membership degrees in the different states are calculated separately;
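One way to read the density definition is sketched below: the population of particle i is taken to be the particles within a fixed Euclidean radius of it, and the density is that count divided by N. Both the radius parameter and the name `particle_density` are assumptions, since the distance and density formulas are only available as images.

```python
import math
from typing import List, Sequence


def particle_density(particles: List[Sequence[float]], radius: float) -> List[float]:
    """Density c_i = n_i / N, with n_i the particles within `radius` of particle i."""
    n_total = len(particles)
    densities = []
    for p in particles:
        # Assumption: a particle counts itself as part of its own population.
        n_i = sum(1 for q in particles if math.dist(p, q) <= radius)
        densities.append(n_i / n_total)
    return densities


swarm = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
dens = particle_density(swarm, radius=1.0)
```

Since the result already lies in (0, 1], it can be fed to membership functions alongside a normalized iteration count such as k divided by the maximum number of iterations.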
S6.4, fuzzy inference rules: for the normalized input variables, the fuzzy sets low density, medium density and high density are defined. The membership functions are expressed as follows, where c_1 to c_3 are the interval thresholds of the membership functions:
Figure BDA0003207887480000074
Figure BDA0003207887480000075
Figure BDA0003207887480000081
where x can be either of the two input variables, particle density or iteration number. The calculated membership degrees of the particle density and of the iteration number are cross-combined to form the particle state fuzzy matrix K of dimension 3 × 3, which is multiplied by the vector c_l formed from the particle density membership degrees to obtain the probability vector c_b of the particle belonging to the different density intervals:
Figure BDA0003207887480000082
The density interval with the largest probability for the current particle is taken, and a different particle update mode is formulated for each interval.
S6.5, particle update rule: initialize the two learning factors μ_1 and μ_2 of the algorithm.
When c_bl is the largest, the particle moves only toward its own best position, and the velocity is updated as follows:
Figure BDA0003207887480000083
When c_bm or c_bh is the largest, the algorithm moves toward both the global optimum and the subgroup optimum, using the same update as the traditional particle swarm algorithm.
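The density-dependent update of S6.5 can be sketched as below. The patent's velocity formula is only available as an image, so the sketch assumes the textbook particle swarm form and simply drops the global-best term for low-density particles; `update_velocity` and its parameters are hypothetical names, and "subgroup optimum" is read here as the particle's own best position.

```python
import random
from typing import List, Sequence


def update_velocity(v: Sequence[float], x: Sequence[float],
                    pbest: Sequence[float], gbest: Sequence[float],
                    w: float, mu1: float, mu2: float,
                    low_density: bool, rng=random) -> List[float]:
    """One velocity update; low-density particles follow only their own best."""
    r1, r2 = rng.random(), rng.random()
    new_v = []
    for vi, xi, pi, gi in zip(v, x, pbest, gbest):
        step = w * vi + mu1 * r1 * (pi - xi)  # inertia plus pull toward own best
        if not low_density:
            step += mu2 * r2 * (gi - xi)      # pull toward global best (standard PSO)
        new_v.append(step)
    return new_v


class FixedRandom:
    """Deterministic stand-in for the random module, used only in this demo."""

    def random(self) -> float:
        return 0.5


v_low = update_velocity([1.0], [0.0], [2.0], [4.0], w=0.5, mu1=2.0, mu2=2.0,
                        low_density=True, rng=FixedRandom())
v_std = update_velocity([1.0], [0.0], [2.0], [4.0], w=0.5, mu1=2.0, mu2=2.0,
                        low_density=False, rng=FixedRandom())
```

Keeping isolated particles away from the global best preserves exploration in sparse regions, while particles in crowded regions converge with the usual rule.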
The technical effect of the invention is a two-stage online DGA data processing method based on the idea of "first removing anomalies, then repairing", in which the online data are treated as a time sequence according to the characteristics of the returned data. The first stage introduces the idea of a sliding window algorithm, uses a piecewise linearization algorithm to divide the sequence data into line segments characterized by slope and span, symbolizes the online monitoring data using K-means clustering improved with the maximum-minimum distance, and finally uses the Apriori algorithm to mine the associations among the different DGA indexes and to find the abnormal values present in the data. The second stage proposes an improved particle swarm optimization support vector regression algorithm for the screened abnormal sampling points: the distance between particle solution sets is defined, particles are divided into types using fuzzy inference rules, and a different update formula is defined for each type, which preserves both the solving speed and the solution diversity of the algorithm; the key parameters of the support vector regression algorithm are optimized to repair the sampling points, completing the processing of the online DGA monitoring data of the converter transformer.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a flow chart of the particle swarm optimization solution (APSO-SVR).
FIG. 3 is a comparison graph of the hydrogen index fit.
FIG. 4 is a comparison graph of the methane index fit.
FIG. 5 is a graph comparing the results of the regression models.
FIG. 6 shows the data repair result.
Detailed Description
The present invention will be explained in further detail with reference to examples.
Referring to FIG. 1, a method for processing DGA online monitoring data of a converter transformer by first removing anomalies and then repairing comprises the following steps:
S1, import the DGA online monitoring data, and set the length and the sliding step of the sliding window;
S2, piecewise linearization of the sequence data: since online data are usually numerical variables, they are not directly suitable for association mining of sequence data; using a piecewise linearization algorithm for sequence data, combine a variable number of points in the online data according to a model to form multiple groups of data points; the criterion for grouping data points is that the error between the line segment fitted to all points in the group and the actual data points is less than a threshold; each fitted line segment is characterized by its slope and its span;
S3, construct a model describing the similarity of different line segments: build a similarity model based on the slope and span of the line segments, classify the line segments using a K-means clustering algorithm improved with the maximum-minimum distance, and assign one symbol to the line segments of each class, completing the symbolization of the sequence data;
S4, mine the association among different sequences: based on the Apriori algorithm, set a minimum confidence and a minimum support, mine the frequent item sets existing among different sequences, and quantify the association among the sequences;
S5, extract and screen the abnormal values present in the DGA online monitoring data: separate the data of different abnormal modes according to the strength of the association among the sequences and the types of abnormal values judged to be present in the data;
S6, optimize the key parameters of support vector regression with an improved particle swarm algorithm and repair the screened abnormal value points: define the distance between particle solution sets, calculate the density of each particle based on that distance, and introduce an improved fuzzy inference rule that defines different particle update modes according to density, so as to improve the diversity and speed of the particle swarm solution; optimize the key parameters of support vector regression using the improved particle swarm algorithm to improve the regression accuracy, repair the screened abnormal value points, and complete the processing of the DGA online monitoring data.
Specifically, in step S1, the DGA online monitoring data are imported, the length of the sliding window is set to L, and the sliding step is set to l. The online data set is traversed with the sliding window at that step: the window is moved across the whole online monitoring data set by the sliding step l until all data have been traversed. Let the length of the online monitoring data set be L_1; after traversal, the number of data windows obtained is
Figure BDA0003207887480000101
The data in all windows are exported to form the data sets to be analyzed, DS_i, i ∈ {1, …, n}.
Specifically, the specific steps of the piecewise linearization algorithm of the sequence data set forth in step S2 are:
S2.1. For the monitoring data X_K = {x_1, x_2, …, x_k}, intercept data points with a window of length L (L < k) and, following the sliding-window idea, perform piecewise linear fitting on the data points contained in the window.
S2.2. Take the first data point in the window as the fitting start point of the initial line segment and denote it x_i. Assuming that the fitting end point of the initial line segment is x_{i+m} (m > 1), fit these m + 1 data points to a line segment.
Unlike conventional least-squares fitting, the distance from the actual data points to the fitted line segment is used as the fitting error, which improves how accurately the fitted line segment represents the actual data points. Let d_n be the straight-line distance from the data point X_n, indexed by its sequence number, to the fitted line segment; the distances from all actual data points within the span of the fitted line segment to that segment are calculated, and their sum is taken as the overall fitting error ER of the segment:
Figure BDA0003207887480000102
Figure BDA0003207887480000103
where X_i denotes the sampled value at time i in the time series, m denotes the number of data points contained in the fitted line segment, and t_n denotes the time step;
S2.3. Set the fitting error threshold to ER_r. If ER < ER_r, the line segment can still take on more fitting points: let m = m + 1 and repeat the above steps. If ER = ER_r, leave m unchanged and take this point as the fitting end point to generate the line segment. If ER > ER_r, the line segment can no longer be fitted: store the fitting end point of the current line segment as X_end = X_{i+m-1}, record its sampling time, return to step S2.2, reset the parameter m, and fit the next part of the data with the current fitting end point as the start point of the next line segment, until all data points in the sequence have been fitted.
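Steps S2.1 to S2.3 can be sketched as the greedy loop below. Several details are assumptions, because the error formulas survive only as images: the fitted segment is taken to be the chord joining the first and last point of the span, the error is the sum of perpendicular point-to-line distances, the time step is 1, and a segment starts from two points rather than three. The names `segment_error` and `piecewise_linearize` are hypothetical.

```python
import math


def segment_error(values, start, end):
    """Sum of perpendicular distances from points start..end to the chord
    joining (start, values[start]) and (end, values[end])."""
    dx = end - start
    dy = values[end] - values[start]
    norm = math.hypot(dx, dy)  # dx >= 1, so norm is never zero
    total = 0.0
    for n in range(start, end + 1):
        total += abs(dy * (n - start) - dx * (values[n] - values[start])) / norm
    return total


def piecewise_linearize(values, err_threshold):
    """Return (start_index, end_index) pairs, one per fitted line segment."""
    segments = []
    start = 0
    last = len(values) - 1
    while start < last:
        end = start + 1
        # Extend the segment while the enlarged span stays within the threshold.
        while end < last and segment_error(values, start, end + 1) <= err_threshold:
            end += 1
        segments.append((start, end))
        start = end  # the end point becomes the start of the next segment
    return segments


values = [0, 1, 2, 3, 4, 3, 2, 1, 0]
print(piecewise_linearize(values, err_threshold=0.1))  # [(0, 4), (4, 8)]
```

Reusing the end point of one segment as the start of the next mirrors the instruction in S2.3 and keeps the segments contiguous.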
Assume that the slope of the fitted line segment is k_i and that the number of fitted data points in the line segment is m_i; the actual growth rate of the data fitted by the line segment can then be expressed as:
Figure BDA0003207887480000111
The three elements k_i, m_i, r_i form the line segment triplet (k_i, m_i, r_i), and this array represents one fitted line segment.
Since piecewise linearization is a data fitting process, the quality of the fit depends on the magnitude of the error. Considering the characteristic properties of a general line segment, the present invention represents each line segment by the array {k_i, l_i, r_i} formed from the slope k_i of the line segment, the fitting length l_i, and the growth rate r_i of the line segment.
Specifically, in step S3, a criterion for measuring the similarity of line segments must be established for the cluster analysis. The DGA online monitoring data reflects real-time indexes of the equipment, and the trend and shape of these parameters best reflect changes in the operating state of the equipment. The present invention therefore extracts two key parameters, the slope and the span of a line segment, describes the similarity between line segments with the Euclidean distance, and defines a line segment similarity model. Based on this similarity model, the line segment set is clustered with the K-means algorithm improved by the maximum-minimum distance method, and similar line segments are placed in the same category.
Specifically, in step S3, because the different indexes in the DGA online monitoring data differ by orders of magnitude, all line segment triplets in the same sequence must undergo a standardization operation of the form
Figure BDA0003207887480000112
Specifically, in step S3, a criterion for measuring the similarity of line segments is established for the cluster analysis: the similarity between line segments is described with the Euclidean distance, and the degree to which each attribute of a line segment is taken into account is expressed as a weight. The established line segment similarity model is given by the following formula:
Figure BDA0003207887480000113
In the formula, ds_ij denotes the line segment similarity, and ω_k, ω_m, ω_r denote the weights of the slope, the span, and the growth rate in the line segment similarity model, respectively.
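A weighted Euclidean distance over standardized triplets might look like the sketch below. Both formulas survive only as images, so two choices here are assumptions: standardization is done as a per-attribute z-score, and the weights are applied inside the square root. The function names and the equal default weights are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(triplets):
    """Z-score each attribute column of a list of (k, m, r) triplets.

    Assumption: z-score scaling; a constant column is left centred at zero.
    """
    columns = list(zip(*triplets))
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [
        tuple((value - mu) / sd for value, (mu, sd) in zip(triplet, stats))
        for triplet in triplets
    ]


def segment_distance(a, b, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted Euclidean distance between two standardized triplets;
    `weights` stands for (omega_k, omega_m, omega_r)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


segments = [(1.0, 10, 0.10), (1.0, 10, 0.10), (3.0, 30, 0.50)]
z = standardize(segments)
print(segment_distance(z[0], z[1]), segment_distance(z[0], z[2]))
```

A smaller distance means more similar segments; identical triplets give a distance of exactly zero.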
In step S3 of the present invention, the K-means algorithm improved based on the maximum and minimum distances includes the following main steps:
The maximum-minimum distance method is also based on the Euclidean distance; it differs from the K-means algorithm in that the objects at the greatest distance are taken as cluster centers. For the sample set, a proportion coefficient θ (0 < θ < 1) is given, and an arbitrary sample s_n is taken as the initial cluster center, denoted z_1.
Among the remaining n-1 samples, the sample farthest from z_1 is taken as the second cluster center, denoted z_2.
The distances from each of the remaining n-2 samples to z_1 and z_2 are calculated and the smaller of the two is found, namely:
D_{ij} = ||x_i - z_j||, j = 1, 2 (6)
D_i = min(D_{i1}, D_{i2}), i = 1, 2, …, n (7)
If
D_i = max{D_i} > θ × ||z_1 - z_2|| (8)
then the corresponding sample s_i is selected as the third cluster center z_3.
Assuming that there are K cluster centers, the distances from the remaining n-K samples to the cluster centers are calculated, and if
D_r = max{min(D_{i1}, D_{i2}, …, D_{iK})} > θ × ||z_1 - z_2|| (9)
then the corresponding sample x_r is the (K+1)-th cluster center, denoted z_{K+1}. This process is repeated until no new cluster center appears.
When no new cluster center appears, the samples are assigned to the classes according to the minimum distance principle. The advantage of the K-means clustering algorithm improved by the maximum-minimum distance method is that the cluster centers are the same in every cluster analysis, which removes the randomness with which the conventional K-means algorithm selects cluster centers and effectively improves the accuracy and speed of the cluster analysis.
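The center-selection procedure of equations (6) to (9), followed by minimum-distance assignment, can be sketched as follows. The function names are hypothetical, and the index of the first center is passed explicitly so that runs are repeatable; the source leaves that choice arbitrary.

```python
import math


def maxmin_centers(samples, theta, first=0):
    """Pick cluster-center indices by the maximum-minimum distance rule."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Second center: the sample farthest from the first one.
    second = max(range(len(samples)),
                 key=lambda i: math.dist(samples[i], samples[first]))
    centers = [first, second]
    base = math.dist(samples[first], samples[second])  # ||z_1 - z_2||
    while True:
        best_idx, best_d = None, -1.0
        for i, x in enumerate(samples):
            if i in centers:
                continue
            # Distance from this sample to its nearest existing center.
            d = min(math.dist(x, samples[c]) for c in centers)
            if d > best_d:
                best_idx, best_d = i, d
        # Stop when the largest of the minimum distances fails the test.
        if best_idx is None or best_d <= theta * base:
            return centers
        centers.append(best_idx)


def assign(samples, centers):
    """Label each sample with the position of its nearest center."""
    return [min(range(len(centers)),
                key=lambda j: math.dist(x, samples[centers[j]]))
            for x in samples]


points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
centers = maxmin_centers(points, theta=0.5)
print(centers, assign(points, centers))  # [0, 3, 4] [0, 0, 1, 1, 2]
```

Because every step is deterministic once the first center is fixed, repeated runs return the same centers, which is the property the text credits for removing the randomness of standard K-means initialization.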
In step S4 of the present invention, the process of mining the association between different sequences is as follows:
S4.1. Set the minimum support and minimum confidence parameters. The confidence and support thresholds are the basis for judging sequence associations and frequent item sets, and suitable threshold parameters help to strengthen the reliability of the association relationships. The minimum support thresholds of the frequent-1 and frequent-2 item sets are denoted min sup_1 and min sup_2, and the minimum confidence threshold in sequence association mining is denoted min con.
S4.2. Generate the frequent item sets. The two symbolized sequences, once aggregated, are used as the transaction set, denoted
Figure BDA0003207887480000121
Wherein
Figure BDA0003207887480000122
All symbol categories corresponding to the two sequences are {A_1, A_2, …, A_CA} and {B_1, B_2, …, B_CB}. Based on the basic idea of the Apriori algorithm, the present invention obtains the frequent item sets of the sequences by scanning the transaction set in two stages. The confidence of each symbol in the sequence is calculated according to equation (10):
Figure BDA0003207887480000131
where X and Y denote the two index objects between which association rules are to be mined, and N_t denotes the number of transactions, that is, the number of elements in the sequence. The support expresses the proportion of the transaction set taken up by an item; when searching for frequent-1 item sets, the items whose support is greater than min sup_1 are placed in the collection of frequent-1 item sets.
The collections of frequent-1 item sets of the two sequences in the association mining are denoted P_A and P_B. The items in these collections are paired according to the index parameters to form 2-item sets of the form (P_Ai, P_Bi); the support of each 2-item set is calculated, and those whose support is greater than min sup_2 are placed in the frequent-2 item set, denoted {P_A, P_B}_freq.
S4.3. Mine the sequence association: combine all the sequences pairwise, and for each pair compute the support of the items in the frequent-2 item sets and the confidence between the corresponding sequences.
First, the supports of all frequent-2 item sets between the two index parameters are accumulated according to the following formula, and the accumulated value is taken as the support of the two parameter sequences among all the multivariate sequences.
Figure BDA0003207887480000132
σ(X_A) = sum(σ(P_A)) (12)
σ(X_B) = sum(σ(P_B)) (13)
where m = CA + CB, CA and CB are the total numbers of line segment categories obtained after cluster analysis of the two sequences, and m is the number of line segment categories after the two sequences are aggregated. The minimum support threshold at the index-sequence level is denoted min sup_3; if the support at the parameter-index level is greater than this threshold, the confidence con(X_A → X_B) of the combination of symbol item sets in the two sequences is calculated as shown in formula (14):
Figure BDA0003207887480000133
When the confidence is greater than the set minimum confidence threshold, the association rule X_A → X_B is retained, the strength of the association between the two indexes is described by the confidence, and the two indexes are judged to be strongly associated.
Based on the idea of the Apriori algorithm, the present invention sets minimum support thresholds min sup_i at different levels for the two aggregated sequences, successively mines the frequent item sets existing between the sequences, and finally judges the strength of the association between the indexes.
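The two-stage scan of S4.2 and S4.3 can be sketched as below. The source formulas for support and confidence are images, so this sketch falls back on the standard definitions: support is a relative frequency, and confidence is the accumulated frequent-2 support divided by σ(X_A) from equation (12). It also assumes that the two symbol sequences are aligned element by element. All names are hypothetical.

```python
from collections import Counter


def mine_pair_rule(seq_a, seq_b, min_sup1, min_sup2):
    """Return (frequent 2-item sets with support, accumulated support,
    confidence of X_A -> X_B) for two aligned symbol sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    n = len(seq_a)
    # Stage 1: frequent-1 items of each sequence.
    sup_a = {s: c / n for s, c in Counter(seq_a).items()}
    sup_b = {s: c / n for s, c in Counter(seq_b).items()}
    freq_a = {s for s, v in sup_a.items() if v > min_sup1}
    freq_b = {s for s, v in sup_b.items() if v > min_sup1}
    # Stage 2: frequent-2 item sets built only from frequent-1 items.
    pair_counts = Counter(zip(seq_a, seq_b))
    freq_pairs = {
        pair: count / n
        for pair, count in pair_counts.items()
        if pair[0] in freq_a and pair[1] in freq_b and count / n > min_sup2
    }
    sup_ab = sum(freq_pairs.values())
    sup_xa = sum(sup_a[s] for s in freq_a)  # sigma(X_A) as in equation (12)
    confidence = sup_ab / sup_xa if sup_xa else 0.0
    return freq_pairs, sup_ab, confidence


seq_a = list("AABBAABB")
seq_b = list("XXYYXXYZ")
pairs, support, confidence = mine_pair_rule(seq_a, seq_b, 0.2, 0.2)
print(pairs, support, confidence)
```

The rule would then be kept only if `support` exceeds the sequence-level threshold and `confidence` exceeds the minimum confidence.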
The main idea of the support vector regression optimized by the improved particle swarm algorithm in step S6 of the present invention is as follows. For the vacant value points left by the deletion of abnormal values, a support vector regression algorithm optimized by an improved particle swarm algorithm is proposed for repair. In the classification and regression problems of support vector machines, a kernel function is introduced to convert the nonlinear problem in the input space into a linear problem in a high-dimensional space, which effectively reduces the complexity of the algorithm. The present invention uses the radial basis function (RBF) kernel. To obtain the optimal parameters of the RBF kernel, the mean square error is used as the fitness function, and the support vector machine parameters C and gamma are optimized with the improved particle swarm algorithm.
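For reference, the standard forms of the RBF kernel and of a mean-square-error fitness can be written as below. The text names both but gives neither formula, so the notation is an assumption: y_t is an observed value, ŷ_t is the value predicted by the support vector regression model trained with parameters (C, γ), and n is the number of validation points.

```latex
K(x_i, x_j) = \exp\!\left(-\gamma \,\lVert x_i - x_j \rVert^{2}\right),
\qquad
\mathrm{MSE}(C, \gamma) = \frac{1}{n} \sum_{t=1}^{n} \bigl( y_t - \hat{y}_t(C, \gamma) \bigr)^{2}
```

Each particle then encodes one candidate pair (C, γ), and the swarm searches for the pair with the smallest MSE.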
The particle swarm algorithm tends to converge prematurely and fall into a local optimum. In the particle iteration process, the present invention defines the density of each particle through the Euclidean distance between particles and updates the particles in different ways according to the density of the cluster to which they belong; this preserves the convergence speed of the algorithm, maintains the diversity of the solution set, and avoids local optima. The main steps of the improved particle swarm algorithm are as follows:
1) Define the number of variables m and generate N m-dimensional particles in the feasible solution space. S_t is the t-th generation of particles in the iteration, where
Figure BDA0003207887480000141
Wherein the elements are expressed as
Figure BDA0003207887480000142
2) An inertial weight is determined, which represents the magnitude of the particle's inheritance to the velocity at the last iteration. When the value is larger, the global optimizing capability of the population is stronger, and the local optimizing capability is weaker; the learning ability of the particles is strong when the value is small, and the particles can be converged to a local optimal value at a higher speed. The self-adaptive weight method can better find a balance point between the two, and the inertia weight is properly increased when the target values of all particles tend to be consistent; when the target values of the particles are relatively dispersed, the inertia weight value is properly reduced, and the specific expression is as follows:
Figure BDA0003207887480000143
where w_a and w_z denote the maximum and minimum values of the inertia weight, and f, f_z, f_pj denote the fitness value of the particle, the minimum fitness value of all particles, and the average fitness value of all particles, respectively.
3) Define the input variables of the fuzzy inference rule, namely the density of the population in which a particle is located. The distance between particles is expressed by the Euclidean distance:
Figure BDA0003207887480000151
the calculation formula of the particle density is obtained:
Figure BDA0003207887480000152
where n_i is the number of particles in the population of particle i, N is the number of particles in the generated solution set, and c_i is the density of the particle. The density and the current iteration number k are normalized and used as the two input variables of the fuzzy rule, and their degrees of membership in the different states are calculated.
4) Fuzzy inference rule: for the fuzzy sets of the normalized input variables, low density (L), medium density (M), and high density (H) are defined; the membership functions are given by the following formulas, where c_1 to c_3 are the interval thresholds of the membership functions.
Figure BDA0003207887480000153
Figure BDA0003207887480000154
Figure BDA0003207887480000155
where x can be either of the two input variables, particle density or iteration number. The calculated membership degrees of the particle density and of the iteration number are cross-combined to form a 3 × 3 particle-state fuzzy matrix K, and K is multiplied by the vector c_l formed from the membership degrees of the particle density to obtain the probability vector c_b of the particle belonging to the different density intervals:
Figure BDA0003207887480000156
The density interval with the largest probability for the current particle is taken, and a different particle update mode is formulated for each interval.
5) Particle update rule: initialize the two learning factors μ_1 and μ_2 of the algorithm.
When c_bl is the largest, the particle searches only in the direction of its own optimum, and the velocity is updated as follows:
Figure BDA0003207887480000161
When c_bm or c_bh is the largest, the algorithm searches in the directions of the global optimum and the subgroup optimum, using the same update mode as the conventional particle swarm algorithm. Here, the particle's own optimum means the local optimal solution within the low-density particle population. The flow of the APSO-SVR algorithm is shown in FIG. 2.
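Three pieces of the improved swarm described in steps 2), 3), and 5) can be sketched as follows. The original formulas are images, so every formula here is an assumption: the inertia weight uses the commonly published adaptive form for a minimization problem, the density is the share of the swarm within a hypothetical `radius` of a particle, and the velocity update is the conventional one with the global term switched off for low-density particles. The fuzzy inference that decides `low_density` is not reproduced.

```python
import math
import random


def adaptive_weight(f, f_min, f_avg, w_max=0.9, w_min=0.4):
    """Assumed adaptive inertia weight: scaled for particles at or below
    the average fitness, w_max for the others."""
    if f <= f_avg and f_avg > f_min:
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    return w_max


def densities(particles, radius):
    """Assumed density n_i / N: share of the swarm within `radius` of
    each particle (the particle itself included)."""
    n = len(particles)
    return [sum(1 for q in particles if math.dist(p, q) <= radius) / n
            for p in particles]


def update_velocity(v, x, p_best, g_best, w, mu1, mu2, low_density, rng=random):
    """One velocity update; a low-density particle follows only its own best."""
    r1, r2 = rng.random(), rng.random()
    new_v = []
    for vi, xi, pi, gi in zip(v, x, p_best, g_best):
        term = w * vi + mu1 * r1 * (pi - xi)
        if not low_density:
            term += mu2 * r2 * (gi - xi)  # conventional global term
        new_v.append(term)
    return new_v


swarm = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(densities(swarm, radius=2.0))
print(adaptive_weight(1.5, f_min=1.0, f_avg=2.0))
```

Dropping the global term for isolated particles keeps them exploring their own neighbourhood, which is how the text describes preserving solution diversity.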
Application case
1. The hydrogen and methane gas indexes in the historical DGA online monitoring data of a main transformer are taken as the research objects. Since oil chromatography online monitoring data is generally sampled once per day, the present invention takes the number of sampling points in roughly two years (720 points) as the data window length and drags the data window across the entire historical data set with the number of sampling points in one quarter (90 points) as the step length.
2. Piecewise linearization of sequence data: the method proposed by the present invention is used to perform piecewise linear fitting on the intercepted window sequence data, and the fitting results for each index are shown in FIG. 3 and FIG. 4. As can be seen from FIG. 3 and FIG. 4, the indexes of the DGA online monitoring data are fitted successfully, and each line segment, formed by connecting two end points, represents all the data points within its span.
3. Constructing a model for describing the similarity of different line segments: and constructing a similarity model based on the slope and the span of the line segment, classifying the line segment by using a K-means clustering algorithm improved based on the maximum and minimum distances, giving symbols to the line segments of the same class, and completing the symbolization of the sequence data.
4. Mining the association between different sequences: after the corresponding frequent item sets are obtained, the association between the two indexes is analyzed with the method proposed by the present invention, using the support and the confidence to characterize the strength of the association. The support and confidence of H2 → CH4 are 0.5050 and 0.6804 respectively, both greater than the corresponding minimum thresholds, so the rule is a strong association rule and a strong association exists between the hydrogen and methane indexes.
5. Extracting and screening the abnormal values in the DGA online monitoring data: with the strong association between hydrogen and methane established, abnormal value detection is performed on the two index sequences. The hydrogen online data is abnormal at sampling points 42 to 54, 85 to 91, and 201 to 206, while the methane online data shows no abnormality in the sampling periods near these points; these abnormal sampling points are therefore judged to be caused by an abnormal operating state of the monitoring device, are assigned to the cleaning data set, and serve as a basis for judging the operating state of the online monitoring device. At sampling points 466 to 471 the methane online monitoring data is abnormal, and at sampling points 466 to 473 the hydrogen online monitoring data is abnormal; since the abnormal periods of the two indexes are close, the index data in the nearby sampling period is retained and marked as abnormal points of the equipment operating state.
6. Optimizing the key parameters of support vector regression with the improved particle swarm algorithm and repairing the screened abnormal value points: to verify the effectiveness of the APSO-SVR algorithm, a section of converter transformer online data recorded under normal operation is intercepted and used to verify the data correction algorithm proposed herein. Regression analysis models are constructed with the ordinary PSO and with the APSO proposed herein, with the DGA online monitoring data as the verification object, hydrogen as the test data set, and the other four gases as the training data set; the optimization processes and regression results of the different models are shown in FIG. 5.
The comparison in FIG. 6 shows that the predictions of the SVR model optimized by the APSO algorithm are closer to the actual values and have a smaller relative prediction error, which demonstrates the effectiveness of the data repair strategy proposed herein. The results of repairing the DGA online monitoring data with the support vector regression algorithm optimized by the improved particle swarm algorithm are shown in FIG. 6. It can be seen that, after being repaired with the method herein by relying on several other characteristic gases, all of the screened data points return to normal levels, and the online data is effectively cleaned.
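The screening rule applied in item 5 of the application case can be sketched as an interval-overlap check. The sketch assumes that the two indexes have already been found strongly associated and that abnormal periods are given as closed sampling-point intervals; the function names, the label strings, and the `tolerance` parameter are hypothetical.

```python
def overlaps(a, b, tolerance=0):
    """True when closed intervals a and b overlap, allowing a gap of up to
    `tolerance` sampling points between them."""
    return a[0] <= b[1] + tolerance and b[0] <= a[1] + tolerance


def classify_anomalies(intervals_a, intervals_b, tolerance=0):
    """Label each abnormal interval of index A.

    'equipment_state': index B is abnormal in a nearby period, so keep the data.
    'monitor_fault'  : only index A is abnormal, so clean and repair the data.
    """
    labelled = []
    for interval in intervals_a:
        concurrent = any(overlaps(interval, other, tolerance)
                         for other in intervals_b)
        labelled.append(
            (interval, "equipment_state" if concurrent else "monitor_fault"))
    return labelled


# Abnormal sampling-point intervals quoted in the application case.
hydrogen = [(42, 54), (85, 91), (201, 206), (466, 473)]
methane = [(466, 471)]
for interval, label in classify_anomalies(hydrogen, methane):
    print(interval, label)
```

Only the intervals labelled `monitor_fault` would be passed on to the repair step; the others are retained as evidence of a change in the equipment state.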

Claims (7)

1. A method for processing the DGA online monitoring data of a converter transformer with the functions of firstly removing the difference and then repairing is characterized by comprising the following steps:
S1. Import the DGA online monitoring data, and set the length and the sliding step length of the sliding window;
S2. Piecewise linearization of sequence data: using the piecewise linearization algorithm for sequence data, group variable numbers of points in the online data according to a model to form multiple sets of data points; the criterion for grouping the data points is that the error between the line segment fitted to all points of a group and the actual data points is less than a threshold; each fitted line segment is represented by its slope and its span;
S3. Construct a model describing the similarity of different line segments: build a similarity model based on the slope and span of the line segments, classify the line segments with a K-means clustering algorithm improved by the maximum-minimum distance method, assign the same symbol to line segments of the same class, and thereby complete the symbolization of the sequence data;
S4. Mine the association among different sequences: based on the Apriori algorithm, set a minimum confidence and a minimum support, mine the frequent item sets existing among different sequences, and quantify the association among the sequences;
S5. Extract and screen the abnormal values in the DGA online monitoring data: according to the strength of the association among the sequences, determine the type of each abnormal value in the data and separate the data belonging to different abnormal modes;
S6. Optimize the key parameters of support vector regression with an improved particle swarm algorithm and repair the screened abnormal value points: define the distance between particles in the solution set, calculate the density of each particle from that distance, and introduce an improved fuzzy inference rule that selects different particle update modes according to the density, so as to improve both the diversity of the solutions and the solving speed of the particle swarm algorithm; then optimize the key parameters of support vector regression with the improved particle swarm algorithm to improve regression accuracy, repair the screened abnormal value points, and complete the processing of the DGA online monitoring data.
2. The method as claimed in claim 1, wherein in step S1, the DGA online monitoring data is imported, the length of the sliding window is set to L, and the sliding step length is set to l; the online data set is traversed with the sliding window: the window is dragged across the entire online monitoring data set in steps of l until all data have been traversed; let the length of the online monitoring data set be L_1; after the traversal, this yields
Figure FDA0003207887470000011
data windows, and the data in all windows are exported to form the data sets to be analyzed, DS_i, i ∈ n.
3. The method for processing the on-line monitoring data of the DGA of the converter transformer with the exception removal and the repair steps as claimed in claim 1, wherein the step S2 provides a piecewise linearization algorithm of the sequence data, which comprises the following specific steps:
S2.1. For the monitoring data X_K = {x_1, x_2, …, x_k}, intercept data points with a window of length L (L < k) and, following the sliding-window idea, perform piecewise linear fitting on the data points contained in the window;
S2.2. Take the first data point in the window as the fitting start point of the initial line segment and denote it x_i; assuming that the fitting end point of the initial line segment is x_{i+m} (m > 1), fit these m + 1 data points to a line segment;
unlike conventional least-squares fitting, the distance from the actual data points to the fitted line segment is used as the fitting error, which improves how accurately the fitted line segment represents the actual data points; let d_n be the straight-line distance from the data point X_n, indexed by its sequence number, to the fitted line segment; the distances from all actual data points within the span of the fitted line segment to that segment are calculated, and their sum is taken as the overall fitting error ER of the segment:
Figure FDA0003207887470000021
Figure FDA0003207887470000022
where X_i denotes the sampled value at time i in the time series, m denotes the number of data points contained in the fitted line segment, and t_n denotes the time step;
S2.3. Set the fitting error threshold to ER_r; if ER < ER_r, the line segment can still take on more fitting points: let m = m + 1 and repeat the above steps; if ER = ER_r, leave m unchanged and take this point as the fitting end point to generate the line segment; if ER > ER_r, the line segment can no longer be fitted: store the fitting end point of the current line segment as X_end = X_{i+m-1}, record its sampling time, return to step S2.2, reset the parameter m, and fit the next part of the data with the current fitting end point as the start point of the next line segment, until all data points in the sequence have been fitted.
4. The method as claimed in claim 1, wherein in step S3, because the different indexes in the DGA online monitoring data differ by orders of magnitude, all line segment triplets in the same sequence undergo a standardization operation of the form
Figure FDA0003207887470000023
during cluster analysis, establishing a standard for measuring the similarity of the line segments; describing the similarity between line segments by using Euclidean distance, wherein the consideration degree of different attributes of the line segments is expressed in a weight mode; the established line segment similarity model is shown as the following formula:
Figure FDA0003207887470000031
in the formula, ds_ij denotes the line segment similarity, and ω_k, ω_m, ω_r denote the weights of the slope, the span, and the growth rate in the line segment similarity model, respectively.
5. The method for processing the on-line monitoring data of the DGA of the converter transformer with the functions of firstly removing the difference and then repairing the converter transformer as claimed in claim 1, wherein in the step S3, the improved K-means algorithm based on the maximum and minimum distances comprises the following main steps:
the maximum-minimum distance method is also based on the Euclidean distance; it differs from the K-means algorithm in that the objects at the greatest distance are taken as cluster centers; for the sample set, a proportion coefficient θ (0 < θ < 1) is given, and an arbitrary sample s_n is taken as the initial cluster center, denoted z_1;
among the remaining n-1 samples, the sample farthest from z_1 is taken as the second cluster center, denoted z_2;
the distances from each of the remaining n-2 samples to z_1 and z_2 are calculated and the smaller of the two is found, namely:
D_{ij} = ||x_i - z_j||, j = 1, 2 (6)
D_i = min(D_{i1}, D_{i2}), i = 1, 2, …, n (7)
if
D_i = max{D_i} > θ × ||z_1 - z_2|| (8)
then the corresponding sample s_i is selected as the third cluster center z_3;
assuming that there are K cluster centers, the distances from the remaining n-K samples to the cluster centers are calculated, and if
D_r = max{min(D_{i1}, D_{i2}, …, D_{iK})} > θ × ||z_1 - z_2|| (9)
then the corresponding sample x_r is the (K+1)-th cluster center, denoted z_{K+1}; this process is repeated until no new cluster center appears.
6. The method for processing the on-line monitoring data of the DGA of the converter transformer with the functions of firstly removing the difference and then repairing the difference as claimed in claim 1, wherein in the step S4 of the present invention, the process of mining the correlation between different sequences is as follows:
S4.1. Set the minimum support and minimum confidence parameters; the confidence and support thresholds are the basis for judging sequence associations and frequent item sets, and suitable threshold parameters help to strengthen the reliability of the association relationships; the minimum support thresholds of the frequent-1 and frequent-2 item sets are denoted min sup_1 and min sup_2, and the minimum confidence threshold in sequence association mining is denoted min con;
S4.2. Generate the frequent item sets; the two symbolized sequences, once aggregated, are used as the transaction set, denoted
Figure FDA0003207887470000041
Wherein
Figure FDA0003207887470000042
All symbol categories corresponding to the two sequences are {A_1, A_2, …, A_CA} and {B_1, B_2, …, B_CB}; based on the basic idea of the Apriori algorithm, the frequent item sets of the sequences are obtained by scanning the transaction set in two stages, and the confidence of each symbol in the sequence is calculated according to formula (10):
Figure FDA0003207887470000043
where X and Y denote the two index objects between which association rules are to be mined, and N_t denotes the number of transactions, that is, the number of elements in the sequence; the support expresses the proportion of the transaction set taken up by an item; when searching for frequent-1 item sets, the items whose support is greater than min sup_1 are placed in the collection of frequent-1 item sets;
the collections of frequent-1 item sets of the two sequences in the association mining are denoted P_A and P_B; the items in these collections are paired according to the index parameters to form 2-item sets of the form (P_Ai, P_Bi); the support of each 2-item set is calculated, and those whose support is greater than min sup_2 are placed in the frequent-2 item set, denoted {P_A, P_B}_freq;
S4.3. Mine the sequence association: combine all the sequences pairwise, and for each pair compute the support of the items in the frequent-2 item sets and the confidence between the corresponding sequences;
first, the supports of all frequent-2 item sets between the two index parameters are accumulated according to the following formula, and the accumulated value is taken as the support of the two parameter sequences among all the multivariate sequences,
Figure FDA0003207887470000044
σ(X_A) = sum(σ(P_A)) (12)
σ(X_B) = sum(σ(P_B)) (13)
where m = CA + CB, CA and CB are the total numbers of line segment categories obtained after cluster analysis of the two sequences, and m is the number of line segment categories after the two sequences are aggregated; the minimum support threshold at the index-sequence level is denoted min sup_3; if the support at the parameter-index level is greater than this threshold, the confidence con(X_A → X_B) of the combination of symbol item sets in the two sequences is calculated as shown in formula (14):
Figure FDA0003207887470000051
when the confidence is greater than the set minimum confidence threshold, the association rule X_A → X_B is retained, the strength of the association between the two indexes is described by the confidence, and the two indexes are judged to be strongly associated.
7. The method for processing the on-line monitoring data of the DGA of the converter transformer repaired after the difference is removed according to claim 1, wherein the step S6 of improving the particle swarm optimization mainly comprises the following steps:
S6.1, defining the number m of variables and generating N m-dimensional particles in the feasible solution space, where S_t is the t-th generation of particles in the iteration, whose elements are
Figure FDA0003207887470000052
wherein each element is expressed as
Figure FDA0003207887470000053
S6.2, determining the inertia weight: the adaptive weight method finds a better balance between global search and local search; the inertia weight is appropriately increased when the target values of all particles tend to be consistent, and appropriately reduced when the target values of the particles are relatively dispersed, with the specific expression:
Figure FDA0003207887470000054
wherein w_a and w_z represent the maximum and minimum values of the inertia weight, and f, f_z and f_{pj} respectively represent the fitness value of the particle, the minimum fitness value of all the particles and the average fitness value of all the particles;
S6.3, defining the input variables of the fuzzy inference rules: the population density of the particles is expressed using the Euclidean distance:
Figure FDA0003207887470000055
from which the formula for the particle density is obtained:
Figure FDA0003207887470000056
wherein n_i is the number of particles in the i-th particle population, N is the number of particles generated in the solution set, and c_i is the particle density; the density and the current iteration number k are normalized and taken as the two input variables of the fuzzy rules, and their membership degrees in the different states are calculated respectively;
S6.4, fuzzy inference rules: the fuzzy sets of the normalized input variables are defined as low density, medium density and high density respectively, and the membership functions are given by the following formulas, where c_1 to c_3 are the interval thresholds of the membership functions,
Figure FDA0003207887470000061
Figure FDA0003207887470000062
Figure FDA0003207887470000063
wherein x can be either of the two input variables, the particle density or the iteration number; the calculated membership degrees of the particle density and of the iteration number are cross-combined to form a particle state fuzzy matrix K of dimension 3×3, and K is multiplied by the vector c_l formed from the membership degrees of the particle density to obtain the probability vector c_b of the particle belonging to the different density intervals:
Figure FDA0003207887470000064
the density interval with the maximum probability is taken as the interval in which the current particle lies, and a different particle updating mode is formulated for each interval;
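The density and membership calculations of S6.3 and S6.4 can be sketched in Python as follows. The equation images are not reproduced in the text, so this is only an illustration under labeled assumptions: the density is taken as c_i = n_i / N with n_i counting the particles within a hypothetical `radius` of particle i, and the low, medium and high membership functions are assumed to be piecewise linear with thresholds c_1 < c_2 < c_3. The construction of the fuzzy matrix K and the product with c_l are omitted because their exact form cannot be recovered from the text.

```python
import math


def particle_density(positions, radius):
    """c_i = n_i / N, where n_i counts particles within `radius` of particle i
    by Euclidean distance (the particle itself included). Assumed form."""
    n_total = len(positions)
    densities = []
    for p in positions:
        n_i = sum(1 for q in positions if math.dist(p, q) <= radius)
        densities.append(n_i / n_total)
    return densities


def memberships(x, c1, c2, c3):
    """Membership of a normalized input x in (low, medium, high), using
    assumed piecewise-linear shapes with thresholds c1 < c2 < c3."""
    if x <= c1:
        low = 1.0
    elif x < c2:
        low = (c2 - x) / (c2 - c1)
    else:
        low = 0.0
    if x <= c1 or x >= c3:
        medium = 0.0
    elif x <= c2:
        medium = (x - c1) / (c2 - c1)
    else:
        medium = (c3 - x) / (c3 - c2)
    if x <= c2:
        high = 0.0
    elif x < c3:
        high = (x - c2) / (c3 - c2)
    else:
        high = 1.0
    return low, medium, high


# Toy swarm (hypothetical): three clustered particles and one isolated particle.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
densities = particle_density(points, radius=1.5)
states = [memberships(c, c1=0.25, c2=0.5, c3=0.75) for c in densities]
```

With these assumed shapes the three membership degrees always sum to 1, so they can be read directly as weights of the low, medium and high states.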
S6.5, particle updating rule: initializing the two learning factors μ_1 and μ_2 of the algorithm;
when c_{bl} is the maximum, the particle searches only toward its own optimal direction, and the velocity is updated as:
Figure FDA0003207887470000065
when c_{bm} or c_{bh} is the maximum, the algorithm searches toward the global optimum and subgroup optimum directions, adopting the same updating mode as the traditional particle swarm algorithm.
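The inertia weight of S6.2 and the state-dependent velocity update of S6.5 can be sketched in Python as follows. The formulas are not reproduced in the text, so the sketch substitutes commonly used forms and labels them as assumptions: `adaptive_weight` is one widely used adaptive inertia weight for minimization, and `update_velocity` uses the conventional particle swarm velocity update, dropping the social term when the low-density state has the largest probability. The subgroup-optimum term mentioned in the claim is omitted for brevity, and all function and parameter names are hypothetical.

```python
import random


def adaptive_weight(f, f_min, f_avg, w_min, w_max):
    """Adaptive inertia weight (assumed common form, minimization):
    particles no worse than average get a weight between w_min and w_max,
    all others (or a fully converged swarm) get w_max."""
    if f <= f_avg and f_avg > f_min:
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    return w_max


def update_velocity(v, x, p_best, g_best, w, mu1, mu2, state, rng=random):
    """One velocity update; `state` is 'low', 'medium' or 'high', the density
    interval with the largest probability for this particle."""
    new_v = []
    for vi, xi, pi, gi in zip(v, x, p_best, g_best):
        cognitive = mu1 * rng.random() * (pi - xi)
        if state == "low":
            # Low density: learn only from the particle's own best position.
            social = 0.0
        else:
            # Medium or high density: conventional social term toward the global best.
            social = mu2 * rng.random() * (gi - xi)
        new_v.append(w * vi + cognitive + social)
    return new_v


# Example with hypothetical fitness values and weight bounds.
w = adaptive_weight(f=2.0, f_min=1.0, f_avg=3.0, w_min=0.4, w_max=0.9)
v_next = update_velocity(
    v=[1.0, -1.0], x=[0.0, 0.0], p_best=[2.0, 1.0], g_best=[4.0, 3.0],
    w=w, mu1=2.0, mu2=2.0, state="low",
)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing the low-density and conventional update modes.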
CN202110922307.7A 2021-08-12 2021-08-12 Converter transformer DGA online monitoring data processing method for firstly removing abnormal state and then repairing Active CN113792754B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110922307.7A CN113792754B (en) 2021-08-12 2021-08-12 Converter transformer DGA online monitoring data processing method for firstly removing abnormal state and then repairing

Publications (2)

Publication Number Publication Date
CN113792754A true CN113792754A (en) 2021-12-14
CN113792754B CN113792754B (en) 2024-08-16

Family

ID=78875928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110922307.7A Active CN113792754B (en) 2021-08-12 2021-08-12 Converter transformer DGA online monitoring data processing method for firstly removing abnormal state and then repairing

Country Status (1)

Country Link
CN (1) CN113792754B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107229916A (en) * 2017-05-27 2017-10-03 南京航空航天大学 Airport noise monitoring data repair method based on deep denoising autoencoder
CN109657847A (en) * 2018-12-06 2019-04-19 华中科技大学 Failure prediction method in industrial production based on particle swarm optimized support vector regression
CN110008253A (en) * 2019-03-28 2019-07-12 浙江大学 Industrial data association rule mining and abnormal operating condition prediction method based on a two-stage frequent item set generation strategy
CN110580328A (en) * 2019-09-11 2019-12-17 江苏省地质工程勘察院 Method for repairing underground water level monitoring value loss
CN111444953A (en) * 2020-03-24 2020-07-24 东南大学 Sensor fault monitoring method based on improved particle swarm optimization algorithm
CN111522804A (en) * 2020-04-15 2020-08-11 国网江苏省电力有限公司 Cleaning method for abnormal data of transformer equipment state monitoring
CN112348084A (en) * 2020-11-08 2021-02-09 大连大学 Unknown protocol data frame classification method for improving k-means
CN112800686A (en) * 2021-03-29 2021-05-14 国网江西省电力有限公司电力科学研究院 Transformer DGA online monitoring data abnormal mode judgment method
CN113157761A (en) * 2020-12-07 2021-07-23 南京邮电大学 Electric vehicle charging and discharging fault analysis method based on association rule mining

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
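何朋飞 et al.: "Application of the APSO_SVR model to soybean price forecasting in China", 何朋飞 et al., 31 July 2017 (2017-07-31), pages 632 - 638 *
徐大明; 周超; 孙传恒; 杜永贵: "Prediction model of aquaculture water temperature and pH based on a particle swarm optimized BP neural network", Fishery Modernization, no. 01, 20 February 2016 (2016-02-20), pages 30 - 35 *
陈敬龙 et al.: "Outlier identification in gathering and transportation system pressure monitoring based on adaptive support vector regression", Oil-Gas Field Surface Engineering, 20 February 2017 (2017-02-20), pages 6 - 9 *
陈海燕; 杜婧涵; 张魏宁: "Monitoring data repair method based on a deep denoising autoencoder network", Systems Engineering and Electronics, no. 02, 31 December 2018 (2018-12-31), pages 196 - 201 *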

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114372093A (en) * 2021-12-15 2022-04-19 南昌大学 Processing method of DGA (dissolved gas analysis) online monitoring data of transformer
CN113987033A (en) * 2021-12-28 2022-01-28 国网江西省电力有限公司电力科学研究院 Main transformer online monitoring data group deviation identification and calibration method
CN113987033B (en) * 2021-12-28 2022-04-12 国网江西省电力有限公司电力科学研究院 Main transformer online monitoring data group deviation identification and calibration method
CN114756557A (en) * 2022-06-15 2022-07-15 广州晨安网络科技有限公司 Data processing method of improved computer algorithm model
CN116484307A (en) * 2023-06-21 2023-07-25 深圳市魔样科技有限公司 Cloud computing-based intelligent ring remote control method
CN116484307B (en) * 2023-06-21 2023-09-19 深圳市魔样科技有限公司 Cloud computing-based intelligent ring remote control method
CN116776258A (en) * 2023-08-24 2023-09-19 北京天恒安科集团有限公司 Power equipment monitoring data processing method and system
CN116776258B (en) * 2023-08-24 2023-10-31 北京天恒安科集团有限公司 Power equipment monitoring data processing method and system
CN117992895A (en) * 2024-04-03 2024-05-07 西安寰宇管道工程技术有限公司 Oil and gas pipeline area risk monitoring method and system based on big data
CN117992895B (en) * 2024-04-03 2024-06-07 西安寰宇管道工程技术有限公司 Oil and gas pipeline area risk monitoring method and system based on big data

Also Published As

Publication number Publication date
CN113792754B (en) 2024-08-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 330096 No. 88, Minqiang Road, private science and Technology Park, Qingshanhu District, Nanchang City, Jiangxi Province

Applicant after: STATE GRID JIANGXI ELECTRIC POWER COMPANY LIMITED Research Institute

Applicant after: STATE GRID CORPORATION OF CHINA

Applicant after: STATE GRID JIANGXI ELECTRIC POWER Co.,Ltd.

Applicant after: Nanchang University

Address before: 330096 No.88 Minqiang Road, private science and Technology Park, high tech Zone, Nanchang City, Jiangxi Province

Applicant before: STATE GRID JIANGXI ELECTRIC POWER COMPANY LIMITED Research Institute

Applicant before: STATE GRID CORPORATION OF CHINA

Applicant before: STATE GRID JIANGXI ELECTRIC POWER Co.,Ltd.

Applicant before: Nanchang University

GR01 Patent grant