CN114372093A

CN114372093A - Method for processing transformer DGA (dissolved gas analysis) online monitoring data

Info

Publication number: CN114372093A
Application number: CN202111534103.2A
Authority: CN
Inventors: 朱自伟; 张益宁; 周梦垚; 谢青; 徐松龄; 翟嘉璐; 王梦宇
Original assignee: Nanchang University
Current assignee: Nanchang University
Priority date: 2021-12-15
Filing date: 2021-12-15
Publication date: 2022-04-19

Abstract

The invention proposes a method for processing transformer DGA online monitoring data. Based on the characteristics of the returned data, the online data are treated as time series. In the first stage, the idea of the sliding window algorithm is introduced and an improved piecewise linearization algorithm for sequences is proposed; the algorithm divides the sequence data into line segments characterized by slope and span, and an improved K-means clustering is then used to symbolize the online monitoring data. Finally, the Apriori algorithm is used to mine the correlation between different DGA indicators, and abnormal values are screened out. In the second stage, for the sampling points screened out as abnormal, a particle swarm optimization algorithm improved to preserve solution speed and solution diversity optimizes the key parameters of a support vector regression algorithm, which repairs these sampling points and completes the processing of the transformer online DGA monitoring data.

Description

A method of processing transformer DGA online monitoring data

Technical Field

The invention relates to a method for processing transformer DGA online monitoring data and belongs to the field of data cleaning for power equipment.

Background

Power transformers are hub equipment for the conversion and transmission of electric energy, and their safe and stable operation is an important guarantee of the quality of power supplied to users. The online DGA indicator data of a transformer provide real-time monitoring of the insulation performance of the equipment; analysis of the oil chromatography data quickly yields the real-time state of the transformer. DGA data also contain many indicator dimensions, and mining the association relationships among these indicators helps to identify data with different abnormal patterns in the online data and increases the credibility of the comprehensive state evaluation of the equipment.

Because of the operating environment of the equipment and electromagnetic interference from the transformer itself, the online monitoring device is prone to randomly distributed abnormal value points during data acquisition and transmission, and in severe cases to data drift and transmission interruption. Obvious anomalies such as data drift and data interruption can be identified quickly by the back-end system, which raises an alarm. Abnormal value points randomly distributed within normal online data, however, seriously interfere with the real-time characterization of equipment state indicators and affect indicator-based state evaluation; they easily cause false or erroneous reports of abnormal equipment states and waste operation and maintenance resources.

Power transformers are important equipment for ensuring the stable operation of the transmission and distribution network, and monitoring data of the transformer core grounding current are an important basis for assessing the state of a transformer. Monitoring data over a period of time, including the overall trend, the extreme points and jump points within it, and the statistical characteristics of the data, can reflect possible internal abnormalities of the power transformer from several aspects.

After long-term operation of power equipment, a large volume of indicator data has been stored in power databases, and it inevitably contains indicator data with different abnormal patterns. Association analysis of the existing indicator data mines the association relationships it contains; analyzing data with different abnormal patterns on the basis of these relationships and repairing them effectively helps to improve the comprehensive state evaluation system for power equipment, detect abnormal equipment states early, improve maintenance efficiency, and reduce operation and maintenance costs.

Summary of the Invention

The purpose of the present invention is to provide a method for processing transformer DGA online monitoring data that solves the problems described in the background above.

The present invention is realized by the following technical solution: a method for processing transformer DGA online monitoring data, comprising the following steps:

S1. Sliding window processing of the data set: the idea of a sliding window is introduced, and a window of length L is used to intercept the online data set;

S2. Traversing the online data set with the sliding window at a fixed step size: the sliding step size is set to l, and the window is slid over the whole data set until all data have been traversed; with the length of the online data set denoted L₁, the traversal yields n data windows, and the data in all windows are exported to form the data sets to be analyzed, DS_i, i∈n;

S3. Piecewise linearization of sequence data: a piecewise linearization algorithm for sequence data is proposed, which groups a variable number of points in the online data together to form multiple sets of data points; the grouping criterion is that the error between the line segment fitted to all points in a group and the actual data points is less than a threshold, and each fitted line segment is characterized by its slope and span;

S4. Building a model that describes the similarity of line segments: a similarity model is built from the slope and span of the line segments, a K-means clustering algorithm improved with the max-min distance method divides the line segments into categories, and a symbol is assigned to each category of line segment, completing the symbolization of the sequence data;

S5. Mining the correlation between different sequences: following the idea of the Apriori algorithm, minimum confidence and minimum support are set, the frequent itemsets existing between different sequences are mined, and the correlation between different sequences is quantified;

S6. Extracting and screening out abnormal values in the DGA online monitoring data: according to the strength of the correlation between sequences, the types of abnormal values present in the data are determined, and data with different abnormal patterns are separated;

S7. Support vector regression optimized by improved particle swarm optimization: the distance between particle solutions is defined, the density around each particle is calculated from this distance, and the update rule of each particle is defined according to its density; the algorithm is used to optimize the key parameters of the support vector regression, completing the processing of the DGA online data.
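
As an illustration of steps S1 and S2, the window interception can be sketched in Python as follows. This is a minimal sketch rather than the implementation of the invention: the function name `sliding_windows` and the choice to drop a trailing partial window are assumptions, and under that choice the number of windows is floor((L₁ - L)/l) + 1.

```python
from typing import List, Sequence


def sliding_windows(data: Sequence[float], window_len: int, step: int) -> List[List[float]]:
    """Cut `data` into windows of length `window_len`, advancing by `step` samples."""
    if window_len <= 0 or step <= 0:
        raise ValueError("window_len and step must be positive")
    windows = []
    start = 0
    # Assumption: only full-length windows are kept; a trailing partial window is dropped.
    while start + window_len <= len(data):
        windows.append(list(data[start:start + window_len]))
        start += step
    return windows


# Ten samples, window length L = 4, step l = 2.
example_windows = sliding_windows(list(range(10)), window_len=4, step=2)
```

Each returned window plays the role of one data set DS_i that the later steps analyze independently.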

Further, the specific steps of the piecewise linearization algorithm for sequence data proposed in S3 are:

1) Online monitoring data of equipment indicators such as DGA are treated as time series data;

2) For the time series X_K = {x₁, x₂, …, x_k}, data points are intercepted with a window of length L (L < k), and, following the sliding window idea, piecewise linear fitting is performed on the data points within the intercepted window;

3) The first data point in the window is taken as the fitting start point of the initial line segment and denoted x_i; assuming the fitting end point of the initial line segment is x_{i+m} (m > 1), these m+1 data points are fitted as one line segment;

4) Such a line segment is expressed by the following formula:

$$m y - (X_{i+m-1} - X_i) X - (m-1) X_i + X_{i+m-1} = 0 \tag{2}$$

The distance from an actual data point to the fitted line segment is taken as the fitting error; the distances from all actual data points within the span of the fitted line segment to the segment are calculated, and their sum is taken as the overall fitting error ER of the segment.

5) The fitting error threshold is set to ER_r. If ER < ER_r, the line segment can still accept more fitting points: let m = m+1 and repeat the steps above. If ER > ER_r, the line segment is judged unable to fit the additional point; the fitting end point of the current segment is saved as X_end = X_{i+m-1} and its sampling time is recorded. The procedure then returns to step 3), resets the parameter m, and takes the current fitting end point as the fitting start point of the next segment, continuing until all data points in the sequence have been fitted.
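
The segmentation loop of steps 3) to 5) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: samples are assumed to be equally spaced (the sample index serves as the time axis), each segment is the chord joining its first and last points, the fitting error is the sum of perpendicular distances to that chord, and a segment is extended while the error does not exceed the threshold. The names `fit_error` and `segment_series` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[int, int, float, int]  # (start index, end index, slope, span)


def fit_error(x: Sequence[float], i: int, j: int) -> float:
    """Sum of perpendicular distances from points i..j to the chord joining x[i] and x[j]."""
    slope = (x[j] - x[i]) / (j - i)
    norm = math.hypot(slope, 1.0)
    return sum(abs(x[k] - (x[i] + slope * (k - i))) / norm for k in range(i, j + 1))


def segment_series(x: Sequence[float], err_threshold: float) -> List[Segment]:
    """Greedy piecewise linearization: grow each segment until the error threshold is exceeded."""
    segments = []
    n = len(x)
    i = 0
    while i < n - 1:
        j = i + 1  # a two-point segment always fits exactly
        # Extend the segment while the accumulated error stays within the threshold.
        while j + 1 < n and fit_error(x, i, j + 1) <= err_threshold:
            j += 1
        segments.append((i, j, (x[j] - x[i]) / (j - i), j - i))
        i = j  # the current end point becomes the start point of the next segment
    return segments


# A rising ramp followed by a falling ramp yields two segments.
example_segments = segment_series([0, 1, 2, 3, 2, 1, 0], err_threshold=0.1)
```

Each segment is reduced to its slope and span, which are the two attributes used by the similarity model in S4.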

Further, the main steps of building the similarity model in S4 and performing cluster analysis based on it are:

1) A standardization operation is applied to the attributes of all line segments in the same sequence;

2) For cluster analysis, a criterion for measuring the similarity of line segments is established: the two key parameters of a line segment, slope and span, are extracted, the Euclidean distance is used to describe the similarity between line segments, and weights express the degree to which each attribute is taken into account;

3) Based on this line segment similarity model, the set of line segments is clustered with the K-means algorithm improved with the max-min distance method, and similar line segments are assigned to the same category.
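
A weighted Euclidean similarity of this kind can be sketched as below. The sketch rests on assumptions that the text does not confirm: z-score standardization is used for step 1), the distance takes the form sqrt(w_slope·Δslope² + w_span·Δspan²), and the weight values 0.7 and 0.3 are placeholders. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score standardization (assumed form); a constant column maps to zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def segment_distance(a: Tuple[float, float], b: Tuple[float, float],
                     w_slope: float = 0.7, w_span: float = 0.3) -> float:
    """Weighted Euclidean distance between two (slope, span) pairs; smaller means more similar."""
    return math.sqrt(w_slope * (a[0] - b[0]) ** 2 + w_span * (a[1] - b[1]) ** 2)


# Standardize slopes and spans separately, then compare the first two segments.
slopes = standardize([1.0, -1.0, 0.5])
spans = standardize([3.0, 3.0, 6.0])
features = list(zip(slopes, spans))
d01 = segment_distance(features[0], features[1])
```

Giving the slope a larger weight than the span reflects the statement that the trend of a parameter is the attribute that best reflects a change in equipment state; the actual weights would have to be tuned.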

Further, the main steps of the K-means algorithm improved with the max-min distance method in S4 are:

1) The max-min distance method is also based on the Euclidean distance; it differs from the K-means algorithm in that it takes objects as far apart as possible as cluster centers. For the sample set, a proportional coefficient θ (0 < θ < 1) is given, and any sample in the sample set s_n is taken as the initial cluster center, denoted z₁;

2) Among the remaining n-1 samples, the sample farthest from z₁ is taken as the second cluster center, denoted z₂;

3) The distances from each of the remaining n-2 samples to z₁ and z₂ are calculated, and the smaller of the two is taken, namely:

$$D_{ij} = \lVert x_i - z_j \rVert,\quad j = 1, 2 \tag{6}$$

$$D_i = \min(D_{i1}, D_{i2}),\quad i = 1, 2, \ldots, n \tag{7}$$

4) If

$$\max_i \{D_i\} > \theta \times \lVert z_1 - z_2 \rVert \tag{8}$$

then the sample attaining this maximum is selected as the third cluster center z₃;

5) Assuming there are K cluster centers, the distances from the remaining n-K samples to the cluster centers are calculated, and if

$$D_r = \max_i \{\min(D_{i1}, D_{i2}, \ldots, D_{iK})\} > \theta \times \lVert z_1 - z_2 \rVert \tag{9}$$

then the corresponding sample x_r is the (K+1)-th cluster center, denoted z_{K+1}; this process is repeated until no new cluster center appears;

6) When no new cluster center appears, the samples are assigned to the classes by the minimum distance principle.
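
Steps 1) to 6) can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the function names are hypothetical, the first center is passed in by index rather than chosen arbitrarily, and samples are plain tuples compared with `math.dist`.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def maxmin_centers(samples: Sequence[Point], theta: float, first: int = 0) -> List[int]:
    """Return the indices of the cluster centers chosen by the max-min distance rule."""
    if len(samples) < 2:
        return list(range(len(samples)))
    # Step 2: the second center is the sample farthest from the first one.
    second = max(range(len(samples)), key=lambda i: math.dist(samples[i], samples[first]))
    centers = [first, second]
    ref = math.dist(samples[first], samples[second])  # ||z1 - z2||
    while True:
        # Steps 3-5: for every non-center sample take the distance to its nearest center,
        # then pick the sample for which that distance is largest.
        best_idx, best_d = None, -1.0
        for i, s in enumerate(samples):
            if i in centers:
                continue
            d = min(math.dist(s, samples[c]) for c in centers)
            if d > best_d:
                best_idx, best_d = i, d
        if best_idx is None or best_d <= theta * ref:
            break  # no new cluster center appears
        centers.append(best_idx)
    return centers


def assign(samples: Sequence[Point], centers: Sequence[int]) -> List[int]:
    """Step 6: assign every sample to its nearest center (minimum distance principle)."""
    return [min(range(len(centers)), key=lambda k: math.dist(s, samples[centers[k]]))
            for s in samples]


pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0), (5.0, 5.0)]
example_centers = maxmin_centers(pts, theta=0.5)
example_labels = assign(pts, example_centers)
```

For a fixed first center the selected centers are fully determined by the data and θ, so the number of clusters does not have to be specified in advance.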

Further, the main process of sequence correlation mining in S5 is:

1) Setting the minimum support and minimum confidence parameters: the confidence and support thresholds are the basis for determining sequence associations and frequent itemsets. The minimum support thresholds for frequent-1 and frequent-2 itemsets are denoted minsup₁ and minsup₂, and the minimum confidence threshold in sequence association mining is denoted mincon;

2) Generation of frequent itemsets: the two symbolized sequences, after summarization, are used as the transaction sets (referred to below as X^A and X^B). All symbol categories corresponding to the two sequences are {A₁, A₂, …, A_CA} and {B₁, B₂, …, B_CB}. Following the basic idea of the Apriori algorithm, the frequent itemsets of the sequences are obtained by a two-stage scan of the transaction sets; the support of each symbol in a sequence is calculated according to formula (10).

In formula (10), N^t is the number of transactions, that is, the number of elements in the sequence, and the support is the proportion of the transaction set in which an item occurs. When mining frequent-1 itemsets, items whose support is greater than minsup₁ are assigned to the set of frequent-1 itemsets;

Let the sets of frequent-1 itemsets of the two sequences in association mining be P_A and P_B. According to the indicator parameters, the items in the two sets are paired to form 2-itemsets of the form (P_Ai, P_Bi); the support of each such 2-itemset is calculated, and those with support greater than minsup₂ are assigned to the frequent-2 itemsets, denoted {P_A, P_B}_freq;

3) Mining of sequence correlation: all sequences are combined in pairs, and for each pair the support of the items in the frequent-2 itemsets and the confidence between the corresponding sequences are counted;

According to formula (11), the supports of all frequent-2 itemsets between the two indicator parameters are accumulated, and the sum is used as the support count of these two parameter sequences among all the multivariate sequences;

$$\sigma(X^A) = \operatorname{sum}(\sigma(P_A)) \tag{12}$$

$$\sigma(X^B) = \operatorname{sum}(\sigma(P_B)) \tag{13}$$

where m = CA + CB is the total number of line segment categories obtained from the cluster analysis of the two sequences. The minimum support threshold at the indicator sequence level is denoted minsup₃; if the support at the parameter indicator level is greater than this threshold, the confidence con(X^A→X^B) of the symbol itemset combination in the two sequences is calculated according to formula (14).

When the confidence is greater than the minimum confidence threshold, the association rule X^A→X^B is retained; the confidence describes the strength of the association between the two indicators, and the two indicators are judged to be strongly associated.
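
The two-stage scan can be sketched as below for two aligned symbol sequences. This is a sketch under assumptions, since formulas (10), (11) and (14) are not reproduced in this text: support is taken as occurrence count divided by the number of positions, each position of the aligned sequences is one transaction, and the sequence-level confidence is assumed to be the accumulated frequent-2 support divided by σ(X^A) from formula (12). The function name `mine_pair` is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def mine_pair(seq_a: Sequence[Hashable], seq_b: Sequence[Hashable],
              minsup1: float, minsup2: float
              ) -> Tuple[Dict[Tuple[Hashable, Hashable], float], float, float]:
    """Return (frequent-2 itemsets with support, accumulated support, confidence A -> B)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_a)
    # First scan: support of every symbol, then keep the frequent-1 items.
    sup_a = {s: c / n for s, c in Counter(seq_a).items()}
    sup_b = {s: c / n for s, c in Counter(seq_b).items()}
    freq_a = {s for s, v in sup_a.items() if v > minsup1}
    freq_b = {s for s, v in sup_b.items() if v > minsup1}
    # Second scan: support of symbol pairs built only from frequent-1 items.
    pair_counts = Counter(zip(seq_a, seq_b))
    freq2 = {p: c / n for p, c in pair_counts.items()
             if p[0] in freq_a and p[1] in freq_b and c / n > minsup2}
    sigma_ab = sum(freq2.values())             # accumulated frequent-2 support
    sigma_a = sum(sup_a[s] for s in freq_a)    # sigma(X^A), formula (12)
    confidence = sigma_ab / sigma_a if sigma_a else 0.0  # assumed form of formula (14)
    return freq2, sigma_ab, confidence


# "u" = rising, "d" = falling, "s" = steady (hypothetical symbols).
freq2, sigma_ab, conf_ab = mine_pair(list("uuddus"), list("uuddud"), 0.2, 0.2)
```

A rule X^A→X^B would be retained when the returned confidence exceeds mincon.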

Further, the support vector regression optimized by improved particle swarm optimization in S7 is mainly as follows: the vacant value points left by the deletion of abnormal values are repaired with a support vector regression algorithm optimized by improved particle swarm optimization; the main steps are:

1) Specify the number of variables m and generate N m-dimensional particles in the feasible solution space; S^t is the t-th generation of particles in the iteration.

2) Determine the inertia weight. In its expression, w_a and w_z are the maximum and minimum values of the inertia weight, and f, f_z, and f_pj are the fitness value of the particle, the minimum fitness value of all particles, and the average fitness value of all particles, respectively.

3) Divide the particle population into types. The distance between particles is expressed as the Euclidean distance, and a standard distance is defined in terms of a dividing radius r. The density c_i of particle i is then calculated from n_i, the number of particles in the community of particle i, and N, the number of particles in the generated solution set.

4) According to the category of population to which a particle belongs, the two learning factors μ₁ and μ₂ of the algorithm are initialized. One update rule is applied when the particle density c_i is greater than a set threshold, and a different update rule is applied when c_i is less than the threshold.
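
The density computation of step 3) can be sketched as follows. The sketch assumes that the community of particle i consists of all particles within the dividing radius r of it (including the particle itself) and that the density is c_i = n_i / N; neither detail is confirmed by the text, and the function names are hypothetical. The update rules of step 4) are not sketched here.

```python
import math
from typing import List, Sequence, Tuple


def particle_density(particles: Sequence[Tuple[float, ...]], radius: float) -> List[float]:
    """Density c_i = n_i / N, with n_i the number of particles within `radius` of particle i."""
    n = len(particles)
    return [sum(1 for q in particles if math.dist(p, q) <= radius) / n for p in particles]


def split_by_density(densities: Sequence[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Indices of particles in dense regions and in sparse regions."""
    dense = [i for i, c in enumerate(densities) if c > threshold]
    sparse = [i for i, c in enumerate(densities) if c <= threshold]
    return dense, sparse


swarm = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
example_density = particle_density(swarm, radius=0.5)
dense_idx, sparse_idx = split_by_density(example_density, threshold=0.5)
```

Particles in crowded regions and isolated particles can then be given different update rules, which is how the method aims to keep both solution speed and solution diversity.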

The beneficial effects of the present invention are:

By mining the correlation between different transformer oil chromatography indicators through sequence segmentation and association analysis, the method distinguishes abnormal points in transformer DGA online monitoring data and repairs them with a regression algorithm, effectively increasing the processing speed of transformer DGA online monitoring data.

Description of the Drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a flow chart of the sequence segmentation algorithm;

Fig. 3 is a flow chart of the improved particle swarm algorithm;

Fig. 4 is a fitting comparison chart for the hydrogen indicator;

Fig. 5 is a fitting comparison chart for the methane indicator;

Fig. 6 shows the abnormal points detected in the hydrogen and methane sequences;

Fig. 7 shows the data repair results.

Detailed Description

The method for processing transformer DGA online monitoring data of the present invention is described in detail below with reference to the embodiments and the accompanying drawings.

A method for processing transformer DGA online monitoring data, as shown in Fig. 1, includes the following steps:

S1. Importing the DGA online data and setting the basic parameters of the sliding window algorithm: the value of online monitoring data lies in reflecting equipment indicators in real time. After long-term operation of the equipment, the online data set is generally very large, and analyzing the data set as a whole is too complex to be feasible. Online data are also time-sensitive: when a sampling point is analyzed, sampling points closer to it matter more to the analysis, and more distant ones matter less. The invention introduces the idea of a sliding window, uses a window of length L to intercept the online data set, and analyzes the data within the window to reduce the complexity of the process.

S2. Traversing the online data set with the sliding window at a fixed step size: the sliding step size is set to l, and the window is slid over the whole data set until all data have been traversed; with the length of the online data set denoted L₁, the traversal yields n data windows, and the data in all windows are exported to form the data sets to be analyzed, DS_i, i∈n.

S3. Piecewise linearization of sequence data: online data are usually numerical variables, which are not suited to association mining of sequence data. The invention proposes a piecewise linearization algorithm for sequence data, which groups a variable number of points in the online data together according to the model to form multiple sets of data points; the grouping criterion is that the error between the line segment fitted to all points in a group and the actual data points is less than a threshold, and each fitted line segment is characterized by its slope and span.

S4. Building a model that describes the similarity of line segments: a similarity model is built from the slope and span of the line segments, a K-means clustering algorithm improved with the max-min distance method divides the line segments into categories, and a symbol is assigned to each category of line segment, completing the symbolization of the sequence data.

S5. Mining the correlation between different sequences: following the idea of the Apriori algorithm, minimum confidence and minimum support are set, the frequent itemsets existing between different sequences are mined, and the correlation between different sequences is quantified.

S6. Extracting and screening out abnormal values in the DGA online monitoring data: according to the strength of the correlation between sequences, the types of abnormal values present in the data are determined, and data with different abnormal patterns are separated.

S7. Support vector regression optimized by improved particle swarm optimization: the distance between particle solutions is defined, the density around each particle is calculated from this distance, and the update rule of each particle is defined according to its density, in order to improve the solution speed and solution diversity of the algorithm; the algorithm is used to optimize the key parameters of the support vector regression, improving the regression accuracy and completing the processing of the DGA online data.

The object studied by the method of the present invention is the DGA online monitoring data of a main transformer.

As shown in Fig. 2, the specific steps of the piecewise linearization algorithm for sequence data proposed in S3 are:

1) Online monitoring data of equipment indicators such as DGA are in essence state indicator values collected one after another at a certain time interval. The data therefore have a strong temporal attribute and can be treated as time series data.

2) For the time series X_K = {x₁, x₂, …, x_k}, data points are intercepted with a window of length L (L < k), and, following the sliding window idea, piecewise linear fitting is performed on the data points within the intercepted window.

3) The first data point in the window is taken as the fitting start point of the initial line segment and denoted x_i; assuming the fitting end point of the initial line segment is x_{i+m} (m > 1), these m+1 data points are fitted as one line segment.

4) Such a line segment can be expressed by formula (2).

Taking the distance from an actual data point to the fitted line segment as the fitting error improves the accuracy with which the fitted segment matches the actual value points; the distances from all actual data points within the span of the fitted line segment to the segment are calculated, and their sum is taken as the overall fitting error ER of the segment.

The main steps of building the similarity model in S4 and performing cluster analysis based on it are:

1) Because different indicators in DGA online monitoring differ in order of magnitude, a standardization operation is first applied to the attributes of all line segments in the same sequence.

2) For cluster analysis, a criterion for measuring the similarity of line segments must be established. DGA online data reflect real-time equipment indicators, and the trend and shape of the parameter changes best reflect changes in the operating state of the equipment; the different attributes of a line segment therefore need to be weighed differently when the similarity model is built. The invention extracts the two key parameters of a line segment, slope and span, uses the Euclidean distance to describe the similarity between line segments, and uses weights to express the degree to which each attribute is taken into account.

The main steps of the K-means algorithm improved with the max-min distance method in S4 are:

$$D_{ij} = \lVert x_i - z_j \rVert,\quad j = 1, 2 \tag{6}$$

$$D_i = \min(D_{i1}, D_{i2}),\quad i = 1, 2, \ldots, n \tag{7}$$

4) If

$$\max_i \{D_i\} > \theta \times \lVert z_1 - z_2 \rVert \tag{8}$$

then the sample attaining this maximum is selected as the third cluster center z₃.

6) When no new cluster center appears, the samples are assigned to the classes by the minimum distance principle. The advantage of the K-means clustering algorithm improved with the max-min distance method is that the cluster centers are the same in every run of the cluster analysis; this removes the randomness with which the traditional K-means algorithm selects its cluster centers and can improve the accuracy and speed of the cluster analysis.
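
The benefit of fixed initial centers can be illustrated with a short sketch in which K-means iterations start from given seed centers and each resulting class is mapped to a symbol. The sketch assumes that the usual K-means refinement (reassign, then recompute means) follows the max-min seeding, which the text does not spell out; the function name `kmeans_from_seeds` and the symbol prefix "A" are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans_from_seeds(samples: Sequence[Point], seeds: Sequence[Point],
                      max_iter: int = 100) -> Tuple[List[Point], List[str]]:
    """Run K-means from fixed seed centers and return (centers, symbol per sample)."""
    centers = [tuple(s) for s in seeds]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        new_labels = [min(range(len(centers)), key=lambda k: math.dist(s, centers[k]))
                      for s in samples]
        if new_labels == labels:
            break  # assignments are stable, so the centers no longer move
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in range(len(centers)):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    symbols = [f"A{lab + 1}" for lab in labels]  # one symbol per class, e.g. A1, A2, ...
    return centers, symbols


segs = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
example_centers, example_symbols = kmeans_from_seeds(segs, seeds=[segs[0], segs[3]])
```

Because the seeds are determined by the data rather than drawn at random, repeated runs on the same window produce the same symbol sequence.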

The main process of sequence correlation mining in S5 is:

1) Setting the minimum support and minimum confidence parameters: the confidence and support thresholds are the basis for determining sequence associations and frequent itemsets, and suitable thresholds increase the credibility of the association relationships. The minimum support thresholds for frequent-1 and frequent-2 itemsets are denoted minsup₁ and minsup₂, and the minimum confidence threshold in sequence association mining is denoted mincon.

2) Generation of frequent itemsets: the two symbolized sequences, after summarization, are used as the transaction sets (referred to below as X^A and X^B). All symbol categories corresponding to the two sequences are {A₁, A₂, …, A_CA} and {B₁, B₂, …, B_CB}. Following the basic idea of the Apriori algorithm, the invention obtains the frequent itemsets of the sequences by a two-stage scan of the transaction sets; the support of each symbol in a sequence is calculated according to formula (10).

In formula (10), N^t is the number of transactions, that is, the number of elements in the sequence, and the support is the proportion of the transaction set in which an item occurs. When mining frequent-1 itemsets, items whose support is greater than minsup₁ are assigned to the set of frequent-1 itemsets.

$$\sigma(X^A) = \operatorname{sum}(\sigma(P_A)) \tag{12}$$

$$\sigma(X^B) = \operatorname{sum}(\sigma(P_B)) \tag{13}$$

The support vector regression optimized by improved particle swarm optimization in S7 is mainly as follows: for the vacant value points left by the deletion of abnormal values, the invention proposes a support vector regression algorithm optimized by improved particle swarm optimization to repair them. As shown in Fig. 3, the main steps are:

1) Specify the number of variables m and generate N m-dimensional particles in the feasible solution space; S^t is the t-th generation of particles in the iteration.

2) Determine the inertia weight.

3) A standard distance is defined in terms of a dividing radius r.

4) The two learning factors μ₁ and μ₂ of the algorithm are initialized; the particle update rule depends on whether the particle density c_i is greater than a set threshold.
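
For the inertia weight of step 2), a commonly used adaptive form that involves the same quantities (w_a, w_z, f, f_z, f_pj) is sketched below. It is an assumption, not the expression of the invention, which is not reproduced in this text: for a minimization problem, particles with fitness no worse than the average receive a weight interpolated between the minimum and maximum, and all other particles receive the maximum weight. The function name and the default values 0.4 and 0.9 are placeholders.

```python
def adaptive_inertia(f: float, f_min: float, f_avg: float,
                     w_min: float = 0.4, w_max: float = 0.9) -> float:
    """Adaptive inertia weight (assumed form) for a minimization problem.

    f     -- fitness of the current particle
    f_min -- minimum fitness over all particles (f_z in the text)
    f_avg -- average fitness over all particles (f_pj in the text)
    """
    if f <= f_avg and f_avg > f_min:
        # Better-than-average particles: small weight near the best, for local refinement.
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    # Worse-than-average particles (or a degenerate swarm): large weight, for wider search.
    return w_max


w_best = adaptive_inertia(1.0, f_min=1.0, f_avg=2.0)
w_mid = adaptive_inertia(1.5, f_min=1.0, f_avg=2.0)
w_poor = adaptive_inertia(3.0, f_min=1.0, f_avg=2.0)
```

With this form the best particle moves with the smallest inertia while poorly placed particles keep exploring, which is consistent with the stated aim of balancing solution speed and diversity.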

A specific example is given below:

A method for processing transformer DGA online monitoring data, with the following steps:

S1. Sliding window processing of the data set: after years of operation, the DGA online monitoring data of a power transformer are usually large in scale, and processing the whole data set at once increases the complexity of the algorithm and the load on the server, so its feasibility is low. A DGA online data processing method based on the sliding window idea is proposed: a data window of length L is established and used to intercept data from the data set.

S2. Intercepting the data set with a fixed step size: the data window is slid with step size l over the online monitoring data set of length L₁, which extracts a series of data windows. The window data are exported to give the set of data windows to be processed, {DS_i}, i ∈ n, and the data window serves as the basic unit of analysis in all subsequent processing.

S3. Piecewise linearization of the sequence data in a window: for each intercepted data window W_i, the sequence data corresponding to each DGA monitoring indicator are extracted. This example studies two gases in the DGA, H₂ and CH₄, so two sequences are obtained from data window W_i, and each sequence is piecewise linearized.

S4. Cluster analysis of the line segment set: for the line segment set expressed in array form, this example builds a model ds_ij of line segment similarity from the segment parameters using the Euclidean distance. Based on this similarity model, the K-means clustering algorithm improved with the maximum-minimum distance is used to cluster the line segment set: line segments with high similarity are merged into one category and each category is assigned a symbol, which completes the symbolization of the sequence data.

S5. Mining the association between sequences: for the two sequences on which the summarization has been completed, following the idea of the Apriori algorithm, the minimum support thresholds minsup_i set for the different levels and the minimum confidence threshold mincon at the indicator level are used to mine the frequent itemsets that exist between the sequences step by step, and finally to determine how strong the association between the indicators is.

S6. Extraction and removal of abnormal data based on the association: according to the association between the sequences, the invalid abnormal data present in them are identified and filtered out.

S7. Data repair: the improved particle swarm optimization support vector regression is used to repair the DGA online monitoring data, which completes the processing of the DGA online data.
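The window extraction described in S1 and S2 can be sketched in Python as follows. This is a minimal illustration, not the implementation of the invention: the function name `sliding_windows` and its parameter names are hypothetical, and the sketch assumes that only complete windows of length L are kept, a point the text does not settle.

```python
from typing import List, Sequence


def sliding_windows(data: Sequence[float], window_len: int, step: int) -> List[List[float]]:
    """Cut `data` into windows of length `window_len`, advancing by `step` samples."""
    if window_len <= 0 or step <= 0:
        raise ValueError("window_len and step must be positive")
    windows = []
    start = 0
    # Assumption: a trailing partial window is discarded, so every
    # returned window holds exactly `window_len` samples.
    while start + window_len <= len(data):
        windows.append(list(data[start:start + window_len]))
        start += step
    return windows
```

For ten samples, a window length of 4 and a step of 3, the function returns three windows starting at indices 0, 3 and 6. Choosing a step smaller than the window length makes consecutive windows overlap.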

The hydrogen and methane indicators in the historical DGA online monitoring data of a main transformer are taken as the object of study, and the method proposed in the present invention is used to fit the window sequence data above by piecewise linearization. Note that the indicators differ in order of magnitude, so an appropriate fitting error threshold should be chosen for each indicator when the piecewise linear fit is performed. The fitting results for the two indicators are shown in Figures 4 and 5.

The fitting results demonstrate the feasibility of the proposed piecewise linearization algorithm for online data: the fitting error of every line segment is below the set fitting error threshold, and the fitted segments reflect the trend of the online data points within each fitting interval well, which verifies the effectiveness of the algorithm.

Mining the association between sequences: once the corresponding frequent itemsets have been obtained, the method proposed in the present invention is used to analyze the association between the two indicators, with support and confidence expressing the strength of the association. The support and confidence of H₂→CH₄ are 0.5050 and 0.6804 respectively, both greater than the corresponding minimum thresholds that were set, so the rule is a strong association rule and there is a strong association between the hydrogen and methane indicators. The detection results are shown in Figure 6, and the repaired DGA online data are shown in Figure 7.

As can be seen, after the removed data points are repaired by the method of the present invention using the other characteristic gases, all values return to normal levels and the online data are effectively cleaned.

Claims

1. A method for processing transformer DGA online monitoring data, characterized in that the method comprises the following steps:

S1, sliding window processing of the data set: introducing the sliding window idea, and intercepting the online data set with a window of length L;

S2, traversing the online data set by sliding the window with a fixed step size: setting the sliding step size to l, and sliding the window over the whole data set until all data have been traversed; the length of the online data set being L₁, the traversal yields a series of data windows, and the data in all windows are exported to form the data set to be analyzed, {DS_i}, i ∈ n;

S3, piecewise linearization of sequence data: providing a piecewise linearization algorithm for sequence data, which combines a variable number of points in the online data to form multiple groups of data points; the criterion for grouping data points is that the error between the line segment fitted to all points in a group and the actual data points is less than a threshold, and each fitted line segment is characterized by its slope and its span;

S4, constructing a model that describes the similarity of different line segments: constructing a similarity model based on the slope and span of the line segments, classifying the line segments with a K-means clustering algorithm improved on the basis of the maximum-minimum distance, and assigning the same symbol to line segments of the same class, thereby completing the symbolization of the sequence data;

S5, mining the association between different sequences: based on the idea of the Apriori algorithm, setting the minimum confidence and minimum support, mining the frequent item sets that exist between different sequences, and quantifying the association between different sequences;

S6, extracting and screening out abnormal values in the DGA online monitoring data: according to the strength of the association between the sequences, determining the types of abnormal values in the data and separating the data of the different abnormal modes;

S7, improved particle swarm optimization support vector regression: defining the distance between particles in the solution set, dividing the particles into categories based on that distance, and defining the particle update rule; and optimizing the key parameters of the support vector regression with the algorithm to complete the processing of the DGA online data.

2. The method for processing transformer DGA online monitoring data as claimed in claim 1, characterized in that the piecewise linearization algorithm for sequence data set forth in S3 comprises the following specific steps:

1) online monitoring data of equipment indicators such as DGA are treated as time series data;

2) for a time series X_K = {x₁, x₂, …, x_k}, data points are intercepted with a window of length L (L < k), and, following the sliding window idea, piecewise linear fitting is carried out on the data points contained in the intercepted window;

3) the first data point in the window is taken as the fitting start point of the initial line segment and denoted x_i; assuming that the fitting end point of the initial line segment is x_{i+m} (m > 1), the m+1 data points are fitted to a line segment;

4) such a line segment is then expressed by the following equation:

my - (X_{i+m-1} - X_i)X - (m-1)X_i + X_{i+m-1} = 0 (2)

the distance from an actual data point to the fitted line segment is taken as the fitting error; the distances to the line segment from all actual data points within the span of the fitted line segment are calculated, and their sum is taken as the overall fitting error ER of the line segment:

5) the fitting error threshold is set to ER_r; if ER < ER_r, the line segment can still take further fitting points, so m is set to m+1 and the above steps are repeated; if ER > ER_r, the line segment cannot be extended further, so the fitting end point of the current line segment is stored as X_end = X_{i+m-1} and the data sampling time is recorded; the procedure then returns to step 3), resets the parameter m, and fits the next part of the data with the current fitting end point as the fitting start point of the next line segment, until all data points in the sequence have been fitted.
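As an illustration only, steps 3) to 5) can be sketched in Python as below. The sketch makes several assumptions that the text does not confirm: the sample index serves as the time axis, each candidate segment is the chord joining its start and end points, and the fitting error is the sum of perpendicular distances from the samples to that chord. The names `segment_error` and `piecewise_linearize` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def segment_error(y: Sequence[float], start: int, end: int) -> float:
    """Sum of perpendicular distances from samples start..end to the chord
    joining (start, y[start]) and (end, y[end])."""
    slope = (y[end] - y[start]) / (end - start)
    norm = math.sqrt(1.0 + slope * slope)
    return sum(
        abs(y[j] - (y[start] + slope * (j - start))) / norm
        for j in range(start, end + 1)
    )


def piecewise_linearize(y: Sequence[float], err_threshold: float) -> List[Tuple[float, int]]:
    """Greedy segmentation; returns one (slope, span) pair per line segment."""
    segments = []
    start, n = 0, len(y)
    while start < n - 1:
        end = start + 1  # a two-point segment has zero error by construction
        # Extend the segment while the next candidate end point keeps the
        # overall error below the threshold.
        while end + 1 < n and segment_error(y, start, end + 1) < err_threshold:
            end += 1
        segments.append(((y[end] - y[start]) / (end - start), end - start))
        start = end  # the end point becomes the start of the next segment
    return segments
```

Because indicators differ in order of magnitude, `err_threshold` would be chosen separately for each gas sequence.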

3. The method for processing transformer DGA online monitoring data as claimed in claim 1, characterized in that the similarity model is constructed in S4 and the cluster analysis based on the similarity model mainly comprises the following steps:

1) applying a standardization operation to all line segment attributes present in the same sequence;

2) establishing a criterion for measuring the similarity of line segments for use in the cluster analysis; extracting the two key parameters of a line segment, its slope and its span, describing the similarity between line segments with the Euclidean distance, and expressing the importance given to the different attributes of a line segment by means of weights; the resulting line segment similarity model is given by the following formula:

3) based on the line segment similarity model, the line segment set is clustered with a K-means algorithm improved on the basis of the maximum-minimum distance, and similar line segments are placed in the same category.
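One possible reading of the similarity model is a weighted Euclidean distance over standardized slope and span, sketched below. The exact standardization and the weights are not reproduced in the text, so this sketch assumes min-max scaling and equal weights of 0.5; the names `min_max_scale`, `segment_distance` and `distance_matrix` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant attribute maps to all zeros.
    Assumption: the source's standardization formula may differ."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def segment_distance(a: Tuple[float, float], b: Tuple[float, float],
                     w_slope: float = 0.5, w_span: float = 0.5) -> float:
    """Weighted Euclidean distance between two (slope, span) pairs."""
    return math.sqrt(w_slope * (a[0] - b[0]) ** 2 + w_span * (a[1] - b[1]) ** 2)


def distance_matrix(segments: Sequence[Tuple[float, float]],
                    w_slope: float = 0.5, w_span: float = 0.5) -> List[List[float]]:
    """Pairwise distances after standardizing slope and span separately."""
    slopes = min_max_scale([s[0] for s in segments])
    spans = min_max_scale([s[1] for s in segments])
    scaled = list(zip(slopes, spans))
    return [[segment_distance(p, q, w_slope, w_span) for q in scaled] for p in scaled]
```

A smaller distance means more similar segments; raising `w_slope` relative to `w_span` makes the clustering more sensitive to trend direction than to segment length.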

4. The method for processing transformer DGA online monitoring data as claimed in claim 3, characterized in that the K-means algorithm improved on the basis of the maximum-minimum distance in S4 mainly comprises the following steps:

1) the maximum-minimum distance method is also based on the Euclidean distance, and differs from the K-means algorithm in that objects at the greatest distance are taken as cluster centers; for the sample set, a proportion coefficient θ (0 < θ < 1) is given, and an arbitrary sample s_n of the sample set is taken as the initial cluster center, denoted z₁;

2) among the remaining n-1 samples, the sample farthest from z₁ is taken as the second cluster center, denoted z₂;

3) the distances from each of the remaining n-2 samples to z₁ and z₂ are calculated, and the smaller of the two is taken, namely:

D_{ij} = ||x_i - z_j||, j = 1, 2 (6)

D_i = min(D_{i1}, D_{i2}), i = 1, 2, …, n (7)

4) if

D_i = max{D_i} > θ × ||z₁ - z₂|| (8)

then the corresponding sample s_i is selected as the third cluster center z₃;

5) assuming that there are K cluster centers, the distances from the remaining n-K samples to the cluster centers are calculated, and if

D_r = max{min(D_{i1}, D_{i2}, …, D_{iK})} > θ × ||z₁ - z₂|| (9)

then the corresponding sample x_r is the (K+1)-th cluster center, denoted z_{K+1}; this process is repeated until no new cluster center appears;

6) when no new cluster center is present, the samples are assigned to each class according to the minimum distance principle.
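The center selection in steps 1) to 6) can be sketched as follows, as a hedged illustration rather than the claimed implementation. The function names `maxmin_centers` and `assign_to_centers` are hypothetical, the first center defaults to the first sample, and the sketch covers only the choice of centers and the minimum distance assignment, not any subsequent K-means iterations.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def maxmin_centers(samples: Sequence[Point], theta: float, first: int = 0) -> List[int]:
    """Return the indices of the cluster centers chosen by the max-min rule."""
    indices = range(len(samples))
    # Second center: the sample farthest from the first one.
    second = max(indices, key=lambda i: math.dist(samples[i], samples[first]))
    centers = [first, second]
    base = math.dist(samples[first], samples[second])  # ||z1 - z2||
    while True:
        # Distance from every sample to its nearest existing center.
        nearest = [min(math.dist(s, samples[c]) for c in centers) for s in samples]
        cand = max(indices, key=lambda i: nearest[i])
        if nearest[cand] > theta * base:
            centers.append(cand)  # a new center appears, as in equation (9)
        else:
            return centers  # no new center: selection is finished


def assign_to_centers(samples: Sequence[Point], centers: Sequence[int]) -> List[int]:
    """Minimum distance principle: position of the nearest center for each sample."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(s, samples[centers[k]]))
        for s in samples
    ]
```

A larger θ yields fewer clusters, because a candidate must lie farther from every existing center before it is accepted as a new one.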

5. The method for processing transformer DGA online monitoring data as claimed in claim 1, characterized in that the main process of sequence association mining in S5 is as follows:

1) setting the minimum support and minimum confidence parameters; the confidence and support thresholds are the basis for judging sequence association and frequent item sets; the minimum support thresholds of the frequent-1 and frequent-2 item sets are denoted minsup₁ and minsup₂, and the minimum confidence threshold in sequence association mining is denoted mincon;

2) generating frequent item sets; the two summarized symbolic sequences are used as the transaction set, denoted

wherein

all symbol categories corresponding to the two sequences are {A₁, A₂, …, A_CA} and {B₁, B₂, …, B_CB}; based on the basic idea of the Apriori algorithm, the frequent item sets of the sequences are obtained by scanning the transaction set in two stages; the support of each symbol in the sequence is calculated according to equation (10):

in the formula, N^t represents the number of transactions, namely the number of elements in the sequence, and the support represents the proportion of the transaction set in which an item occurs; when the frequent-1 item sets are mined, the items whose support is greater than minsup₁ are placed in the set of frequent-1 item sets;

the sets of frequent-1 item sets of the two sequences in the association mining are denoted P_A and P_B; the items in the sets are paired according to the index parameters in the form (P_{Ai}, P_{Bi}) to form 2-item sets, the support of each item in the 2-item sets is calculated, and the items whose support is greater than minsup₂ are placed in the frequent-2 item set, denoted {P_A, P_B}_freq;

3) mining the sequence association; all sequences are combined pairwise, and the support of the items in the frequent-2 item set and the confidence between the corresponding sequences are computed for each pair;

the supports of all frequent-2 item sets between two index parameters are accumulated according to formula (11), and the accumulated value is taken as the support count of the two parameter sequences among all multivariate sequences;

σ(X^A)＝sum(σ(P_A)) (12)

σ(X^B)＝sum(σ(P_B)) (13)

wherein m = CA + CB is the total number of line segment categories obtained from the cluster analysis of the two sequences; the minimum support threshold at the index sequence level is denoted minsup₃; if the support at the parameter index level is greater than this threshold, the confidence con(X^A → X^B) of the combination of the symbol item sets in the two sequences is calculated as shown in formula (14):

when the confidence is greater than the set minimum confidence threshold, the association rule X^A → X^B is retained, the confidence is used to describe the strength of the association between the two indexes, and the two indexes are judged to be strongly associated.
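The two-stage mining can be sketched in Python as below, under stated assumptions. The two symbol sequences are assumed to have equal length and to be paired position by position to form the transactions. The rule-level support is taken as the accumulated support of the frequent-2 item sets, and, since formula (14) is not reproduced in the text, the confidence is assumed to be that accumulated support divided by σ(X^A) from equation (12). The function name `rule_support_confidence` is hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def rule_support_confidence(seq_a: Sequence[str], seq_b: Sequence[str],
                            minsup1: float, minsup2: float) -> Tuple[float, float]:
    """Return (support, confidence) of the sequence-level rule A -> B."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_a)  # number of transactions, N^t

    # Stage 1: frequent-1 item sets of each sequence (support > minsup1).
    freq_a = {s: c / n for s, c in Counter(seq_a).items() if c / n > minsup1}
    freq_b = {s: c / n for s, c in Counter(seq_b).items() if c / n > minsup1}

    # Stage 2: pair frequent items and keep the pairs with support > minsup2.
    pair_counts = Counter(zip(seq_a, seq_b))
    freq2 = {
        p: c / n for p, c in pair_counts.items()
        if p[0] in freq_a and p[1] in freq_b and c / n > minsup2
    }

    support_ab = sum(freq2.values())   # accumulated frequent-2 support
    support_a = sum(freq_a.values())   # sigma(X^A), equation (12)
    # Assumption: confidence is the ratio of the two accumulated supports.
    confidence = support_ab / support_a if support_a else 0.0
    return support_ab, confidence
```

The rule would then be kept as a strong association rule only if the returned support exceeds minsup₃ and the returned confidence exceeds mincon.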

6. The method for processing transformer DGA online monitoring data as claimed in claim 1, characterized in that the improved particle swarm optimization support vector regression in S7 is mainly as follows: the vacant data points caused by the deletion of abnormal values are repaired with a support vector regression algorithm optimized by improved particle swarm optimization; the method mainly comprises the following steps:

1) defining the number of variables m and generating N m-dimensional particles in the feasible solution space; S^t is the t-th generation of particles in the iteration, whose elements are expressed as

2) determining the inertia weight, which represents the degree to which a particle inherits its velocity from the previous iteration; the specific expression is as follows:

wherein w_a and w_z represent the maximum and minimum values of the inertia weight, and the expression also involves the fitness value of the particle itself, the minimum fitness value f_z of all particles, and the average fitness value f_pj of all particles;

3) dividing the particle population into types, the distance between particles being expressed by the Euclidean distance:

define a standard distance:

wherein r is the dividing radius, and the density c_i of particle i is calculated as:

n_i is the number of particles in the group to which particle i belongs, and N is the number of particles in the generated solution set;

4) initializing the two learning factors μ₁ and μ₂ of the algorithm; when the particle density c_i is greater than a certain threshold, the update rule is:

when the particle density c_i is less than a certain threshold, the update rule is: