CN116031879A - A hybrid intelligent feature selection method for power system transient voltage stability assessment - Google Patents
- Publication number
- CN116031879A (application CN202310174560.8A)
- Authority
- CN
- China
- Prior art keywords
- feature
- feature selection
- algorithm
- sample
- power system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Description
Technical Field
The invention relates to the technical field of power systems, and in particular to a hybrid intelligent feature selection method for transient voltage stability assessment of power systems.
Background Art
The large-scale grid integration of renewable energy and the commissioning of multiple UHV DC transmission projects have given the new-type power system pronounced "double-high, one-low" characteristics and "hollowed-out" load centers, and have intensified source-load uncertainty. The high penetration of power electronics has profoundly changed the dynamic characteristics of the grid and reduced its fault tolerance, which increases the risk of transient voltage instability. With the maturing of big data theory and artificial intelligence, and the rapid spread of grid measurement devices, response-data-driven artificial intelligence methods offer a new approach to fast transient voltage stability assessment. However, a real power system contains a large number of components of many types, the grid is very large, the electrical features form high-dimensional data, and redundancy among features can severely degrade the classification performance and efficiency of the assessment model. In recent years, to fully exploit the mechanism-evolution information of key features after a disturbance and thereby improve assessment accuracy, some studies have proposed using dynamic time series data for transient stability analysis, which further increases the computational burden and the overfitting risk of the model. Therefore, finding a suitable feature selection scheme to reduce the original feature dimension is a key issue in applying artificial intelligence methods to transient voltage stability assessment of power systems.
Earlier studies mostly selected key features manually, relying on expert experience to choose variables consistent with prior knowledge for stability analysis. More recently, many researchers have studied data-mining-based feature selection, which falls into three types: filter, embedded, and wrapper methods. Filter methods typically screen for the features most relevant to the target attribute by computing the correlation between each feature variable and the label; typical relevance criteria include the Relief statistic, information measures, and the Fisher statistic. Embedded methods perform feature selection during the training of the learning model itself; decision trees are a typical example. Wrapper methods pair a search algorithm with a learning model and iteratively search for the feature subset that best matches that model.
Because the fault characteristics of many new dynamic-response devices in today's power systems are unclear, and the mechanism of transient voltage instability is not well understood either, selecting features by manual experience alone can no longer cope with the characteristics of large, complex grids and easily omits information. Among data-mining-based approaches, filter methods are fast and easy to apply, but their screening criterion is relatively simplistic, they cannot effectively reduce feature redundancy, and they cannot be applied effectively to time series features. Embedded methods depend too heavily on the learning model and are hard to adapt to the requirement of fast transient voltage stability assessment. Wrapper methods select features with high assessment accuracy, but their computational cost is high and their efficiency needs improvement. Existing feature selection methods therefore fall short in balancing screening efficiency and screening accuracy.
Summary of the Invention
To address the difficulty of achieving both efficient feature subset screening and good classification performance in current feature selection methods for power system analysis, the present invention provides a hybrid intelligent feature selection method for transient voltage stability assessment of power systems.
The hybrid intelligent feature selection method provided by the invention comprises the following steps:
S1. Sample generation.
A high-dimensional time series sample data set is obtained through analysis of stability-related factors, construction of the original feature set, and multi-scenario transient time-domain simulation. The time-domain simulation yields transient voltage sample data after the power system is disturbed, and each sample is labeled as stable or unstable according to a practical engineering criterion. This produces the training and test data sets for the feature selection and assessment models, that is, the high-dimensional time series sample data set.
S2. Feature effectiveness measurement and preliminary screening based on the T-Relief algorithm.
The Relief algorithm is computationally efficient but is not directly applicable to selecting time series input features. It is therefore improved through time-series stratification to obtain the T-Relief algorithm, which is used for preliminary screening of the original features. On the one hand, the resulting feature effectiveness values are used to strengthen the search performance of step S3; on the other hand, the preliminary dimensionality reduction lowers the iterative computation cost of step S3. This step comprises the following sub-steps:
S21. Assume the original feature data set contains M samples, each with d feature attributes recorded at N time points. Stratify the M high-dimensional time series samples by time step to form N stratified high-dimensional feature matrices.
S22. For each of the N stratified feature matrices, compute the Euclidean distance between each of the M samples and every other sample, and build the Euclidean distance matrix.
S23. Using the Euclidean distance matrix, find the nearest neighbors of each sample in each layer and compute the relevance statistic of each feature at the corresponding time step.
S24. Average the relevance statistics obtained across the time layers to obtain the effectiveness value δ of each feature, which integrates the time series information.
S25. Set a threshold τ and pass the features that satisfy it on to the next step. The threshold τ may be a number of features to keep or a specific value of the relevance statistic.
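The two readings of the threshold τ in S25 can be sketched as follows. This is a minimal illustration, not the patented procedure: the function name `select_features`, the `mode` argument, and the sample scores are hypothetical, and a strict "greater than" comparison is assumed for the value threshold.

```python
def select_features(delta, tau, mode="value"):
    """Return the indices of the features kept by threshold tau.

    mode="value": keep features whose effectiveness value exceeds tau.
    mode="count": keep the int(tau) features with the highest values.
    """
    if mode == "value":
        return [j for j, score in enumerate(delta) if score > tau]
    if mode == "count":
        ranked = sorted(range(len(delta)), key=lambda j: delta[j], reverse=True)
        return sorted(ranked[: int(tau)])
    raise ValueError("mode must be 'value' or 'count'")


# Hypothetical effectiveness values for four features.
delta = [0.82, 0.15, 0.61, 0.40]
print(select_features(delta, 0.6))              # value threshold
print(select_features(delta, 2, mode="count"))  # keep the two best features
```

A value threshold lets the number of retained features vary with the data, while a count threshold fixes the input size handed to step S3.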
S3. Feature selection and stability assessment based on an improved swarm intelligence algorithm.
In this step, the feature effectiveness values obtained in step S2 are used to strengthen the search performance of the swarm intelligence optimization algorithm (BGOA), giving the improved swarm intelligence optimization algorithm (IBGOA). A wrapper feature selection scheme is built on this algorithm, with a ConvGRU assessment model embedded as the subset evaluator. The scheme takes full account of the temporal evolution of the features and of their internal combination relationships, and further optimizes the feature subset on top of the features pre-screened in step S2, reducing feature redundancy.
This step comprises the following sub-steps:
S31. Perform feature selection with the improved binary grasshopper optimization algorithm.
The effectiveness value δ obtained in step S24 is used to strengthen the search performance of the binary grasshopper optimization algorithm, giving the improved binary grasshopper optimization algorithm. In the improved algorithm, the grasshopper position initialization formula and the iterative update formula are modified as follows:
In the initialization formula, a∈(0,1) is a weight coefficient, r∈[0,1] is a uniformly distributed random number, and Round(·) is the round-to-nearest-integer function.
In the update formula, β, η, and γ are weight coefficients and r∈[0,1] is a uniformly distributed random number.
S32. A multi-dimensional time series binary classification model with embedded ConvGRU evaluates the classification performance of each feature subset, with the fitness function serving as the comprehensive evaluation index. If the required number of iterations has been reached, the optimal feature subset is output; otherwise, feature selection with the improved binary grasshopper optimization algorithm is repeated.
In this step, the ConvGRU is computed as follows:
$$R_t = \sigma(W_{xr} * X_t + W_{hr} * H_{t-1})$$
$$Z_t = \sigma(W_{xz} * X_t + W_{hz} * H_{t-1})$$
Here σ and tanh are activation functions; ⊙ denotes element-wise multiplication; * denotes the convolution operation; $R_t$ is the reset gate; $Z_t$ is the update gate; $X_t$ is the input at the current time step; $\tilde{H}_t$ is the current candidate value; $H_{t-1}$ is the hidden state at the previous time step; and $W_{xr}$, $W_{hr}$, $W_{xz}$, $W_{hz}$, $W_{xh}$, $W_{hh}$ are the weight coefficients of the corresponding connections.
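A single ConvGRU step can be sketched in plain Python to show how the gates interact. The reset and update gates follow the two equations given for $R_t$ and $Z_t$. The candidate state and the final blend are assumptions, since this text does not show them: the sketch uses one common GRU convention, candidate = tanh(W_xh * X_t + R_t ⊙ (W_hh * H_{t-1})) and H_t = (1 - Z_t) ⊙ candidate + Z_t ⊙ H_{t-1}. Single-channel 1-D inputs, odd-length kernels, zero padding, and all function names are also illustrative choices.

```python
import math


def conv1d_same(kernel, signal):
    """Zero-padded 1-D cross-correlation; output has the length of signal."""
    half = len(kernel) // 2  # assumes an odd-length kernel
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = i + k - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convgru_step(x_t, h_prev, W):
    """One ConvGRU step; W maps 'xr','hr','xz','hz','xh','hh' to kernels."""
    # Reset gate R_t and update gate Z_t, as in the two equations above.
    r = [sigmoid(a + b) for a, b in
         zip(conv1d_same(W["xr"], x_t), conv1d_same(W["hr"], h_prev))]
    z = [sigmoid(a + b) for a, b in
         zip(conv1d_same(W["xz"], x_t), conv1d_same(W["hz"], h_prev))]
    # Candidate state (assumed form): the reset gate scales the hidden term.
    cand = [math.tanh(a + ri * b) for a, ri, b in
            zip(conv1d_same(W["xh"], x_t), r, conv1d_same(W["hh"], h_prev))]
    # New hidden state (assumed form): blend of candidate and old state.
    return [(1.0 - zi) * ci + zi * hi for zi, ci, hi in zip(z, cand, h_prev)]


# Illustrative call with hypothetical kernels of length 3.
W = {name: [0.1, 0.2, 0.1] for name in ("xr", "hr", "xz", "hz", "xh", "hh")}
h = convgru_step([0.5, -0.3, 0.8], [0.0, 0.0, 0.0], W)
print(h)
```

Because the gates are computed by convolution rather than by a dense matrix product, neighboring positions in the input share weights, which is what lets the unit pick up local structure across the feature axis while the recurrence tracks evolution over time.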
The fitness function used is as follows:
In the formula, β∈(0,1) is a weight coefficient, |R| is the dimension of the candidate feature subset, |N| is the dimension of the original feature set, and $P_c$ is the composite misjudgment rate.
The composite misjudgment rate $P_c$ is computed as:
$$P_c = \alpha P_{fs} + (1-\alpha) P_{fus}$$
where $P_{fs}$ is the missed-detection rate and $P_{fus}$ is the false-alarm rate.
In these formulas, α is a weight coefficient; $F_s$ is the number of samples wrongly judged stable (missed detections); $F_{us}$ is the number of samples wrongly judged unstable (false alarms); $T_s$ is the number of samples correctly judged stable; and $T_{us}$ is the number of samples correctly judged unstable.
Compared with the prior art, the invention has the following advantages:
(1) A time-series improvement of the Relief algorithm yields the T-Relief algorithm, which is suitable for measuring the effectiveness of high-dimensional time series features while remaining computationally efficient.
(2) The feature effectiveness values are integrated into the initialization and iterative update formulas of the swarm intelligence optimization algorithm, which improves the efficiency and performance of wrapper feature selection based on this algorithm.
(3) The hybrid intelligent feature selection method based on two-stage screening effectively reduces feature dimensionality. The pre-screening stage provides feature effectiveness values and a preliminary dimensionality reduction, which speeds up the subsequent subset search. The wrapper stage, based on the improved swarm intelligence optimization algorithm with an embedded ConvGRU time series assessment model, takes full account of the time-varying behavior of the features and achieves higher classification accuracy. Compared with other commonly used feature selection methods, the feature subset selected by the method of the invention classifies better.
Other advantages, objectives, and features of the invention will in part appear from the description below, and in part will be understood by those skilled in the art through study and practice of the invention.
Brief Description of the Drawings
Figure 1. Block diagram of the hybrid intelligent feature selection algorithm.
Figure 2. Flow chart of the improved T-Relief algorithm of the invention.
Figure 3. Schematic diagram of the ConvGRU unit structure.
Figure 4. Comparison of the iterative convergence curves of IBGOA and BGOA.
Detailed Description
Preferred embodiments of the invention are described below with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention, not to limit it.
As shown in Figure 1, the hybrid intelligent feature selection method provided by the invention comprises three parts: sample generation; feature effectiveness measurement and preliminary screening based on the T-Relief algorithm; and feature subset optimization and stability assessment based on the swarm intelligence optimization algorithm.
The principle of the T-Relief algorithm is as follows:
Relief is a filter feature selection method proposed for binary classification problems. It computes a relevance statistic from each feature's ability to distinguish between nearby samples, and uses that statistic to assess the effectiveness of the features. The steps are:
(1) Using Euclidean distance, the sample of the same class nearest to a given sample is taken as its "near-hit", and the nearest sample of a different class as its "near-miss". The relevance statistic of each feature is computed from the selected near-hit and near-miss, and the statistics of all samples in the training set are summed to give the overall feature importance. The formula is as follows:
In the formula, the superscript j denotes the j-th feature of the corresponding sample, δ is the list of relevance statistics of all features, $x_i$ is the sample currently being processed, $x_{i,nh}$ is its near-hit, and $x_{i,nm}$ is its near-miss. Because power system response data are continuous variables, the difference term takes its continuous-valued form:
(2) The relevance statistics of the features are sorted, a suitable threshold is chosen, and the features of higher importance are selected for the classification task.
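The relevance statistic of step (1) is usually written as below. This is the textbook form of Relief for continuous features, offered as an assumption about the formula intended here rather than a quotation of it. The name $\mathrm{diff}$ for the per-feature difference term is the conventional one, and the features are assumed to be normalized to $[0,1]$.

$$\delta^j = \sum_{i} \left[ -\mathrm{diff}\left(x_i^j, x_{i,nh}^j\right)^2 + \mathrm{diff}\left(x_i^j, x_{i,nm}^j\right)^2 \right], \qquad \mathrm{diff}\left(x_a^j, x_b^j\right) = \left| x_a^j - x_b^j \right|$$

Under this form a feature scores high when it differs more between a sample and its near-miss (across classes) than between the sample and its near-hit (within a class).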
In the method of the invention, Relief is extended to time series through time-series stratification, giving the T-Relief algorithm, which can screen high-dimensional time series features directly. Assume the original feature data set contains M samples, each with d feature attributes recorded at N time points. The computation flow of T-Relief is shown in Figure 2 and proceeds as follows:
① Stratify the M high-dimensional time series samples, each with N time steps, by time step to form N stratified high-dimensional feature matrices.
② For each of the N stratified feature matrices, compute the Euclidean distance between each of the M samples and every other sample, and build the Euclidean distance matrix.
③ Using the Euclidean distance matrix, find the nearest neighbors of each sample in each layer, normalize the original feature data, and compute the relevance statistic of each feature at the corresponding time step. The formula is as follows:
④ Average the relevance statistics obtained across the time layers to obtain the effectiveness value of each feature, which integrates the time series information.
⑤ Set a threshold τ and pass the features that satisfy it on to the next screening stage. The threshold τ may be a number of features to keep or a specific value of the relevance statistic.
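Steps ① to ④ can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the patented implementation: the per-layer statistic uses the standard squared-difference Relief form, features are min-max normalized within each layer, neighbors are found on the raw data before normalization (following the order of steps ② and ③), and the names `relief_layer` and `t_relief` and the data layout `samples[m][t][j]` are hypothetical.

```python
import math


def relief_layer(layer, labels):
    """Relief statistics of one time layer: layer[m][j], labels[m] in {0, 1}."""
    d = len(layer[0])
    lo = [min(row[j] for row in layer) for j in range(d)]
    hi = [max(row[j] for row in layer) for j in range(d)]
    # Min-max normalise each feature so that differences lie in [0, 1].
    norm = [[(row[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
             for j in range(d)] for row in layer]
    scores = [0.0] * d
    for i in range(len(layer)):
        hit = miss = None
        hit_dist = miss_dist = math.inf
        for k in range(len(layer)):
            if k == i:
                continue
            dist = math.dist(layer[i], layer[k])  # Euclidean distance, raw data
            if labels[k] == labels[i]:
                if dist < hit_dist:
                    hit, hit_dist = k, dist
            elif dist < miss_dist:
                miss, miss_dist = k, dist
        if hit is None or miss is None:
            continue  # no neighbour of one kind: the sample contributes nothing
        for j in range(d):
            scores[j] += ((norm[i][j] - norm[miss][j]) ** 2
                          - (norm[i][j] - norm[hit][j]) ** 2)
    return scores


def t_relief(samples, labels):
    """Average the per-layer Relief statistics over the N time steps."""
    n_steps = len(samples[0])
    d = len(samples[0][0])
    total = [0.0] * d
    for t in range(n_steps):
        layer = [sample[t] for sample in samples]  # step 1: time stratification
        for j, score in enumerate(relief_layer(layer, labels)):
            total[j] += score
    return [score / n_steps for score in total]    # step 4: average over layers


# Toy data: 4 samples, 2 time steps, 2 features; feature 0 separates the classes.
samples = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[0.1, 1.0], [0.1, 0.0]],
    [[1.0, 0.0], [1.0, 1.0]],
    [[0.9, 1.0], [0.9, 0.0]],
]
labels = [0, 0, 1, 1]
print(t_relief(samples, labels))
```

The pairwise distance search costs on the order of N·M²·d operations, which is still far cheaper than training a classifier once per candidate subset, so the pre-screening stays inexpensive relative to the wrapper stage.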
The principle of the improved swarm intelligence optimization algorithm is as follows:
Wrapper feature selection based on swarm intelligence is currently an important way to find an approximate solution for the optimal feature subset. The invention uses the relatively robust and reliable binary grasshopper optimization algorithm (BGOA) as the search strategy for wrapper feature selection.
BGOA is a swarm intelligence optimization algorithm that imitates the foraging behavior of grasshoppers and models the social forces between individuals mathematically. Its structure is simple and its performance is stable. The mathematical model is:
$$X_i = S_i + G_i + A_i$$
In the formula, the subscript i denotes the i-th grasshopper in the population; $X_i$ is its position; $S_i$ is the social force exerted on it by the other agents; $G_i$ is its own gravity; and $A_i$ is the wind force acting on it. In practice the gravity and wind terms are often ignored. $S_i$ is computed as follows:
In the formula, $d_{ij} = |x_i - x_j|$ is the distance between two grasshoppers and s(·) is the social force strength function, where a positive value means attraction and a negative value means repulsion. It is computed as follows:
In the formula, f is the attraction intensity parameter and l is the attraction range parameter, with f = 0.5 and l = 1.5; r is the function argument and stands for the distance $d_{ij}$ in the formula above.
In the feature selection problem, the position vector $X_i$ of BGOA is a binary sequence of length D that records which of the D features are selected: $X_{id} = 1$ means the d-th feature is selected, and $X_{id} = 0$ means it is not. Taking the social force into account, the step vector for the grasshopper position update is defined as $\Delta x_i$, and an S-shaped transfer function converts it into a selection probability. The computation is as follows:
In the formula, $u_d$ and $l_d$ are the upper and lower bounds of the feature value; t is the iteration index; T(·) is the S-shaped transfer function; and c is the adaptive adjustment coefficient, which is adjusted as follows:
$$c = c_{max} - t\,(c_{max} - c_{min})/L$$
In the formula, $c_{max}$ and $c_{min}$ are the maximum and minimum adjustment coefficients, and L is the maximum number of iterations.
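The building blocks described above can be sketched as follows. The linear schedule for c is the one given in the text. The social force s(r) = f·exp(-r/l) - exp(-r) and the logistic transfer function are the forms used in the standard grasshopper optimization algorithm and its binary variant; they are assumptions here because the corresponding formulas are not reproduced in this text. The defaults `c_max=1.0` and `c_min=1e-5`, the function names, and the rule "select the feature when a uniform random number is below T(Δx)" are likewise illustrative.

```python
import math
import random


def social_force(r, f=0.5, l=1.5):
    """Assumed standard GOA form: negative = repulsion, positive = attraction."""
    return f * math.exp(-r / l) - math.exp(-r)


def adaptive_c(t, max_iter, c_max=1.0, c_min=1e-5):
    """Linear decrease c = c_max - t * (c_max - c_min) / L from the text."""
    return c_max - t * (c_max - c_min) / max_iter


def sigmoid_transfer(step):
    """S-shaped transfer function mapping a step value to a probability."""
    return 1.0 / (1.0 + math.exp(-step))


def update_bits(steps, rng):
    """Assumed binary update: bit d becomes 1 with probability T(step_d)."""
    return [1 if rng.random() < sigmoid_transfer(s) else 0 for s in steps]


rng = random.Random(0)
print(social_force(0.0), social_force(4.0))
print(adaptive_c(0, 50), adaptive_c(25, 50))
print(update_bits([50.0, -50.0, 0.0], rng))
```

With f = 0.5 and l = 1.5 this force changes sign at r = 3·ln 2 ≈ 2.08, so individuals closer than that repel each other and individuals farther apart attract, while the shrinking coefficient c shifts the swarm from exploration toward exploitation as the iterations proceed.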
In the invention, so that the feature effectiveness values from the T-Relief stage can guide the iterative update of the grasshoppers, the position initialization formula and the iterative update formula are modified as follows:
In the initialization formula, a∈(0,1) is a weight coefficient, r∈[0,1] is a uniformly distributed random number, and Round(·) is the round-to-nearest-integer function.
In the update formula, β, η, and γ are weight coefficients and r∈[0,1] is a uniformly distributed random number.
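How an effectiveness value can bias the initial population is illustrated below. The initialization formula itself is not reproduced in this text, so the sketch is not that formula: it shows one plausible reading built only from the symbols named above, bit = Round(a·δ' + (1 - a)·r), where δ' is the effectiveness value rescaled to [0, 1]. The rescaling, the function name `guided_init`, and the sample values are all hypothetical.

```python
import math
import random


def guided_init(delta, a, rng):
    """Draw one binary position vector biased towards high-delta features."""
    lo, hi = min(delta), max(delta)
    span = hi - lo
    bits = []
    for score in delta:
        # Rescale the effectiveness value to [0, 1] (assumed preprocessing).
        p = (score - lo) / span if span > 0 else 0.5
        mixed = a * p + (1.0 - a) * rng.random()
        # Round half up; Python's built-in round() rounds half to even.
        bits.append(math.floor(mixed + 0.5))
    return bits


rng = random.Random(1)
delta = [0.10, 0.55, 0.90]  # hypothetical effectiveness values
population = [guided_init(delta, a=0.6, rng=rng) for _ in range(5)]
print(population)
```

A larger a makes the initial population follow the T-Relief ranking more closely, while a smaller a keeps more randomness and therefore more diversity in the starting subsets.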
The structure of the ConvGRU unit is shown in Figure 3. A candidate feature subset is judged on two aspects: the classification performance of the stability assessment that uses it, and the dimension of the subset.
To evaluate the transient stability classifier (ConvGRU), three indicators are defined: the accuracy $P_{acc}$, the missed-detection rate $P_{fs}$, and the false-alarm rate $P_{fus}$. A weight coefficient α>1 is introduced to balance the importance of false alarms and missed detections, and the composite misjudgment rate $P_c$ is defined to evaluate the classification performance of a feature subset.
$$P_c = \alpha P_{fs} + (1-\alpha) P_{fus}$$
To maximize classification accuracy while minimizing the dimension of the feature subset, the objective function for stability assessment feature selection (that is, the fitness function optimized by the improved BGOA feature selection algorithm) is defined as:
In the formula, β∈(0,1) is a weight coefficient, |R| is the dimension of the candidate feature subset, and |N| is the dimension of the original feature set.
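The evaluation chain from confusion counts to a fitness value can be sketched as below. Only the composite rate $P_c = \alpha P_{fs} + (1-\alpha) P_{fus}$ is taken from the text. Two parts are assumptions because the corresponding formulas are not reproduced here: the rates are taken over the total sample count, and the fitness is written as the weighted sum β·P_c + (1 - β)·|R|/|N|, a form commonly used in wrapper feature selection. The function names and the numeric inputs are hypothetical.

```python
def rates(t_s, t_us, f_s, f_us):
    """Accuracy, missed-detection rate, false-alarm rate (assumed definitions).

    t_s / t_us: correctly judged stable / unstable samples.
    f_s / f_us: samples wrongly judged stable (missed) / unstable (false alarm).
    """
    total = t_s + t_us + f_s + f_us
    return {
        "acc": (t_s + t_us) / total,
        "fs": f_s / total,
        "fus": f_us / total,
    }


def composite_error(p_fs, p_fus, alpha):
    """Composite misjudgment rate P_c = alpha * P_fs + (1 - alpha) * P_fus."""
    return alpha * p_fs + (1.0 - alpha) * p_fus


def fitness(p_c, n_selected, n_original, beta):
    """Assumed weighted-sum fitness; lower is better."""
    return beta * p_c + (1.0 - beta) * n_selected / n_original


# Hypothetical confusion counts and illustrative weights.
r = rates(t_s=90, t_us=95, f_s=5, f_us=10)
p_c = composite_error(r["fs"], r["fus"], alpha=0.7)
print(r, p_c, fitness(p_c, n_selected=18, n_original=370, beta=0.9))
```

With rates defined over all samples, accuracy, missed-detection rate, and false-alarm rate sum to one, so any gain in accuracy equals the combined drop in the two error rates. A fitness of this weighted-sum shape rewards a subset both for classifying well and for being small, with β setting the trade-off.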
Performance comparison:
(1) Improved BGOA (IBGOA) versus the original BGOA
To verify that the improved BGOA (IBGOA) of the invention performs better during iterative subset search, the two algorithms were compared experimentally. Setup: BGOA and IBGOA were each used to screen the original feature set with 50 iterations, and the fitness at each iteration was averaged over multiple trials. The resulting fitness convergence curves are shown in Figure 4. Under IBGOA the subset search is more efficient and the best subset reaches a lower fitness value, which supports the effectiveness of the improvement to BGOA.
(2) Transient voltage stability assessment before and after feature selection
The effect of feature selection is compared in terms of assessment accuracy, missed-detection rate, false-alarm rate, feature dimension, and model training time.
Experimental details: starting from the original 370-dimensional feature set, T-Relief retained the 131 features whose effectiveness value exceeded 0.6; IBGOA iterative search then retained 18 key features (averaged over multiple trials, 50 iterations per trial). The results are shown in Table 1. After feature selection with the method of the invention, only 18 key features remain and the model training time falls by 76.41%. As models become more complex and sample sizes grow sharply, this greatly reduces the difficulty of model training. In addition, although the feature dimension is greatly reduced, the assessment accuracy rises by 2.73% and the false-alarm and missed-detection rates each fall by 1.365%. This indicates that the proposed feature selection method effectively removes ineffective features from the high-dimensional set and thereby improves the accuracy of the transient voltage stability assessment model.
表1、特征选择前后稳定评估效果对比Table 1. Comparison of stable evaluation effects before and after feature selection
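The filter stage and the dimension counts quoted in the experimental details can be illustrated with a short sketch. It assumes the effectiveness scores are already available as a numeric vector; the T-Relief scoring itself is not shown. The helper name `filter_stage` and the six sample scores are hypothetical, while the 0.6 threshold and the counts 370, 131 and 18 are taken from the text.

```python
import numpy as np


def filter_stage(scores, threshold=0.6):
    """Return indices of features whose effectiveness score exceeds the threshold."""
    scores = np.asarray(scores, dtype=float)
    return np.flatnonzero(scores > threshold)


# Hypothetical scores for six features; the data set in the text has 370.
scores = [0.91, 0.42, 0.60, 0.75, 0.10, 0.61]
kept = filter_stage(scores)  # strict '>' means the 0.60 feature is dropped

# Dimension counts reported in the text: 370 -> 131 -> 18.
n_original, n_filtered, n_final = 370, 131, 18
filter_reduction = 1 - n_filtered / n_original  # share removed by the filter stage
overall_reduction = 1 - n_final / n_original    # share removed after both stages
```

With the reported counts, the filter stage removes about 64.6% of the original features, and the two stages together remove about 95.1%.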
The above is merely a preferred embodiment of the present invention and does not limit the invention in any form. Although the invention has been disclosed above by way of a preferred embodiment, that embodiment is not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications that yield equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310174560.8A CN116031879A (en) | 2023-02-28 | 2023-02-28 | A hybrid intelligent feature selection method for power system transient voltage stability assessment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116031879A (en) | 2023-04-28 |
Family
ID=86076116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310174560.8A Pending CN116031879A (en) | 2023-02-28 | 2023-02-28 | A hybrid intelligent feature selection method for power system transient voltage stability assessment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116031879A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20230428 |