CN111614491B - Power monitoring system oriented safety situation assessment index selection method and system - Google Patents

Publication number
CN111614491B
CN111614491B CN202010370523.0A CN202010370523A CN111614491B CN 111614491 B CN111614491 B CN 111614491B CN 202010370523 A CN202010370523 A CN 202010370523A CN 111614491 B CN111614491 B CN 111614491B
Authority
CN
China
Prior art keywords
index
monitoring system
power monitoring
indexes
candidate characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010370523.0A
Other languages
Chinese (zh)
Other versions
CN111614491A (en)
Inventor
王梓
杨维永
朱世顺
黄益彬
刘苇
黄天明
朱江
韩勇
程长春
郑卫波
祁龙云
魏兴慎
李牧野
景娜
张林霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Fujian Electric Power Co Ltd
Nari Information and Communication Technology Co
State Grid Electric Power Research Institute
Original Assignee
State Grid Corp of China SGCC
State Grid Fujian Electric Power Co Ltd
Nari Information and Communication Technology Co
State Grid Electric Power Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Fujian Electric Power Co Ltd, Nari Information and Communication Technology Co, State Grid Electric Power Research Institute
Priority to CN202010370523.0A
Publication of CN111614491A
Application granted
Publication of CN111614491B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

The invention discloses a method and a system for selecting safety situation assessment indexes of a power monitoring system. The method comprises the following steps: acquiring historical data that reflects the safety state of the power monitoring system, and constructing a training sample data set after standardization; analyzing the correlation between each candidate index used for safety evaluation in the sample data set and the safety state of the power monitoring system, and ranking the candidate indexes by how well each can discriminate that safety state; and calculating the redundancy between every two candidate indexes, removing highly redundant indexes from the correlation-ranked set, and finally selecting an optimal index set for evaluating the safety situation of the power monitoring system by jointly considering the classification accuracy and the size of the index set. By applying feature selection to the safety situation assessment indexes, the method improves the accuracy of the evaluation result, reduces the overhead of data acquisition, storage, transmission and processing, and improves the performance of safety situation evaluation for the power monitoring system.

Description

Power monitoring system oriented safety situation assessment index selection method and system
Technical Field
The invention belongs to the technical field of electric power information safety, and particularly relates to a method and a system for selecting safety situation evaluation indexes of an electric power monitoring system.
Background
In recent years, with the continuous expansion and deepening of networked applications, malicious network attacks such as computer viruses, Trojans and hacking have become increasingly rampant, network security incidents have become frequent, and state-level and organized attacks such as cyber warfare and cyber terrorism have seriously affected production and daily life across society. The power grid is critical infrastructure bearing on the national economy and people's livelihood. Its core is the power monitoring system, which monitors and controls the stable operation of the smart grid. Because the power monitoring system is structurally complex and diverse, widely distributed and highly important, and because its paralysis would have far-reaching consequences, it is an attractive target for hostile forces. Since the beginning of the twenty-first century, many network attacks against power monitoring systems have occurred at home and abroad, causing great losses to the countries concerned; similar incidents have also occurred in China one after another and have greatly affected production and daily life.
With the overall advance of the energy internet, the power monitoring system, which serves as the dispatching and control center of the power grid, is becoming ever more intelligent, networked and interactive. At the same time, novel attack methods represented by the advanced persistent threat (APT) continue to evolve, which poses a serious challenge to the safety protection of the power monitoring system. The safety state of the power monitoring system therefore needs to be monitored in real time and its trend tracked, so that safety problems are detected in time, before they materialize, and corresponding countermeasures are taken. With the continuous expansion of the power grid, the huge volume of information exchanged inside the power monitoring system and the continuous access of intelligent terminals, safety complexity keeps rising, and the existing periodic manual evaluation can no longer meet the lean, intelligent and real-time safety management requirements of the power monitoring system under the new situation. With the development of artificial intelligence, machine learning is widely used for evaluation work in various fields, which reduces the cost of manual evaluation. High-quality safety situation evaluation indexes of the power monitoring system therefore need to be selected to serve subsequent real-time and accurate evaluation.
At present, methods for selecting safety situation assessment indexes of a power monitoring system generally fall into two categories: Filter feature selection methods and Wrapper feature selection methods.
The Filter feature selection method is independent of any machine learning algorithm. It selects features according to their scores in statistical tests of correlation, using evaluation criteria such as information measures, dependency measures and consistency measures. However, it can only compute the influence of a single feature on the classification result, and an unevenly distributed data set or sparse classification features can easily cause important features to be mistaken for useless ones.
The Wrapper feature selection method is closely tied to a classification learning algorithm, and usually combines the index selection problem with a genetic algorithm, machine learning, a neural network or the like. Because it selects features according to the performance of the classification learning algorithm, it yields a classification model with high classification performance and identifies the key features related to the classification result well; however, it still does not remove redundancy among features well, and it generalizes poorly and has high time complexity.
Therefore, when the above methods are used to select safety evaluation indexes for a power monitoring system, the selected indexes may be irrelevant to the safety evaluation or redundant with one another. Irrelevant and redundant indexes incur acquisition, transmission and processing overhead during safety situation evaluation, which degrades real-time evaluation performance and also reduces evaluation accuracy. The correlation and redundancy of the indexes therefore need to be analyzed when selecting safety situation evaluation indexes, and since the existing feature selection methods each have shortcomings, a better method is needed to solve the evaluation index selection problem.
Disclosure of Invention
The invention provides a method and a system for selecting indexes for evaluating safety situations of a power monitoring system.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
an embodiment of the present invention provides a method for selecting an evaluation index for a security situation of an electric power monitoring system, including:
acquiring candidate characteristic indexes for safety evaluation of the power monitoring system;
constructing a training sample data set based on the candidate characteristic indexes;
calculating the relevance of the candidate characteristic indexes used for evaluating the safety state of the power monitoring system based on the training sample data set to obtain a candidate characteristic index relevance ranking set;
removing the redundant indexes in the candidate characteristic index relevance sorting set to form a candidate characteristic index relevance sorting set with the redundant indexes removed;
and selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index correlation sorting set with the redundant indexes removed.
Further, the obtaining of the candidate characteristic index for the safety evaluation of the power monitoring system includes:
and selecting a plurality of candidate characteristic indexes capable of reflecting the safety state of the power monitoring system from the flow characteristic, the equipment running state and the alarm log.
Further, the constructing a training sample data set based on the candidate feature indexes includes:
collecting candidate characteristic index historical data for safety evaluation of a power monitoring system to form an original sample data set D:
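$$D = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$$
wherein $X_i$, $i \in [1, m]$, is the i-th historical data record reflecting the safety state of the power monitoring system, m is the number of collected historical data records, $x_{ij}$ represents the value of the j-th candidate characteristic index in the i-th historical data record, $j \in [1, n]$, and n is the number of candidate characteristic indexes;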
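normalizing the original sample data set D to generate a normalized sample data set D':
$$D' = \begin{bmatrix} X'_1 \\ X'_2 \\ \vdots \\ X'_m \end{bmatrix} = \begin{bmatrix} x'_{11} & x'_{12} & \cdots & x'_{1n} \\ x'_{21} & x'_{22} & \cdots & x'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x'_{m1} & x'_{m2} & \cdots & x'_{mn} \end{bmatrix}$$
wherein $X'_i$ is the standardized sample data record of $X_i$, and $x'_{ij}$ is the standardized value of $x_{ij}$;
$x'_{ij}$ is calculated as follows:
$$x'_{ij} = \frac{x_{ij} - \min_{1 \le k \le m} x_{kj}}{\max_{1 \le k \le m} x_{kj} - \min_{1 \le k \le m} x_{kj}}$$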
manually labeling the standardized sample data set D' to generate a training sample data set T:
$$T = \begin{bmatrix} T_1 \\ T_2 \\ \vdots \\ T_m \end{bmatrix} = \begin{bmatrix} X'_1 & y_1 \\ X'_2 & y_2 \\ \vdots & \vdots \\ X'_m & y_m \end{bmatrix}$$
wherein $T_i$, $i \in [1, m]$, is the i-th training sample record, and $y_i \in \{0, 1\}$ represents the safety state of the power monitoring system corresponding to the i-th training sample record, the value 1 representing a safe state and the value 0 representing a dangerous state.
Further, the calculating the candidate feature indexes for evaluating the correlation of the safety state of the power monitoring system based on the training sample data set to obtain a candidate feature index correlation ranking set includes:
training a support vector machine model by adopting a training sample data set to obtain the weight of the candidate characteristic index;
calculating the correlation evaluation scores of the candidate characteristic indexes and the safety evaluation of the power monitoring system according to the weights of the candidate characteristic indexes;
and applying a sequential backward selection heuristic search to the candidate characteristic index set in a loop, each time removing the candidate characteristic index with the smallest correlation evaluation score from the set and placing it in turn into a candidate characteristic index correlation sorting set, to finally obtain a set of candidate characteristic indexes sorted by correlation from large to small.
Further, the training of the support vector machine model by using the training sample data set to obtain the weight of the candidate feature index includes:
the classification function of the support vector machine is:
$$f(X) = \operatorname{sgn}\left(\sum_{i=1}^{m} \alpha_i y_i \,(X'_i \cdot X) + b\right)$$
wherein the independent variable X represents a standardized data record to be classified that reflects the safety state of the power monitoring system, f(X) represents the classification result obtained from X, $\alpha_i$ is a Lagrange multiplier, sgn(·) is the sign function, b is a threshold, $\alpha_i$ and b are obtained by model training, $X'_i$ is a standardized training sample data record, and $y_i$ represents the safety state of the power monitoring system corresponding to the i-th training sample record;
training a support vector machine model by using a training sample data set T to obtain a coefficient omega of an independent variable X:
$$\omega = \sum_{i=1}^{m} \alpha_i y_i X'_i$$
then the j-th element of the vector ω,
$$\omega_j = \sum_{i=1}^{m} \alpha_i y_i x'_{ij}$$
represents the weight $\omega_j$ of the j-th candidate feature index, where $x'_{ij}$ is the value of the j-th candidate characteristic index in the i-th standardized training data record, m is the number of training data records, and n is the number of candidate characteristic indexes.
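The step from the classification function to the coefficient vector can be made explicit. The short derivation below assumes the linear kernel named in the detailed description, so that the kernel value is the inner product $X'_i \cdot X$; the symbol $x_j$ for the j-th component of X is introduced here only for illustration.

```latex
\begin{aligned}
f(X) &= \operatorname{sgn}\Big(\sum_{i=1}^{m} \alpha_i y_i \,(X'_i \cdot X) + b\Big) \\
     &= \operatorname{sgn}\Big(\Big(\sum_{i=1}^{m} \alpha_i y_i X'_i\Big) \cdot X + b\Big) \\
     &= \operatorname{sgn}(\omega \cdot X + b)
      = \operatorname{sgn}\Big(\sum_{j=1}^{n} \omega_j x_j + b\Big),
\qquad \omega_j = \sum_{i=1}^{m} \alpha_i y_i x'_{ij}.
\end{aligned}
```

Because $\omega_j$ multiplies only the j-th component of X, a weight of large magnitude indicates that the j-th index strongly influences the classification result.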
Further, the calculating the correlation evaluation score of the candidate characteristic index and the safety evaluation of the power monitoring system according to the weight of the candidate characteristic index includes:
$$c_j = \omega_j^2$$
wherein $c_j$ is the correlation evaluation score of the j-th candidate feature index, and $\omega_j$ is the weight of the j-th candidate feature index.
Further, the removing of the redundant indexes in the candidate feature index relevance ranking set to form a candidate feature index relevance ranking set from which the redundant indexes are removed includes:
taking the Pearson correlation coefficient as the redundancy between indexes in the candidate characteristic index correlation sorting set:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i=1}^{m} (s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{m} (s_i - \bar{s})^2}\,\sqrt{\sum_{i=1}^{m} (g_i - \bar{g})^2}}$$
wherein Pearson(s, g) is the Pearson correlation coefficient of two candidate characteristic indexes s and g in the candidate characteristic index correlation sorting set, $\bar{s}$ and $\bar{g}$ respectively represent the average values of the two candidate characteristic indexes in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate characteristic indexes in the i-th training sample record in the training sample data set T;
and eliminating the redundant indexes whose redundancy is greater than a preset redundancy threshold.
Further, selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index relevance ranking set with the redundant indexes removed, wherein the optimal index set comprises:
the objective function selected by designing the optimal index set is as follows:
Figure GDA0003810387740000051
wherein S is a subset of the candidate characteristic index correlation sorting set Q from which the redundant indexes have been removed, P(S) is the classification accuracy of a support vector machine classifier constructed from the index set S, |S| is the number of indexes in the index set S, |Q| is the number of indexes in the index set Q, and μ is a weight factor used to balance the classification accuracy against the number of indexes;
and solving the objective function to obtain an optimal index set.
In another aspect, an embodiment of the present invention further provides a system for selecting an evaluation index for a security situation of an electric power monitoring system, where the system includes:
the index acquisition module is used for acquiring candidate characteristic indexes for safety evaluation of the power monitoring system;
the sample acquisition module is used for constructing a training sample data set based on the candidate characteristic indexes;
the training module is used for calculating the relevance of the candidate characteristic indexes used for evaluating the safety state of the power monitoring system based on the training sample data set to obtain a candidate characteristic index relevance ranking set;
the screening module is used for eliminating the redundant indexes in the candidate characteristic index relevance sorting set to form a candidate characteristic index relevance sorting set with the redundant indexes eliminated;
and the optimizing module is used for selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index relevance sorting set in which the redundant indexes are removed.
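The five modules form a linear pipeline, which the following minimal sketch illustrates. The class name `IndexSelectionSystem`, its field names and the stub callables in the usage example are hypothetical and chosen only for illustration; the source does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class IndexSelectionSystem:
    """Hypothetical wiring of the five modules; each field stands for one module."""
    acquire_indexes: Callable[[], List[str]]                 # index acquisition module
    build_samples: Callable[[List[str]], list]               # sample acquisition module
    rank_by_relevance: Callable[[list, List[str]], List[str]]  # training module
    remove_redundant: Callable[[list, List[str]], List[str]]   # screening module
    choose_optimal: Callable[[list, List[str]], List[str]]     # optimizing module

    def run(self) -> List[str]:
        candidates = self.acquire_indexes()
        T = self.build_samples(candidates)          # training sample data set
        R = self.rank_by_relevance(T, candidates)   # relevance-ranked set
        Q = self.remove_redundant(T, R)             # redundancy removed
        return self.choose_optimal(T, Q)            # optimal index set


# Usage with stub modules (placeholder logic, for illustration only).
system = IndexSelectionSystem(
    acquire_indexes=lambda: ["traffic_rate", "cpu_load", "alarm_count"],
    build_samples=lambda names: [[0.1, 0.2, 0.3, 1]],
    rank_by_relevance=lambda T, names: list(reversed(names)),
    remove_redundant=lambda T, R: R[:2],
    choose_optimal=lambda T, Q: Q[:1],
)
selected = system.run()
```

Passing each module in as a callable keeps the pipeline order fixed while leaving the concrete ranking, screening and optimizing logic replaceable.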
Further, the training module is specifically configured to,
training a support vector machine model by adopting a training sample data set to obtain the weight of the candidate characteristic index;
calculating the correlation evaluation scores of the candidate characteristic indexes and the safety evaluation of the power monitoring system according to the weights of the candidate characteristic indexes;
and applying a sequential backward selection heuristic search to the candidate characteristic index set in a loop, each time removing the candidate characteristic index with the smallest correlation evaluation score from the set and placing it in turn into a candidate characteristic index correlation sorting set, to finally obtain a set of candidate characteristic indexes sorted by correlation from large to small.
Furthermore, the screening module is specifically configured to,
taking the Pearson correlation coefficient as the redundancy between indexes in the candidate characteristic index correlation sorting set:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i=1}^{m} (s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{m} (s_i - \bar{s})^2}\,\sqrt{\sum_{i=1}^{m} (g_i - \bar{g})^2}}$$
wherein Pearson(s, g) is the Pearson correlation coefficient of two candidate feature indexes s and g in the candidate feature index correlation sorting set, $\bar{s}$ and $\bar{g}$ respectively represent the average values of the two candidate characteristic indexes in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate characteristic indexes in the i-th training sample record in the training sample data set T;
and eliminating the redundant indexes whose redundancy is greater than a preset redundancy threshold.
Further, the optimizing module is specifically configured to,
the objective function selected by designing the optimal index set is as follows:
Figure GDA0003810387740000061
wherein S is a subset of the candidate characteristic index correlation sorting set Q from which the redundant indexes have been removed, P(S) is the classification accuracy of a support vector machine classifier constructed from the index set S, |S| is the number of indexes in the index set S, |Q| is the number of indexes in the index set Q, and μ is a weight factor used to balance the classification accuracy against the number of indexes;
and solving the objective function to obtain an optimal index set.
Compared with the prior art, the method for selecting the evaluation index of the safety situation of the power monitoring system has the following advantages:
Starting from two aspects, namely the correlation between the indexes and the evaluated safety state and the redundancy among the indexes, the method sorts the candidate characteristic indexes by correlation, removes redundant indexes by setting a redundancy threshold, and finally selects the optimal evaluation indexes with an optimal index set selection algorithm. This helps reduce the overhead of data acquisition, storage, transmission and processing, improves the performance of real-time evaluation of the safety situation of the power monitoring system, and supports accurate subsequent evaluation of that safety situation.
Drawings
Fig. 1 is a flowchart of a method for selecting an evaluation index for a security situation of an electric power monitoring system according to the present invention.
Detailed Description
The invention is further described below. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
In one aspect, an embodiment of the invention provides a method for selecting indexes for evaluating the safety situation of a power monitoring system, in which indexes are selected on two grounds: the correlation between each index and the safety state of the power monitoring system, and the redundancy among the indexes. First, starting from the factors that affect the safety of the power monitoring system, historical data reflecting its safety state are acquired, standardized and then manually labeled to determine the safety state corresponding to each standardized sample data record, and a training sample data set is constructed. Next, the degree to which each candidate index can discriminate the safety state of the power monitoring system is analyzed, and the candidate indexes are ordered by their correlation with the safety state. Finally, on the basis of this correlation ranking, the redundancy between every two indexes is calculated, indexes with high redundancy are removed, and an optimal index set is selected.
Referring to fig. 1, the specific implementation process is as follows:
the first step is as follows: constructing a training sample data set, specifically as follows:
analyzing safety factors influencing the electric power monitoring system from three aspects of flow characteristics, equipment running states and alarm logs, selecting n characteristic indexes capable of reflecting the safety states of the electric power monitoring system, and collecting data of the corresponding characteristic indexes.
Selecting collected historical data reflecting the safety state of the power monitoring system to construct a sample data set D, and representing the sample data set D as follows:
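$$D = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$$
wherein $X_i$ ($i \in [1, m]$) is the i-th data record reflecting the safety state of the power monitoring system, and $x_{ij}$ represents the value of the j-th ($j \in [1, n]$) characteristic index in the i-th data record.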
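The original sample data set D is standardized by min-max normalization, generating a standardized sample data set D' as follows:
$$D' = \begin{bmatrix} X'_1 \\ X'_2 \\ \vdots \\ X'_m \end{bmatrix} = \begin{bmatrix} x'_{11} & x'_{12} & \cdots & x'_{1n} \\ x'_{21} & x'_{22} & \cdots & x'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x'_{m1} & x'_{m2} & \cdots & x'_{mn} \end{bmatrix}$$
wherein:
$$x'_{ij} = \frac{x_{ij} - \min_{1 \le k \le m} x_{kj}}{\max_{1 \le k \le m} x_{kj} - \min_{1 \le k \le m} x_{kj}}$$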
The standardized sample data set D' is manually labeled according to expert experience and to facts such as whether a safety problem actually occurred in the historical data, recording the safety state of the power monitoring system corresponding to each standardized sample data record, and a training sample data set T is generated, expressed as follows:
$$T = \begin{bmatrix} T_1 \\ T_2 \\ \vdots \\ T_m \end{bmatrix} = \begin{bmatrix} X'_1 & y_1 \\ X'_2 & y_2 \\ \vdots & \vdots \\ X'_m & y_m \end{bmatrix}$$
wherein $T_i$ ($i \in [1, m]$) is the i-th training sample record, and $y_i$ ($y_i \in \{0, 1\}$) represents the safety state of the power monitoring system corresponding to the i-th training sample record, the value 1 representing a safe state and the value 0 representing a dangerous state.
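The first step can be sketched in a few lines of Python. This is a minimal illustration, assuming column-wise min-max scaling as described above; the function names, the three example indexes and all numeric values are hypothetical, and the handling of a constant column (mapped to 0.0) is an assumption the source does not address.

```python
from typing import List, Sequence


def min_max_normalize(D: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each column (one candidate index) of D to the range [0, 1]."""
    columns = list(zip(*D))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            # Assumption: a constant column carries no information, so map it to 0.0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in D
    ]


def build_training_set(D_norm: List[List[float]], labels: Sequence[int]) -> List[list]:
    """Append the manual label y_i (1 = safe, 0 = dangerous) to each normalized record."""
    if len(D_norm) != len(labels):
        raise ValueError("one label is required per record")
    return [list(row) + [y] for row, y in zip(D_norm, labels)]


# Hypothetical raw records: [traffic rate, CPU load, alarm count]
D = [
    [120.0, 3.0, 0.0],
    [480.0, 9.0, 4.0],
    [300.0, 6.0, 2.0],
]
labels = [1, 0, 1]  # manual labels supplied by an expert

D_norm = min_max_normalize(D)
T = build_training_set(D_norm, labels)
```

Each row of `T` then corresponds to one training record $T_i$, with the n normalized index values followed by its label $y_i$.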
The second step: the n characteristic indexes selected in the first step as reflecting the safety state of the power monitoring system are taken as candidate indexes, the correlation between each candidate index and the safety evaluation of the power monitoring system is analyzed, and the candidate indexes are prioritized by how well each can discriminate the safety state of the power monitoring system;
the degree to which each candidate index can discriminate the safety state of the power monitoring system is analyzed with a Support Vector Machine (SVM) classification algorithm based on a linear kernel function, as follows:
the classification function of a Support Vector Machine (SVM) is:
$$f(X) = \operatorname{sgn}\left(\sum_{i=1}^{m} \alpha_i y_i \,(X'_i \cdot X) + b\right)$$
wherein the independent variable X represents a standardized data record to be classified that reflects the safety state of the power monitoring system; f(X) represents the classification result obtained from X, namely the corresponding safety state of the power monitoring system; $X'_i$ is the i-th data record in the standardized sample data set D' reflecting the safety state of the power monitoring system; $y_i$ ($y_i \in \{0, 1\}$) represents the safety state of the power monitoring system corresponding to the i-th data record in D', the value 1 representing a safe state and the value 0 representing a dangerous state; $\alpha_i$ is a Lagrange multiplier; sgn(·) is the sign function; b is a threshold; and $\alpha_i$ and b are parameters obtained by model training.
Training a support vector machine model by using a training sample data set T to obtain a coefficient omega of an independent variable X:
$$\omega = \sum_{i=1}^{m} \alpha_i y_i X'_i$$
then the j-th ($j \in [1, n]$) element of the vector ω,
$$\omega_j = \sum_{i=1}^{m} \alpha_i y_i x'_{ij}$$
represents the weight $\omega_j$ of the j-th candidate index in the SVM classification model. The square of the weight of each candidate index in the SVM classification model is taken as the evaluation score $c_j$ for judging the correlation, namely:
$$c_j = \omega_j^2 \quad (j \in [1, n])$$
The larger the value of $c_j$, the greater the correlation between the j-th candidate index and the safety assessment of the power monitoring system.
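The weight and score formulas above can be checked with a small numeric sketch. The multipliers `alphas` below are hypothetical values chosen for easy arithmetic, not the output of a trained model, and mapping the label 0 to −1 before forming the sum is an assumption borrowed from the usual SVM convention (the source writes the labels as 0 and 1).

```python
from typing import List, Sequence


def svm_weights(alphas: Sequence[float],
                labels: Sequence[int],
                X_norm: Sequence[Sequence[float]]) -> List[float]:
    """Compute omega_j = sum_i alpha_i * y_i * x'_ij for every index j."""
    # Assumption: labels 1/0 are mapped to +1/-1, as in the usual SVM formulation.
    signed = [1 if y == 1 else -1 for y in labels]
    n = len(X_norm[0])
    return [
        sum(a * s * row[j] for a, s, row in zip(alphas, signed, X_norm))
        for j in range(n)
    ]


def relevance_scores(weights: Sequence[float]) -> List[float]:
    """Relevance score c_j = omega_j squared."""
    return [w * w for w in weights]


# Hypothetical multipliers and normalized records (3 records, 3 indexes).
alphas = [0.5, 1.0, 0.5]
labels = [1, 0, 1]
X_norm = [
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 0.5],
    [0.5, 1.0, 0.5],
]

omega = svm_weights(alphas, labels, X_norm)   # [-0.75, 1.0, 0.0]
scores = relevance_scores(omega)              # [0.5625, 1.0, 0.0]
```

In this toy example the third index takes the same value in every record, so its weight and score are zero and it would be the first candidate removed.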
A sequential backward selection heuristic search is then applied in a loop to the set S formed by the n candidate indexes. In each iteration, the correlation analysis described above is applied to the indexes remaining in S, the candidate index with the smallest correlation with the safety evaluation of the power monitoring system is removed from S, and the removed index is placed in turn into the candidate index correlation sorting set R, which finally yields the candidate indexes sorted by correlation from large to small.
The specific algorithm of the correlation analysis of the candidate indexes and the safety assessment of the power monitoring system is as follows:
Figure GDA0003810387740000091
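A minimal sketch of the backward-elimination ranking described above follows; the pattern resembles recursive feature elimination with a linear SVM. To stay self-contained it trains the linear model with a plain full-batch subgradient descent on the regularized hinge loss, which is a stand-in for a real SVM solver and not the training procedure of the source. The function names, hyperparameters and toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     epochs: int = 200, lr: float = 0.1,
                     reg: float = 0.01) -> Tuple[List[float], float]:
    """Stand-in trainer: subgradient descent on the regularized hinge loss."""
    n, m = len(X[0]), len(X)
    w, b = [0.0] * n, 0.0
    signed = [1.0 if t == 1 else -1.0 for t in y]  # map labels 1/0 to +1/-1
    for _ in range(epochs):
        gw, gb = [reg * wj for wj in w], 0.0
        for row, s in zip(X, signed):
            margin = s * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(n):
                    gw[j] -= s * row[j] / m
                gb -= s / m
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b


def rank_by_backward_elimination(X: Sequence[Sequence[float]],
                                 y: Sequence[int]) -> List[int]:
    """Return column positions ordered from most to least relevant."""
    remaining = list(range(len(X[0])))
    eliminated: List[int] = []
    while remaining:
        sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(sub, y)
        scores = [wj * wj for wj in w]           # c_j = omega_j^2
        worst = scores.index(min(scores))        # least relevant remaining index
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]                      # removed last = most relevant


# Toy data: column 0 separates the classes, column 1 carries no signal,
# column 2 is a weaker (half-scaled) copy of column 0.
X = [
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 0.1],
    [0.8, 0.0, 0.4],
    [1.0, 0.0, 0.5],
]
y = [0, 0, 1, 1]
ranking = rank_by_backward_elimination(X, y)
```

The model is retrained after every removal, so each score reflects an index's weight among the indexes still in the set rather than its weight in the initial full model.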
the third step: removing the redundant indexes in the candidate index correlation sorting set R, which is concretely as follows:
and sequentially calculating the redundancy between every two indexes in the candidate index relevance sorting set R by adopting the Pearson correlation coefficient, setting a redundancy threshold, and eliminating the indexes with the redundancy greater than the threshold to obtain a candidate index relevance sorting set Q with the redundancy indexes eliminated. The Pearson correlation coefficient formula is as follows:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i=1}^{m} (s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{m} (s_i - \bar{s})^2}\,\sqrt{\sum_{i=1}^{m} (g_i - \bar{g})^2}}$$
wherein s and g respectively represent two candidate indexes in the candidate index correlation sorting set R, $\bar{s}$ and $\bar{g}$ respectively represent the mean values of the two candidate indexes over all sample records in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate indexes in the i-th training sample record in T.
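The third step can be sketched as follows. Two choices in this sketch are assumptions that the source does not state: the absolute value of the coefficient is compared with the threshold, and of two redundant indexes the one ranked higher in R is kept. The index names, data and the threshold 0.9 are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(s: Sequence[float], g: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long value sequences."""
    mean_s, mean_g = sum(s) / len(s), sum(g) / len(g)
    cov = sum((a - mean_s) * (b - mean_g) for a, b in zip(s, g))
    norm = (math.sqrt(sum((a - mean_s) ** 2 for a in s))
            * math.sqrt(sum((b - mean_g) ** 2 for b in g)))
    return cov / norm if norm else 0.0  # constant column: treat as uncorrelated


def drop_redundant(R: Sequence[str], columns: Dict[str, Sequence[float]],
                   threshold: float = 0.9) -> List[str]:
    """Walk R from most to least relevant, keeping an index only if it is not
    redundant with any index already kept."""
    Q: List[str] = []
    for name in R:
        if all(abs(pearson(columns[name], columns[kept])) <= threshold for kept in Q):
            Q.append(name)
    return Q


# Hypothetical normalized columns of the training set T.
columns = {
    "traffic_rate": [0.0, 0.25, 0.5, 1.0],
    "packet_rate": [0.1, 0.225, 0.35, 0.6],   # linear function of traffic_rate
    "alarm_count": [1.0, 0.0, 1.0, 0.0],
}
R = ["traffic_rate", "packet_rate", "alarm_count"]  # relevance order from step two
Q = drop_redundant(R, columns)
```

Here `packet_rate` is an exact linear function of `traffic_rate`, so their coefficient is 1 and the lower-ranked of the two is dropped, while `alarm_count` is kept.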
The fourth step: and selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate index correlation sorting set Q from which the redundant indexes are removed.
The selection of the optimal index set mainly considers two aspects: the model trained on the sample data set built from the index set should classify accurately, and the index set should not contain too many indexes. The objective function for selecting the optimal index set is as follows:
Figure GDA0003810387740000101
wherein, S is a subset of the candidate index relevance ranking set Q from which the redundancy index is removed, P (S) is the classification accuracy of a Support Vector Machine (SVM) classifier constructed according to the index set S, | S | is the index number of the index set S, | Q | is the index number of the index set Q, and μ is a weighting factor used for balancing the classification accuracy and the index number. The larger the value of the objective function, the better the set of selected metrics.
The specific algorithm for selecting the optimal index set is as follows:
Figure GDA0003810387740000102
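A minimal sketch of the fourth step follows. Both the objective and the search strategy used here are assumptions: the objective is taken to be the weighted sum μ·P(S) + (1 − μ)·(1 − |S|/|Q|), one common way to trade accuracy against set size, and the search only evaluates the top-k prefixes of Q. The accuracy figures are invented for illustration; in practice P(S) would come from training and validating an SVM classifier on the indexes in S.

```python
from typing import Callable, List, Sequence, Tuple


def objective(accuracy: float, size: int, total: int, mu: float = 0.8) -> float:
    """Assumed objective: mu * P(S) + (1 - mu) * (1 - |S| / |Q|)."""
    return mu * accuracy + (1.0 - mu) * (1.0 - size / total)


def select_optimal_prefix(Q: Sequence[str],
                          accuracy_of: Callable[[Sequence[str]], float],
                          mu: float = 0.8) -> Tuple[List[str], float]:
    """Evaluate the top-k prefixes of Q and return the one with the largest objective."""
    best_set: List[str] = []
    best_value = float("-inf")
    for k in range(1, len(Q) + 1):
        S = list(Q[:k])
        value = objective(accuracy_of(S), k, len(Q), mu)
        if value > best_value:
            best_set, best_value = S, value
    return best_set, best_value


# Hypothetical relevance-ordered, redundancy-free index set Q.
Q = ["traffic_rate", "alarm_count", "cpu_load", "login_failures"]

# Invented validation accuracies by subset size, standing in for P(S).
accuracy_by_size = {1: 0.80, 2: 0.93, 3: 0.94, 4: 0.94}
best_set, best_value = select_optimal_prefix(Q, lambda S: accuracy_by_size[len(S)])
```

With these invented accuracies, adding the third and fourth indexes raises P(S) by at most 0.01, which does not offset the size penalty, so the two-index set is selected.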
Finally, the optimal index set for evaluating the safety situation of the power monitoring system is obtained after characteristic correlation sorting and redundant index deletion. This reduces the overhead of data acquisition, storage, transmission and processing, improves the performance of real-time evaluation of the safety situation of the power monitoring system, and supports accurate subsequent evaluation of that safety situation.
In another aspect, an embodiment of the present invention further provides a system for selecting an evaluation index for a security situation of an electric power monitoring system, where the system includes:
the index acquisition module is used for acquiring candidate characteristic indexes for safety evaluation of the power monitoring system;
the sample acquisition module is used for constructing a training sample data set based on the candidate characteristic indexes;
the training module is used for calculating the relevance of the candidate characteristic indexes used for evaluating the safety state of the power monitoring system based on the training sample data set to obtain a candidate characteristic index relevance ranking set;
the screening module is used for rejecting redundant indexes in the candidate characteristic index correlation sorting set to form a candidate characteristic index correlation sorting set with the redundant indexes rejected;
and the optimizing module is used for selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index correlation sorting set from which the redundant indexes have been removed.
Further, the training module is specifically configured to,
training a support vector machine model by adopting a training sample data set to obtain the weight of the candidate characteristic index;
calculating the correlation evaluation scores of the candidate characteristic indexes and the safety evaluation of the power monitoring system according to the weights of the candidate characteristic indexes;
and iteratively processing the candidate characteristic index set by using a sequential backward selection heuristic search algorithm, each time removing the candidate characteristic index with the smallest correlation evaluation score from the candidate characteristic index set and placing it in turn into the candidate characteristic index correlation sorting set, to finally obtain a set sorted by candidate characteristic index correlation from largest to smallest.
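The three operations of the training module can be sketched as follows. This is a simplified illustration under stated assumptions: the text does not say how the support vector machine is trained, so `train_linear_svm` is a stand-in that fits a linear SVM by subgradient descent on the hinge loss, and the {0, 1} safety labels are mapped to {-1, +1} for that purpose. The function names, hyperparameters, and toy data are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM (weights w, bias b) by full-batch subgradient
    descent on the L2-regularised hinge loss. Labels must be -1 or +1."""
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(n):
                    grad_w[j] -= yi * xi[j] / m
                grad_b -= yi / m
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_indexes(X, labels, names):
    """Return index names ordered from most to least relevant."""
    y = [1 if label == 1 else -1 for label in labels]  # map {0, 1} to {-1, +1}
    remaining = list(range(len(names)))
    removed = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        scores = [wj * wj for wj in w]  # relevance score: squared weight
        worst = min(range(len(scores)), key=scores.__getitem__)
        removed.append(remaining.pop(worst))
    # The index removed last is the most relevant, so reverse the order.
    return [names[j] for j in reversed(removed)]


# Toy data: "a" separates the classes strongly, "b" weakly, "c" is constant.
X = [[1.0, 0.5, 0.0], [0.9, 0.3, 0.0], [-1.0, -0.4, 0.0], [-0.8, -0.2, 0.0]]
labels = [1, 1, 0, 0]
ranking = rank_indexes(X, labels, ["a", "b", "c"])
```

The model is retrained after every removal, so each score reflects the indexes still in the set rather than the original full set.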
Furthermore, the screening module is specifically configured to,
taking the Pearson correlation coefficient as the redundancy between indexes in the candidate characteristic index correlation sorting set:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i}(s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i}(s_i - \bar{s})^2}\,\sqrt{\sum_{i}(g_i - \bar{g})^2}}$$
wherein Pearson(s, g) is the Pearson correlation coefficient of two candidate characteristic indexes s and g in the candidate characteristic index correlation sorting set, $\bar{s}$ and $\bar{g}$ respectively represent the mean values of the two candidate characteristic indexes in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate characteristic indexes in the ith training sample record in the training sample data set T;
and setting a redundancy threshold and removing the redundant indexes whose redundancy is greater than the threshold.
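The screening step can be sketched as below. Several details are assumptions of the example rather than statements of the text: the absolute value of the coefficient is compared with the threshold, the higher-ranked index of a redundant pair is the one kept, a constant column is treated as having zero correlation, and the threshold of 0.9 is arbitrary. The names `pearson` and `remove_redundant` are hypothetical.

```python
import math


def pearson(s, g):
    """Pearson correlation coefficient of two equally long value lists."""
    s_mean = sum(s) / len(s)
    g_mean = sum(g) / len(g)
    cov = sum((si - s_mean) * (gi - g_mean) for si, gi in zip(s, g))
    ss = sum((si - s_mean) ** 2 for si in s)
    gg = sum((gi - g_mean) ** 2 for gi in g)
    if ss == 0.0 or gg == 0.0:
        return 0.0  # constant column: coefficient undefined, treated as 0 here
    return cov / math.sqrt(ss * gg)


def remove_redundant(ranked, columns, threshold=0.9):
    """Walk the ranked indexes from most to least relevant and keep an index
    only if it is not too strongly correlated with any index already kept."""
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy columns: "b" is almost a multiple of "a"; "c" is weakly related to "a".
columns = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],
    "c": [1.0, -1.0, 1.0, -1.0],
}
kept = remove_redundant(["a", "b", "c"], columns)
```

Because the input is already ordered by relevance, the index dropped from a redundant pair is always the less relevant one.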
Further, the optimizing module is specifically configured to,
designing the objective function for selecting the optimal index set as follows:
Figure GDA0003810387740000113
wherein S is a subset of the candidate characteristic index relevance ordering set Q from which the redundant indexes have been removed, P(S) is the classification accuracy of a support vector machine classifier constructed from the index set S, |S| is the number of indexes in the index set S, |Q| is the number of indexes in the index set Q, and μ is a weighting factor used to balance the classification accuracy against the number of indexes;
and solving the objective function to obtain the optimal index set.
It is to be noted that the apparatus embodiment corresponds to the method embodiment, and the implementation manners of the method embodiment are all applicable to the apparatus embodiment and can achieve the same or similar technical effects, so that the details are not described herein.
The invention starts from two aspects: the correlation between the indexes and the evaluated safety state, and the redundancy between the indexes. The correlation between the indexes and the evaluated safety state is analyzed using a classification algorithm based on a support vector machine together with a sequential backward selection heuristic search algorithm; the redundancy between the indexes is calculated using the Pearson correlation coefficient; redundant indexes are removed by setting a redundancy threshold; and the optimal evaluation indexes are finally selected using an optimal index set selection algorithm. This helps reduce the overhead of data acquisition, storage, transmission, and processing, improves the performance of real-time evaluation of the safety situation of the power monitoring system, and supports subsequent accurate evaluation of that safety situation.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A method for selecting safety situation assessment indexes of a power monitoring system is characterized by comprising the following steps:
acquiring candidate characteristic indexes for safety evaluation of the power monitoring system;
constructing a training sample data set based on the candidate characteristic indexes;
calculating the relevance of the candidate characteristic indexes used for evaluating the safety state of the power monitoring system based on the training sample data set to obtain a candidate characteristic index relevance ranking set, which comprises: training a support vector machine model with the training sample data set to obtain the weights of the candidate characteristic indexes; calculating, according to the weights of the candidate characteristic indexes, the relevance evaluation score of each candidate characteristic index with respect to the safety evaluation of the power monitoring system; and iteratively processing the candidate characteristic index set using a sequential backward selection heuristic search algorithm, each time removing the candidate characteristic index with the smallest relevance evaluation score from the candidate characteristic index set and placing it in turn into the candidate characteristic index relevance ranking set, to finally obtain a set ranked by candidate characteristic index relevance from largest to smallest;
removing the redundant indexes in the candidate characteristic index relevance sorting set to form a candidate characteristic index relevance sorting set with the redundant indexes removed;
selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index relevance sorting set from which the redundant indexes have been removed, which comprises:
designing the objective function for selecting the optimal index set as follows:
Figure FDA0003810387730000011
wherein S is a subset of the candidate characteristic index relevance ordering set Q from which the redundant indexes have been removed, P(S) is the classification accuracy of a support vector machine classifier constructed from the index set S, |S| is the number of indexes in the index set S, |Q| is the number of indexes in the index set Q, and μ is a weighting factor used to balance the classification accuracy against the number of indexes;
and solving the objective function to obtain an optimal index set.
2. The method for selecting the evaluation index of the safety situation of the power monitoring system according to claim 1, wherein the obtaining of the candidate characteristic index for the safety evaluation of the power monitoring system comprises:
selecting, from traffic characteristics, equipment operating states, and alarm logs, a plurality of candidate characteristic indexes capable of reflecting the safety state of the power monitoring system.
3. The method for selecting the evaluation index of the safety situation of the power monitoring system according to claim 1, wherein the constructing a training sample data set based on the candidate feature index comprises:
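collecting historical data of the candidate characteristic indexes for safety evaluation of the power monitoring system to form an original sample data set D:
$$D = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$$
wherein $X_i$, $i \in [1, m]$, is the ith historical data record reflecting the safety state of the power monitoring system, m is the number of collected historical data records, $x_{ij}$ represents the value of the jth candidate characteristic index in the ith historical data record, $j \in [1, n]$, and n is the number of candidate characteristic indexes;
normalizing the original sample data set D to generate a normalized sample data set D':
$$D' = \begin{bmatrix} X'_1 \\ X'_2 \\ \vdots \\ X'_m \end{bmatrix} = \begin{bmatrix} x'_{11} & x'_{12} & \cdots & x'_{1n} \\ x'_{21} & x'_{22} & \cdots & x'_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x'_{m1} & x'_{m2} & \cdots & x'_{mn} \end{bmatrix}$$
wherein $X'_i$ is the normalized sample data record of $X_i$, and $x'_{ij}$ is the normalized value of $x_{ij}$;
$x'_{ij}$ is calculated as follows: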
Figure FDA0003810387730000023
manually labeling the normalized sample data set D' to generate a training sample data set T:
Figure FDA0003810387730000024
wherein $T_i$, $i \in [1, m]$, is the ith training sample record, and $y_i \in \{0, 1\}$ represents the safety state of the power monitoring system corresponding to the ith training sample record, a value of 1 representing a safe state and a value of 0 representing a dangerous state.
4. The method for selecting the evaluation index of the safety situation of the power monitoring system according to claim 1, wherein the training of the support vector machine model by using the training sample data set to obtain the weight of the candidate characteristic index comprises:
the classification function of the support vector machine is:
$$f(X) = \mathrm{sgn}\left(\sum_{i=1}^{m} \alpha_i y_i \left(X'_i \cdot X\right) + b\right)$$
wherein the independent variable X represents a normalized data record to be evaluated that reflects the safety state of the power monitoring system, f(X) represents the classification result obtained from the independent variable X, $\alpha_i$ is a Lagrange multiplier, sgn(·) is the sign function, b is a threshold, $\alpha_i$ and b are obtained by model training, $X'_i$ is a normalized training sample data record, and $y_i$ represents the safety state of the power monitoring system corresponding to the ith training sample record;
training a support vector machine model by using a training sample data set to obtain a coefficient omega of an independent variable X:
$$\omega = \sum_{i=1}^{m} \alpha_i y_i X'_i$$
then the jth element of the vector $\omega$,
$$\omega_j = \sum_{i=1}^{m} \alpha_i y_i x'_{ij},$$
represents the weight of the jth candidate characteristic index, wherein $x'_{ij}$ is the value of the jth candidate characteristic index in the ith normalized training data record, m is the number of training data records, and n is the number of candidate characteristic indexes.
5. The method for selecting the evaluation index of the safety situation of the power monitoring system according to claim 4, wherein the calculating the relevance evaluation score of the candidate characteristic index and the safety evaluation of the power monitoring system according to the weight of the candidate characteristic index comprises:
$$c_j = \omega_j^2$$
wherein $c_j$ is the relevance evaluation score of the jth candidate characteristic index, and $\omega_j$ is the weight of the jth candidate characteristic index.
6. The method for selecting the evaluation index of the safety situation of the power monitoring system according to claim 1, wherein the removing of the redundant indexes in the candidate feature index relevance ranking set forms the candidate feature index relevance ranking set from which the redundant indexes are removed, and comprises:
taking the Pearson correlation coefficient as the redundancy between indexes in the candidate characteristic index correlation sorting set:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i}(s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i}(s_i - \bar{s})^2}\,\sqrt{\sum_{i}(g_i - \bar{g})^2}}$$
wherein Pearson(s, g) is the Pearson correlation coefficient of two candidate characteristic indexes s and g in the candidate characteristic index correlation sorting set, $\bar{s}$ and $\bar{g}$ respectively represent the mean values of the two candidate characteristic indexes in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate characteristic indexes in the ith training sample record in the training sample data set T;
and removing the redundant indexes whose redundancy is greater than a preset redundancy threshold.
7. A power monitoring system oriented security situation assessment index selection system is characterized in that the system is used for selecting the power monitoring system oriented security situation assessment index by adopting the power monitoring system oriented security situation assessment index selection method of any one of claims 1 to 6, and the system comprises:
the index acquisition module is used for acquiring candidate characteristic indexes for safety evaluation of the power monitoring system;
the sample acquisition module is used for constructing a training sample data set based on the candidate characteristic indexes;
the training module is used for calculating the relevance of the candidate characteristic indexes used for evaluating the safety state of the power monitoring system based on the training sample data set to obtain a candidate characteristic index relevance ranking set;
the screening module is used for eliminating the redundant indexes in the candidate characteristic index relevance sorting set to form a candidate characteristic index relevance sorting set with the redundant indexes eliminated;
and the optimizing module is used for selecting an optimal index set for evaluating the safety situation of the power monitoring system from the candidate characteristic index relevance sorting set from which the redundant indexes have been removed.
8. The system for selecting the evaluation index of the safety situation of the power monitoring system according to claim 7, wherein the training module is specifically configured to,
training a support vector machine model by adopting a training sample data set to obtain the weight of the candidate characteristic index;
calculating the correlation evaluation scores of the candidate characteristic indexes and the safety evaluation of the power monitoring system according to the weights of the candidate characteristic indexes;
and iteratively processing the candidate characteristic index set by using a sequential backward selection heuristic search algorithm, each time removing the candidate characteristic index with the smallest correlation evaluation score from the candidate characteristic index set and placing it in turn into the candidate characteristic index correlation sorting set, to finally obtain a set sorted by candidate characteristic index correlation from largest to smallest.
9. The system for selecting the evaluation index of the safety situation of the power monitoring system according to claim 7, wherein the screening module is specifically configured to,
taking the Pearson correlation coefficient as the redundancy between indexes in the candidate characteristic index correlation sorting set:
$$\mathrm{Pearson}(s, g) = \frac{\sum_{i}(s_i - \bar{s})(g_i - \bar{g})}{\sqrt{\sum_{i}(s_i - \bar{s})^2}\,\sqrt{\sum_{i}(g_i - \bar{g})^2}}$$
wherein Pearson(s, g) is the Pearson correlation coefficient of two candidate characteristic indexes s and g in the candidate characteristic index correlation sorting set, $\bar{s}$ and $\bar{g}$ respectively represent the mean values of the two candidate characteristic indexes in the training sample data set T, and $s_i$ and $g_i$ respectively represent the values of the two candidate characteristic indexes in the ith training sample record in the training sample data set T;
and removing the redundant indexes whose redundancy is greater than a preset redundancy threshold.
10. The system for selecting safety situation assessment indexes of power monitoring systems according to claim 7, wherein the optimizing module is specifically configured to,
designing the objective function for selecting the optimal index set as follows:
Figure FDA0003810387730000051
wherein S is a subset of the candidate characteristic index relevance ordering set Q from which the redundant indexes have been removed, P(S) is the classification accuracy of a support vector machine classifier constructed from the index set S, |S| is the number of indexes in the index set S, |Q| is the number of indexes in the index set Q, and μ is a weighting factor used to balance the classification accuracy against the number of indexes;
and solving the objective function to obtain an optimal index set.
CN202010370523.0A 2020-05-06 2020-05-06 Power monitoring system oriented safety situation assessment index selection method and system Active CN111614491B (en)

Publications (2)

Publication Number Publication Date
CN111614491A CN111614491A (en) 2020-09-01
CN111614491B true CN111614491B (en) 2022-10-04

Family

ID=72201998

Country Status (1)

Country Link
CN (1) CN111614491B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant