CN117609862A - Power grid data anomaly level determination method, device, equipment and medium - Google Patents


Info

Publication number
CN117609862A
Authority
CN
China
Prior art keywords
data
power grid
abnormal
historical
grid data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311620857.9A
Other languages
Chinese (zh)
Inventor
梁庆华
许江移
张国雄
陈剑文
杜辉
黄远东
钟文
吴艳萍
黎健良
杨国军
张帆
梁慧
陈志华
魏武林
潘世驹
牟峻林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Application filed by Guangdong Power Grid Co Ltd and Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN202311620857.9A
Publication of CN117609862A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method, a device, equipment and a medium for determining the anomaly level of power grid data. The method comprises: acquiring power grid data to be determined in real time; inputting the power grid data to be determined into a pre-trained binary classification model to determine an initial classification result of the power grid data; and, if the initial classification result indicates abnormal power grid data, inputting that abnormal data into a pre-trained anomaly level determination model to determine the anomaly type level result corresponding to the power grid data to be determined. The method addresses the problem that acquired power grid data cannot be accurately classified, improves the accuracy and reliability of power grid data classification, enables reasonable scheduling of resources, improves user defense measures, and better guarantees the security of the power grid.

Description

Power grid data anomaly level determination method, device, equipment and medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a medium for determining an abnormal level of power grid data.
Background
Support vector machine (SVM) algorithms are relatively mature in their application to intrusion detection models. However, in multi-class classification, parameter optimization of the SVM becomes very complex as the number of classes increases, because the SVM is very sensitive to parameter tuning and kernel selection. In the smart grid intrusion detection scenario in particular, it is therefore critical to determine the grid data type and anomaly level.
In implementing the present invention, the inventors found the following drawbacks in the prior art: power grid data are unevenly distributed, and in particular the number of normal data samples is far greater than the number of abnormal data samples. As a result, the accuracy of type determination for abnormal samples is far lower than for normal samples, so the type and anomaly level of power grid data cannot be accurately determined and handled, abnormal data cannot be identified in time, and the safety and reliability of the power grid are low.
Disclosure of Invention
The invention provides a method, a device, equipment and a medium for determining abnormal grades of power grid data, which are used for improving the accuracy and reliability of power grid data classification and realizing reasonable scheduling of resources.
According to an aspect of the present invention, there is provided a method for determining an anomaly level of power grid data, including:
acquiring power grid data to be determined in real time;
inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data;
the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result;
if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined;
wherein the anomaly level determination model is constructed based on a multi-class logistic regression algorithm.
According to another aspect of the present invention, there is provided a power grid data anomaly level determining apparatus, including:
the power grid data acquisition module is used for acquiring power grid data to be determined in real time;
The power grid data initial classification result determining module is used for inputting the power grid data to be determined into a pre-trained binary classification model to determine a power grid data initial classification result;
the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result;
the abnormal type grade result determining module is used for inputting the power grid abnormal data corresponding to the power grid abnormal data initial classification result into a pre-trained abnormal grade determining model if the power grid data initial classification result is the power grid abnormal data initial classification result, and determining an abnormal type grade result corresponding to the power grid data to be determined;
wherein the anomaly level determination model is constructed based on a multi-class logistic regression algorithm.
According to another aspect of the present invention, there is provided an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for determining an anomaly class of grid data according to any embodiment of the present invention when the processor executes the computer program.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the grid data anomaly level determination method according to any one of the embodiments of the present invention when executed.
According to the technical scheme, the power grid data to be determined are obtained in real time; inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data; if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined. The method solves the problem that the acquired power grid data cannot be accurately classified, improves the accuracy and reliability of the power grid data classification, can realize reasonable scheduling of resources, improves user defense measures, and better guarantees the safety performance of the power grid.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for determining an abnormal level of grid data according to a first embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a device for determining abnormal levels of grid data according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "target," "current," and the like in the description and claims of the present invention and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a method for determining an abnormal grade of power grid data according to an embodiment of the present invention, where the method may be performed by a device for determining an abnormal grade of power grid data, and the device may be implemented in hardware and/or software.
Accordingly, as shown in fig. 1, the method includes:
S110, acquiring power grid data to be determined in real time.
The power grid data to be determined can be data for describing the power grid state, and description of the current state of the power grid can be performed according to the power grid data acquired in real time.
S120, inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data.
The binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result.
The binary classification model may be a model capable of judging whether the grid data belongs to normal data or abnormal data.
In this embodiment, if it is determined that the grid data is normal data through the binary classification model, it may be determined that the initial classification result of the grid data corresponding to the grid data is the initial classification result of the normal data of the grid. If the power grid data is determined to be abnormal data through the binary classification model, the initial classification result of the power grid data corresponding to the power grid data can be determined to be the initial classification result of the abnormal data of the power grid.
Specifically, whether the power grid data are normal or not is determined through a binary classification model, and if the power grid data are normal, the current state of the power grid data is normal. Otherwise, if the power grid data is abnormal data, the power grid data is further subjected to abnormal grade determination processing, so that the power grid state can be mastered in time, property loss is reduced, and the safety and reliability of the power grid are improved.
In addition, the binary classification model is constructed based on a fuzzy support vector machine (Fuzzy Support Vector Machine, FSVM) algorithm, and the binary classification model can be used for judging whether the power grid data belongs to normal data or abnormal data.
Optionally, before the inputting the grid data to be determined into the pre-trained binary classification model, determining an initial classification result of the grid data, the method further includes: acquiring a plurality of historical power grid data and first historical power grid data labels respectively corresponding to the historical power grid data; the first historical power grid data tag comprises a first historical power grid normal data tag and a first historical power grid abnormal data tag; carrying out data preprocessing on each piece of historical power grid data through a preset data preprocessing method to obtain each standard historical power grid data; inputting the standard historical power grid data and the first historical power grid data labels into an initial binary classification model to perform model training, and if the binary classification accuracy meets a preset accuracy threshold, determining that the training is completed on the binary classification model.
In this embodiment, after each historical grid data is obtained, a first historical grid data tag and a second historical grid data tag corresponding to each historical grid data are also required to be obtained.
Specifically, the first historical grid data tag includes a first historical grid normal data tag and a first historical grid abnormal data tag. The second historical grid anomaly data tag may include a denial of service attack type, a remote host user unauthorized access attack type, an unauthorized local superuser privileged access attack type, a port monitored or scanned attack type, and the like.
Furthermore, data preprocessing operation is needed to be performed on the historical power grid data, so that standard historical power grid data is obtained, and therefore training processing can be further performed on the initial binary classification model, and a binary classification model is obtained.
In detail, the standard historical power grid data in the test set is required to be identified through a binary classification model in the training process, so that the binary classification accuracy is calculated according to the result output by the model and the standard classification result. Correspondingly, if the binary classification accuracy meets a preset accuracy threshold, the training of the binary classification model can be determined.
For example, assuming that the accuracy threshold is 99% and the calculated binary classification accuracy is 99.2%, the binary classification accuracy meets the preset accuracy threshold, and training of the binary classification model can be determined to be complete.
Optionally, after the standard historical grid data and the first historical grid data labels are input into the initial binary classification model for training of the model, the method further includes: and if the binary classification accuracy rate does not meet the preset accuracy rate threshold value, returning to execute the operation of acquiring a plurality of historical power grid data and the first historical power grid data labels respectively corresponding to the historical power grid data until the binary classification accuracy rate meets the preset accuracy rate threshold value, and determining that training is completed on the binary classification model.
Continuing the previous example, assume that the accuracy threshold is 99% but the calculated binary classification accuracy is 95%. The binary classification accuracy then does not meet the preset accuracy threshold, so the operation of acquiring a plurality of historical power grid data and the first historical power grid data labels respectively corresponding to the historical power grid data is executed again and the model is retrained, until the binary classification accuracy meets the preset accuracy threshold, at which point training of the binary classification model is determined to be complete.
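The retrain-until-threshold loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `train_until_accurate`, `fetch_batch` and `fit` are hypothetical placeholders, accumulating newly acquired data across rounds is an assumption, and the `max_rounds` cap is added only so that the loop always terminates. The default threshold of 0.99 mirrors the example in the text.

```python
from typing import Callable, List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label in {+1, -1})


def accuracy(predict: Callable[[List[float]], int], test_set: List[Sample]) -> float:
    """Fraction of test samples whose predicted label equals the true label."""
    if not test_set:
        raise ValueError("test set must not be empty")
    correct = sum(1 for x, y in test_set if predict(x) == y)
    return correct / len(test_set)


def train_until_accurate(fetch_batch, fit, test_set, threshold=0.99, max_rounds=10):
    """Acquire data, fit, and evaluate; repeat until accuracy >= threshold.

    fetch_batch() returns a list of labelled samples (hypothetical data source).
    fit(data) returns a predict(x) callable (hypothetical trainer).
    """
    data: List[Sample] = []
    for round_no in range(1, max_rounds + 1):
        data.extend(fetch_batch())         # "return to the acquisition step"
        predict = fit(data)                # retrain on everything seen so far
        acc = accuracy(predict, test_set)  # compare against the standard labels
        if acc >= threshold:
            return predict, acc, round_no  # training is considered complete
    raise RuntimeError("accuracy threshold not reached within max_rounds")
```

With a 99% threshold, a round that scores 95% triggers another acquisition and fit, while a round that scores 99.2% ends training.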
The advantages of this arrangement are that: the accuracy of the binary classification model for classifying the power grid data can be improved, so that the accuracy and the reliability of abnormal grades of the power grid data are improved.
Optionally, the performing data preprocessing on each historical power grid data by a preset data preprocessing method to obtain each standard historical power grid data includes: performing discrete character type data processing on each historical power grid data to obtain each historical power grid digital data; carrying out data standardization processing on each historical power grid digital data to obtain each initial standard historical power grid digital data; carrying out data normalization processing on the digital data of each initial standard historical power grid to obtain normalized data of each historical power grid; and carrying out data formatting processing on the normalized data of each historical power grid to obtain data of each standard historical power grid.
In this embodiment, data preprocessing operation is required to be performed on historical grid data, so that standard data is obtained, and model training operation can be performed on the binary classification model better.
Specifically, the data preprocessing operation includes discrete character type data processing, data standardization processing, data normalization processing, and data formatting processing.
In this embodiment, the historical grid data may be from a KDD Cup99 dataset, which is not specifically limited herein. Because most of the historical power grid data are data packets in the network, the data packets in the network contain discrete fields and continuous fields, and the discrete fields comprise discrete character types and discrete digital types, the historical power grid data need to be processed with discrete character type data at first, and the character type data are converted into digital data, so that the historical power grid digital data are obtained.
Furthermore, data standardization and data normalization eliminate the influence of measurement units on model training, so that the training result depends more on the characteristics of the data itself, which improves model prediction accuracy. Data formatting converts the network data packets into libsvm format, for example: Label 1:Value1 2:Value2 … i:Valuei … n:Valuen. Here Label is the category label, the sequence number i is the index of the i-th field, and Valuei is the value of the i-th field. The libsvm format is a data format commonly used by support vector machine classification tools and is convenient for such algorithms to process.
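As a small illustration of the formatting step, the sketch below writes one record in the conventional LIBSVM text layout, `label index:value index:value ...`, with 1-based field indices. The function name `to_libsvm_line` is hypothetical. Every field is written out, as in the description; LIBSVM files in practice often omit zero-valued fields, which this sketch does not do.

```python
from typing import Sequence


def to_libsvm_line(label: int, values: Sequence[float]) -> str:
    """Format one record as 'label 1:v1 2:v2 ... n:vn' (1-based indices)."""
    parts = [str(label)]
    for i, v in enumerate(values, start=1):
        # .6g keeps at most six significant digits and drops trailing zeros
        parts.append(f"{i}:{v:.6g}")
    return " ".join(parts)


line = to_libsvm_line(1, [0.0, 0.5, 1.0])
print(line)  # 1 1:0 2:0.5 3:1
```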
In particular, a KDD Cup99 dataset is taken as an example to describe in detail how the data preprocessor is implemented. The KDD Cup99 dataset contains 42 fields, 41 of which are network packet characteristic attribute fields, and the other is a label of the historical grid data record. The KDD Cup99 dataset contains 9 TCP connection basic features, 13 TCP connection content features, 9 time-based network traffic statistics and 10 machine-based network traffic statistics.
In detail, in these feature attribute fields, the data type includes discrete character type data, discrete numerical type data, and continuous type data. Discrete character type data needs to be processed into discrete numerical type data to participate in subsequent calculations in the data preprocessor.
Further, Table 1 below lists the discrete character type fields of the KDD Cup99 dataset and the value range of each field. If a discrete character type field takes n distinct values, the original values are replaced by the integers 0 to n-1.
TABLE 1
Take protocol_type as an example: it is a discrete field taking the values "TCP", "UDP" and "ICMP". "TCP" may be represented by 0, "UDP" by 1 and "ICMP" by 2. Similarly, the original values of the service field are replaced by 0 to 69, and those of the flag field by 0 to 10.
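The character-to-number replacement can be sketched as a lookup table. The `PROTOCOL_CODES` mapping follows the example in the text; `build_code_table` and `encode_field` are hypothetical helper names, and assigning codes in first-seen order is an assumption for fields such as service and flag, whose concrete value lists are not given here.

```python
from typing import Dict, Iterable

# Mapping stated in the description for the protocol_type field
PROTOCOL_CODES: Dict[str, int] = {"TCP": 0, "UDP": 1, "ICMP": 2}


def build_code_table(values: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct value an integer 0..n-1 in first-seen order."""
    table: Dict[str, int] = {}
    for v in values:
        key = v.upper()
        if key not in table:
            table[key] = len(table)
    return table


def encode_field(value: str, table: Dict[str, int]) -> int:
    """Replace a discrete character value by its numeric code."""
    key = value.upper()  # the raw data may use lower-case tokens such as "tcp"
    if key not in table:
        raise ValueError(f"unknown discrete value: {value!r}")
    return table[key]
```

A field with n distinct values thus always maps onto the integers 0 to n-1, so a 70-value field yields codes 0 to 69.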
Furthermore, data standardization and data normalization are performed on the historical power grid digital data. Specifically, the value of the j-th feature of the k-th labelled historical power grid digital data record is denoted $x_{kj}$ ($1 \le k \le n$, $0 \le j \le 40$), where k is the index of the record and j is the index of the feature. All discrete and continuous fields take part in standardization and normalization, and both are performed per feature attribute. First, the average value $AVG_j$ and the mean absolute deviation $STAD_j$ of the j-th feature attribute are calculated:
$$AVG_j = \frac{1}{n}\sum_{k=1}^{n} x_{kj}, \qquad STAD_j = \frac{1}{n}\sum_{k=1}^{n} \left| x_{kj} - AVG_j \right|$$
The standardized value $x'_{kj}$ is then:
$$x'_{kj} = \frac{x_{kj} - AVG_j}{STAD_j}$$
After data standardization, data normalization is performed on each initial standard historical power grid digital data. The value range of each field of the standardized data $x'_{kj}$ is normalized to [0, 1], which reduces the influence of large differences in data range on the training result. The normalization formula is:
$$x''_{kj} = \frac{x'_{kj} - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min} = \min\{x'_{kj}\}$ and $x_{\max} = \max\{x'_{kj}\}$, with $1 \le k \le n$ and $0 \le j \le 40$.
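The two scaling steps can be illustrated with a short stdlib-only sketch. It assumes that standardization subtracts the per-feature mean and divides by the per-feature mean absolute deviation, and that normalization is min-max scaling of each feature to [0, 1], as the quantities named in the text suggest. The function names are hypothetical, and mapping a constant feature to 0.0 is an added assumption to avoid division by zero.

```python
from typing import List


def standardize_column(col: List[float]) -> List[float]:
    """(x - AVG) / STAD, with STAD the mean absolute deviation of the column."""
    n = len(col)
    avg = sum(col) / n
    stad = sum(abs(x - avg) for x in col) / n
    if stad == 0:
        return [0.0] * n  # assumption: constant features carry no information
    return [(x - avg) / stad for x in col]


def min_max_column(col: List[float]) -> List[float]:
    """Scale a column to [0, 1] using its own minimum and maximum."""
    lo, hi = min(col), max(col)
    if hi == lo:
        return [0.0] * len(col)  # same assumption for constant features
    return [(x - lo) / (hi - lo) for x in col]


def preprocess(records: List[List[float]]) -> List[List[float]]:
    """Apply standardization, then normalization, feature by feature."""
    columns = [list(c) for c in zip(*records)]
    scaled = [min_max_column(standardize_column(c)) for c in columns]
    return [list(row) for row in zip(*scaled)]
```

For a feature column `[1, 2, 3, 6]` the mean is 3 and the mean absolute deviation is 1.5, so the standardized values are approximately `[-1.33, -0.67, 0, 2]` and the normalized values are `[0, 0.2, 0.4, 1]`.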
Further, data formatting processing needs to be performed on each historical power grid normalized data, that is, the processed KDD Cup99 dataset is converted into libsvm format. Specifically, for the KDD Cup99 dataset, the Label is divided into 5 categories: NORMAL type, denial of service attack type, remote host user unauthorized access attack type, unauthorized local superuser privileged access attack type, and port monitored or scanned attack type. The NORMAL type denotes normal (non-attack) data. In the data formatting step, two training sets are generated: the first sets the Label of NORMAL type data to 1 and the Label of the other four types of data to -1; the second removes the NORMAL type data and retains the remaining data.
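Generating the two training sets can be sketched as below. The category string `"NORMAL"` follows the text, while the function name `split_training_sets` and the short attack names used in the usage lines are hypothetical labels chosen for illustration.

```python
from typing import List, Tuple

NORMAL = "NORMAL"

Record = Tuple[List[float], str]  # (feature vector, category name)


def split_training_sets(records: List[Record]):
    """Build the binary set (+1 normal, -1 attack) and the attack-only set."""
    # First set: every record is kept, labels collapse to +1 / -1
    binary_set = [(x, 1 if cat == NORMAL else -1) for x, cat in records]
    # Second set: NORMAL records are removed, attack categories are kept
    attack_set = [(x, cat) for x, cat in records if cat != NORMAL]
    return binary_set, attack_set


records = [([0.1], "NORMAL"), ([0.9], "DOS"), ([0.5], "PROBE")]
binary_set, attack_set = split_training_sets(records)
```

The first set trains the normal/abnormal classifier and the second trains the attack type classifier, so the second stage is never fitted on normal traffic.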
When the grid data is determined to be abnormal data, the predictor may be composed of 4 binary logistic regression predictors. Each network data packet with an unknown label is fed into each of the 4 predictors, yielding 4 binary logistic regression classification probabilities, and the class corresponding to the largest of the 4 probability values is selected as the class to which the unknown network packet belongs.
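The selection of the class with the largest probability can be sketched as follows. Each predictor is represented here as a hypothetical `(weights, bias)` pair, and the weights in the usage lines are made-up numbers for illustration, not trained values.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_attack_type(x: List[float],
                        predictors: Dict[str, Tuple[List[float], float]]):
    """Run every binary predictor and return the class with the highest probability."""
    probs = {
        name: sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for name, (weights, bias) in predictors.items()
    }
    best = max(probs, key=probs.get)
    return best, probs


# Illustrative, untrained parameters for the four attack classes
predictors = {
    "DOS": ([2.0, 0.0], -1.0),
    "R2L": ([0.0, 2.0], -1.0),
    "U2R": ([-1.0, -1.0], 0.0),
    "PROBE": ([0.5, 0.5], -2.0),
}
best, probs = predict_attack_type([1.0, 0.0], predictors)
```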
Optionally, after the inputting the grid data to be determined into the pre-trained binary classification model, determining an initial classification result of the grid data, the method further includes: and if the initial classification result of the power grid data is the initial classification result of the power grid normal data, determining that the power grid data to be determined is the power grid normal data, and carrying out feedback operation to a user.
In this embodiment, if the initial classification result of the grid data is the initial classification result of the normal data of the grid, it is indicated that there is no abnormality in the current grid data, so that the subsequent determination of the level of abnormality is not needed, and feedback is directly performed to the user.
S130, if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined.
Wherein the anomaly level determination model is constructed based on a multi-class logistic regression algorithm.
The anomaly class determination model may be a model for performing anomaly type class determination on the grid anomaly data.
In this embodiment, an anomaly type level result corresponding to the grid anomaly data may be determined, and specifically, the anomaly type level result may include a denial of service attack type, an unauthorized access attack type of a remote host user, an unauthorized local super user privileged access attack type, or a port monitored or scanned attack type.
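The two-stage flow of steps S120 and S130 can be sketched as a small dispatcher. The callables `is_abnormal` and `classify_attack` are hypothetical stand-ins for the trained binary classification model and the trained anomaly level determination model; the dictionary returned is an illustrative result shape, not one defined by the source.

```python
from typing import Any, Callable, Dict, Optional


def determine_anomaly_level(sample: Any,
                            is_abnormal: Callable[[Any], bool],
                            classify_attack: Callable[[Any], str]) -> Dict[str, Optional[str]]:
    """Stage 1: normal vs abnormal. Stage 2 runs only for abnormal data."""
    if not is_abnormal(sample):
        # Normal data: no anomaly level is needed, report back directly
        return {"status": "normal", "anomaly_type": None}
    # Abnormal data: hand over to the multi-class stage
    return {"status": "abnormal", "anomaly_type": classify_attack(sample)}
```

Because the second model is only invoked for data flagged as abnormal, it can be trained on attack records alone.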
Optionally, after the inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into the pre-trained abnormal grade determination model and determining the abnormal grade result corresponding to the power grid data to be determined, the method further includes: matching the abnormal type grade result with a preset abnormal data alarm mapping table to obtain an abnormal type grade matching result; and carrying out alarm feedback operation according to the abnormal type grade matching result.
The abnormal data alert mapping table may be a mapping table describing a matching relationship between the abnormal type level result and the abnormal type level matching result.
The anomaly type level matching result may be a result describing the degree of alarm. If the abnormal type level result is a denial of service attack type, the abnormal type level matching result is a primary alarm matching result; if the abnormal type level result is the attack type unauthorized to access by the remote host user, the abnormal type level matching result is a secondary alarm matching result; if the abnormal type level result is an unauthorized local super user privilege access attack type, the abnormal type level matching result is a three-level alarm matching result; if the abnormal type level result is the port monitored or scanning attack type, the abnormal type level matching result is a four-level alarm matching result.
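The alarm mapping table can be represented as a simple dictionary lookup. The short keys `DOS`, `R2L`, `U2R` and `PROBE` are hypothetical identifiers for the four attack types named in the text, and the integers 1 to 4 stand for the primary through four-level alarms.

```python
from typing import Dict

# Attack type -> alarm level, following the matching rules described above
ALARM_MAP: Dict[str, int] = {
    "DOS": 1,    # denial of service attack
    "R2L": 2,    # remote host user unauthorized access attack
    "U2R": 3,    # unauthorized local superuser privileged access attack
    "PROBE": 4,  # port monitored or scanned attack
}


def alarm_level(anomaly_type: str) -> int:
    """Look up the alarm level for an anomaly type level result."""
    if anomaly_type not in ALARM_MAP:
        raise ValueError(f"no alarm mapping for anomaly type {anomaly_type!r}")
    return ALARM_MAP[anomaly_type]
```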
Optionally, before the step of inputting the power grid abnormal data corresponding to the power grid abnormal data initial classification result into the pre-trained abnormal grade determination model if the power grid data initial classification result is the power grid abnormal data initial classification result, determining the abnormal grade result corresponding to the power grid data to be determined, the method further includes: acquiring each standard historical grid data corresponding to the first historical grid abnormal data tag, and acquiring a second historical grid abnormal data tag corresponding to each standard historical grid data respectively; wherein the second historical grid anomaly data tag comprises at least one of: denial of service attack type, remote host user unauthorized access attack type, unauthorized local superuser privileged access attack type and port monitored or scanned attack type; inputting the standard historical power grid data and the second historical power grid abnormal data labels respectively corresponding to the standard historical power grid data into an initial abnormal grade determining model to perform model training, and determining that the abnormal grade determining model has been trained when the attack type determination accuracy meets a preset type accuracy threshold.
In this embodiment, the initial anomaly level determination model needs to be trained on the standard historical power grid data corresponding to the first historical power grid abnormal data tag, together with the corresponding second historical power grid abnormal data tags, so as to obtain the anomaly level determination model.
During training of the anomaly level determination model, it is necessary to judge whether the attack type determination accuracy meets the preset type accuracy threshold. If it does, training of the anomaly level determination model is determined to be complete. Otherwise, standard historical power grid data corresponding to the first historical power grid abnormal data tag continues to be acquired for model training until the type accuracy threshold is met, at which point training of the anomaly level determination model is determined to be complete. This improves the accuracy and reliability of determining the anomaly level of power grid data.
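The threshold-controlled training described above can be sketched as a loop that keeps acquiring data and refitting until the measured accuracy is high enough. This is only an illustration under stated assumptions: `fetch_batch`, `fit` and `validation` are hypothetical stand-ins for the data acquisition step, the model fitting step and a held-out evaluation set, and the `max_rounds` safeguard is an addition that the text does not mention.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]           # (feature vector, class tag)
Predictor = Callable[[Sequence[float]], int]   # trained model as a callable


def train_until_threshold(
    fetch_batch: Callable[[], List[Sample]],
    fit: Callable[[List[Sample]], Predictor],
    validation: List[Sample],
    threshold: float,
    max_rounds: int = 10,
) -> Tuple[Predictor, float, int]:
    """Acquire data and retrain until accuracy on `validation` meets `threshold`."""
    if not validation:
        raise ValueError("validation set must not be empty")
    data: List[Sample] = []
    for round_no in range(1, max_rounds + 1):
        data.extend(fetch_batch())             # continue acquiring historical data
        predict = fit(data)                    # retrain on everything seen so far
        correct = sum(1 for x, y in validation if predict(x) == y)
        accuracy = correct / len(validation)
        if accuracy >= threshold:              # preset accuracy threshold is met
            return predict, accuracy, round_no
    # Assumed safeguard: stop instead of looping forever.
    raise RuntimeError("accuracy threshold not reached within max_rounds")
```

The same loop shape would apply to the binary classification model, with the binary classification accuracy in place of the attack type determination accuracy.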
In the previous example, because the abnormal data in the KDD Cup 99 dataset is divided into exactly 4 classes, exactly 4 trainers need to be constructed. Model training based on the KDD Cup 99 dataset involves two loops, an inner loop and an outer loop. The inner loop (binary classification model) processes the training set data: it sets the Label of the m-th class of training set data to 1 (a positive sample) and the Label of all other classes to -1 (negative samples), so that positive and negative samples are distinguished by the Label value. The outer loop (anomaly level determination model) feeds the negative samples into the anomaly level determination model and runs 4 times, yielding 4 groups of logistic regression boundary vectors. In this way, both the binary classification model and the anomaly level determination model can be trained.
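One way to read the four-trainer scheme is as one-vs-rest logistic regression: for each class, the tags are rewritten to 1 or -1 and a binary logistic regression is fitted, giving one boundary vector per class. The sketch below follows that reading and is not taken from the embodiment; the function names, the plain batch gradient descent, and the `lr` and `epochs` values are all assumptions chosen to keep the example small.

```python
import math
from typing import Dict, Hashable, List, Sequence


def relabel(labels: Sequence[Hashable], positive_class: Hashable) -> List[int]:
    """Set the tag of the chosen class to 1 and of every other class to -1."""
    return [1 if y == positive_class else -1 for y in labels]


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_binary_logreg(X, y, lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Gradient descent on the logistic loss log(1 + exp(-y * w.x)), y in {1, -1}."""
    w = [0.0] * (len(X[0]) + 1)                # last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            xb = list(xi) + [1.0]
            margin = yi * sum(wj * xj for wj, xj in zip(w, xb))
            coeff = -yi * _sigmoid(-margin)    # derivative of the loss w.r.t. w.x
            for j, xj in enumerate(xb):
                grad[j] += coeff * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def fit_one_vs_rest(X, labels, classes) -> Dict[Hashable, List[float]]:
    """Run one binary trainer per class and collect the boundary vectors."""
    return {c: fit_binary_logreg(X, relabel(labels, c)) for c in classes}


def predict(models: Dict[Hashable, List[float]], x: Sequence[float]) -> Hashable:
    """Pick the class whose boundary vector gives the largest score."""
    xb = list(x) + [1.0]
    return max(models, key=lambda c: sum(wj * xj for wj, xj in zip(models[c], xb)))
```

With four attack classes, `fit_one_vs_rest` returns four weight vectors, which corresponds to the 4 groups of boundary vectors mentioned in the text.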
In the technical solution of this embodiment, power grid data to be determined is acquired in real time; the power grid data to be determined is input into a pre-trained binary classification model to determine an initial classification result of the power grid data; and, if the initial classification result of the power grid data is a power grid abnormal data initial classification result, the corresponding power grid abnormal data is input into a pre-trained anomaly level determination model to determine the anomaly type level result corresponding to the power grid data to be determined. This solves the problem that acquired power grid data cannot be accurately classified, improves the accuracy and reliability of power grid data classification, enables reasonable scheduling of resources, strengthens user defense measures, and better safeguards the security of the power grid.
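The two-stage flow summarized above can be sketched as a small function in which the second model is consulted only for records that the first model flags as abnormal. Both models are passed in as plain callables; `Determination`, `determine`, `binary_model` and `level_model` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Determination:
    is_abnormal: bool
    attack_type: Optional[str] = None   # set only for abnormal records


def determine(
    record: Sequence[float],
    binary_model: Callable[[Sequence[float]], bool],
    level_model: Callable[[Sequence[float]], str],
) -> Determination:
    """Stage 1 decides normal vs abnormal; stage 2 runs only on abnormal data."""
    if not binary_model(record):
        return Determination(is_abnormal=False)
    return Determination(is_abnormal=True, attack_type=level_model(record))
```

Because normal records return early, the anomaly level determination model never sees them, which matches the conditional step in the method.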
Example two
Fig. 2 is a schematic structural diagram of a power grid data anomaly level determination device according to a second embodiment of the present invention. The device provided by this embodiment can be implemented in software and/or hardware, and can be configured in a terminal device or a server to implement the power grid data anomaly level determination method. As shown in Fig. 2, the device includes: a to-be-determined power grid data acquisition module 210, a power grid data initial classification result determining module 220, and an anomaly type grade result determining module 230.
The power grid data acquisition module 210 is configured to acquire power grid data to be determined in real time;
the power grid data initial classification result determining module 220 is configured to input the power grid data to be determined into a pre-trained binary classification model, and determine a power grid data initial classification result;
the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result;
the anomaly type grade result determining module 230 is configured to input, if the initial classification result of the grid data is an initial classification result of the grid anomaly data, the grid anomaly data corresponding to the initial classification result of the grid anomaly data into an anomaly grade determining model trained in advance, and determine an anomaly type grade result corresponding to the grid data to be determined;
wherein the anomaly level determination model is constructed based on a multiple class logistic regression algorithm.
In the technical solution of this embodiment, power grid data to be determined is acquired in real time; the power grid data to be determined is input into a pre-trained binary classification model to determine an initial classification result of the power grid data; and, if the initial classification result of the power grid data is a power grid abnormal data initial classification result, the corresponding power grid abnormal data is input into a pre-trained anomaly level determination model to determine the anomaly type level result corresponding to the power grid data to be determined. This solves the problem that acquired power grid data cannot be accurately classified, improves the accuracy and reliability of power grid data classification, enables reasonable scheduling of resources, strengthens user defense measures, and better safeguards the security of the power grid.
Optionally, the device further includes an alarm feedback module, which may be specifically configured to: after the power grid abnormal data corresponding to the power grid abnormal data initial classification result is input into the pre-trained anomaly level determination model and the anomaly type level result corresponding to the power grid data to be determined is determined, match the anomaly type level result against a preset abnormal data alarm mapping table to obtain an anomaly type level matching result; and perform an alarm feedback operation according to the anomaly type level matching result.
Optionally, the device further includes a feedback operation module, which may be specifically configured to: after the power grid data to be determined is input into the pre-trained binary classification model and the initial classification result of the power grid data is determined, if the initial classification result is a power grid normal data initial classification result, determine that the power grid data to be determined is normal power grid data and perform a feedback operation to the user.
Optionally, the device further includes a binary classification model training module, which may be specifically configured to: before the power grid data to be determined is input into the pre-trained binary classification model and the initial classification result of the power grid data is determined, acquire a plurality of pieces of historical power grid data and the first historical power grid data tag corresponding to each piece of historical power grid data, wherein the first historical power grid data tag includes a first historical power grid normal data tag and a first historical power grid abnormal data tag; preprocess each piece of historical power grid data by a preset data preprocessing method to obtain each piece of standard historical power grid data; and input each piece of standard historical power grid data and the first historical power grid data tags into an initial binary classification model for model training, and, if the binary classification accuracy meets a preset accuracy threshold, determine that training of the binary classification model is complete.
Optionally, the binary classification model training module may be further specifically configured to: if the binary classification accuracy does not meet the preset accuracy threshold, return to the operation of acquiring a plurality of pieces of historical power grid data and the first historical power grid data tag corresponding to each piece, until the binary classification accuracy meets the preset accuracy threshold, and then determine that training of the binary classification model is complete.
Optionally, the device further includes an anomaly level determination model training module, which may be specifically configured to: before the power grid abnormal data corresponding to the power grid abnormal data initial classification result is input into the pre-trained anomaly level determination model, if the power grid data initial classification result is a power grid abnormal data initial classification result, and the anomaly type level result corresponding to the power grid data to be determined is determined, acquire each piece of standard historical power grid data corresponding to the first historical power grid abnormal data tag, and acquire the second historical power grid abnormal data tag corresponding to each piece of standard historical power grid data; wherein the second historical power grid abnormal data tag includes at least one of: the denial-of-service attack type, the remote host user unauthorized access attack type, the unauthorized local superuser privilege access attack type, and the port monitoring or scanning attack type; and input each piece of standard historical power grid data and its corresponding second historical power grid abnormal data tag into an initial anomaly level determination model for model training, and determine that the anomaly level determination model has been obtained through training when the attack type determination accuracy meets a preset type accuracy threshold.
Optionally, the binary classification model training module may be further specifically configured to: perform discrete character-type data processing on each piece of historical power grid data to obtain historical power grid numeric data; perform data standardization on the historical power grid numeric data to obtain initial standard historical power grid numeric data; perform data normalization on the initial standard historical power grid numeric data to obtain historical power grid normalized data; and perform data formatting on the historical power grid normalized data to obtain each piece of standard historical power grid data.
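The four preprocessing steps can be illustrated with a short standard-library sketch. The text does not give the formulas, so the choices here are assumptions: character-type fields are replaced by integer codes in order of first appearance, standardization is taken to be a z-score, normalization is taken to be min-max scaling to [0, 1], and formatting is taken to mean emitting fixed-length tuples of floats. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def encode_categorical(rows: List[Sequence], col: int) -> Tuple[List[list], Dict]:
    """Replace string values in column `col` by integer codes."""
    codes: Dict = {}
    out = []
    for row in rows:
        row = list(row)
        row[col] = codes.setdefault(row[col], len(codes))
        out.append(row)
    return out, codes


def standardize(column: List[float]) -> List[float]:
    """Z-score standardization; a constant column maps to zeros."""
    mu, sigma = mean(column), pstdev(column)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in column]


def min_max(column: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]


def preprocess(rows: List[Sequence], categorical_cols: Sequence[int]) -> List[tuple]:
    """Encode, standardize, normalize, then format each record as a float tuple."""
    for col in categorical_cols:
        rows, _ = encode_categorical(rows, col)
    columns = [[float(v) for v in c] for c in zip(*rows)]
    columns = [min_max(standardize(c)) for c in columns]
    return [tuple(r) for r in zip(*columns)]
```

Because min-max scaling is unaffected by a preceding shift and rescale, the z-score step does not change the final values in this sketch; it is kept only to mirror the order of the steps listed above.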
The power grid data abnormal grade determining device provided by the embodiment of the invention can execute the power grid data abnormal grade determining method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example three
Fig. 3 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement a third embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 3, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the grid data anomaly level determination method.
In some embodiments, the grid data anomaly level determination method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the grid data anomaly level determination method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the grid data anomaly level determination method by any other suitable means (e.g., by means of firmware).
The method comprises the following steps: acquiring power grid data to be determined in real time; inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data; the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result; if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined; wherein the anomaly level determination model is constructed based on a multiple class logistic regression algorithm.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SoCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Example four
A fourth embodiment of the present invention also provides a computer-readable storage medium containing computer-readable instructions, which when executed by a computer processor, are configured to perform a grid data anomaly level determination method, the method comprising: acquiring power grid data to be determined in real time; inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data; the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result; if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined; wherein the anomaly level determination model is constructed based on a multiple class logistic regression algorithm.
Of course, in the computer-readable storage medium provided by this embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the power grid data anomaly level determination method provided by any embodiment of the present invention.
From the above description of the embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software plus the necessary general-purpose hardware, and of course also by means of hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk, or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the above embodiment of the power grid data anomaly level determination device, the units and modules included are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for distinguishing them from each other and are not used to limit the protection scope of the present invention.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for determining an anomaly level of power grid data, comprising:
acquiring power grid data to be determined in real time;
inputting the power grid data to be determined into a pre-trained binary classification model, and determining an initial classification result of the power grid data;
the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result;
if the initial classification result of the power grid data is the initial classification result of the power grid abnormal data, inputting the power grid abnormal data corresponding to the initial classification result of the power grid abnormal data into a pre-trained abnormal grade determining model, and determining an abnormal type grade result corresponding to the power grid data to be determined;
wherein the anomaly level determination model is constructed based on a multiple class logistic regression algorithm.
2. The method according to claim 1, wherein after the inputting the grid anomaly data corresponding to the initial classification result of the grid anomaly data into the pre-trained anomaly class determination model, determining the anomaly type class result corresponding to the grid data to be determined, further comprises:
matching the abnormal type grade result with a preset abnormal data alarm mapping table to obtain an abnormal type grade matching result;
and carrying out alarm feedback operation according to the abnormal type grade matching result.
3. The method according to claim 2, further comprising, after the inputting the grid data to be determined into a pre-trained binary classification model, determining an initial classification result of the grid data:
and if the initial classification result of the power grid data is the initial classification result of the power grid normal data, determining that the power grid data to be determined is the power grid normal data, and carrying out feedback operation to a user.
4. A method according to claim 3, further comprising, before said inputting the grid data to be determined into a pre-trained binary classification model, determining an initial classification result of the grid data:
acquiring a plurality of historical power grid data and first historical power grid data labels respectively corresponding to the historical power grid data;
the first historical power grid data tag comprises a first historical power grid normal data tag and a first historical power grid abnormal data tag;
carrying out data preprocessing on each piece of historical power grid data through a preset data preprocessing method to obtain each standard historical power grid data;
inputting the standard historical power grid data and the first historical power grid data labels into an initial binary classification model to perform model training, and if the binary classification accuracy meets a preset accuracy threshold, determining that the training is completed on the binary classification model.
5. The method of claim 4, further comprising, after said training of models by inputting each of said standard historical grid data and said first historical grid data tag into an initial binary classification model:
and if the binary classification accuracy rate does not meet the preset accuracy rate threshold value, returning to execute the operation of acquiring a plurality of historical power grid data and the first historical power grid data labels respectively corresponding to the historical power grid data until the binary classification accuracy rate meets the preset accuracy rate threshold value, and determining that training is completed on the binary classification model.
6. The method according to claim 5, wherein, before the step of inputting the grid anomaly data corresponding to the grid anomaly data initial classification result into a pre-trained anomaly level determination model to determine the anomaly level result corresponding to the grid data to be determined if the grid data initial classification result is a grid anomaly data initial classification result, further comprising:
acquiring each standard historical grid data corresponding to the first historical grid abnormal data tag, and acquiring a second historical grid abnormal data tag corresponding to each standard historical grid data respectively;
wherein the second historical grid anomaly data tag comprises at least one of: denial of service attack type, remote host user unauthorized access attack type, unauthorized local superuser privileged access attack type and port monitored or scanned attack type;
inputting the standard historical power grid data and the second historical power grid abnormal data labels respectively corresponding to the standard historical power grid data into an initial abnormal grade determining model to perform model training, and determining to train to obtain the abnormal grade determining model when the attack type determining accuracy meets a preset type accuracy threshold.
7. The method according to claim 6, wherein the performing data preprocessing on each historical grid data by a preset data preprocessing method to obtain each standard historical grid data comprises:
performing discrete character type data processing on each historical power grid data to obtain each historical power grid digital data;
carrying out data standardization processing on each historical power grid digital data to obtain each initial standard historical power grid digital data;
carrying out data normalization processing on the digital data of each initial standard historical power grid to obtain normalized data of each historical power grid;
and carrying out data formatting processing on the normalized data of each historical power grid to obtain data of each standard historical power grid.
8. A power grid data anomaly level determination device, characterized by comprising:
the power grid data acquisition module is used for acquiring power grid data to be determined in real time;
the power grid data initial classification result determining module is used for inputting the power grid data to be determined into a pre-trained binary classification model to determine a power grid data initial classification result;
the binary classification model is constructed based on a fuzzy support vector machine algorithm; the power grid data initial classification result comprises a power grid abnormal data initial classification result and a power grid normal data initial classification result;
the abnormal type grade result determining module is used for inputting the power grid abnormal data corresponding to the power grid abnormal data initial classification result into a pre-trained abnormal grade determining model if the power grid data initial classification result is the power grid abnormal data initial classification result, and determining an abnormal type grade result corresponding to the power grid data to be determined;
wherein the anomaly level determination model is constructed based on a multiple class logistic regression algorithm.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the power grid data anomaly level determination method according to any one of claims 1 to 7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the power grid data anomaly level determination method according to any one of claims 1 to 7.
CN202311620857.9A 2023-11-29 2023-11-29 Power grid data anomaly level determination method, device, equipment and medium Pending CN117609862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311620857.9A CN117609862A (en) 2023-11-29 2023-11-29 Power grid data anomaly level determination method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311620857.9A CN117609862A (en) 2023-11-29 2023-11-29 Power grid data anomaly level determination method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN117609862A (en) 2024-02-27

Family

ID=89953123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311620857.9A Pending CN117609862A (en) 2023-11-29 2023-11-29 Power grid data anomaly level determination method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117609862A (en)

Similar Documents

Publication Publication Date Title
CN113705425B (en) Training method of living body detection model, and method, device and equipment for living body detection
CN113222942A (en) Training method of multi-label classification model and method for predicting labels
CN115794578A (en) Data management method, device, equipment and medium for power system
CN114881129A (en) Model training method and device, electronic equipment and storage medium
CN115632874A (en) Method, device, equipment and storage medium for detecting threat of entity object
CN113657249B (en) Training method, prediction method, device, electronic equipment and storage medium
CN116628554B (en) Industrial Internet data anomaly detection method, system and equipment
CN117474091A (en) Knowledge graph construction method, device, equipment and storage medium
CN117034149A (en) Fault processing strategy determining method and device, electronic equipment and storage medium
CN116755974A (en) Cloud computing platform operation and maintenance method and device, electronic equipment and storage medium
CN116668264A (en) Root cause analysis method, device, equipment and storage medium for alarm clustering
CN113612777B (en) Training method, flow classification method, device, electronic equipment and storage medium
CN117609862A (en) Power grid data anomaly level determination method, device, equipment and medium
CN115761648A (en) Oil leakage evaluation method, device, equipment, medium and product applied to transformer
CN115665783A (en) Abnormal index tracing method and device, electronic equipment and storage medium
CN114692778A (en) Multi-modal sample set generation method, training method and device for intelligent inspection
CN115496916B (en) Training method of image recognition model, image recognition method and related device
CN116071628B (en) Image processing method, device, electronic equipment and storage medium
CN117609723A (en) Object identification method and device, electronic equipment and storage medium
CN116961229A (en) Transformer substation fault positioning method and device, electronic equipment and storage medium
CN117454204A (en) Method, device, equipment and storage medium for determining API request function
CN116128296A (en) Risk prediction method, risk prediction device, electronic equipment and storage medium
CN117907752A (en) Fault early warning method and device for high-voltage transmission line, electronic equipment and storage medium
CN117896170A (en) Vulnerability identification method and device, electronic equipment and storage medium
CN116455999A (en) Application state management method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination