CN117874653A - Power system safety monitoring method and system based on multi-source data - Google Patents

Power system safety monitoring method and system based on multi-source data

Info

Publication number
CN117874653A
Authority
CN
China
Prior art keywords
monitoring
monitoring parameter
template
data
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410272154.XA
Other languages
Chinese (zh)
Other versions
CN117874653B (en)
Inventor
曾玉伟
任文鹏
胡斐
蔡明芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Jiahua Innovation Electrical Co ltd
Original Assignee
Wuhan Jiahua Innovation Electrical Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Jiahua Innovation Electrical Co ltd filed Critical Wuhan Jiahua Innovation Electrical Co ltd
Priority to CN202410272154.XA priority Critical patent/CN117874653B/en
Publication of CN117874653A publication Critical patent/CN117874653A/en
Application granted granted Critical
Publication of CN117874653B publication Critical patent/CN117874653B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G06Q10/063114 Status monitoring or status determination for a person or group
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention belongs to the technical field of data processing, and particularly relates to a method and a system for monitoring the safety of an electric power system based on multi-source data. The method comprises the following steps: collecting samples of each monitoring parameter; determining the template monitoring parameter; constructing an isolated forest of the template monitoring parameter; determining the dividing effect of each threshold on the isolated trees of the template monitoring parameter; determining the dividing effect fitting function of the template monitoring parameter; determining the probability of taking each datum in the value range of each monitoring parameter as a threshold; constructing an isolated forest of each monitoring parameter; and judging the safety of the power system through the abnormal scores of the data of each monitoring parameter at the moment to be monitored. The invention improves the stability of the isolated forest performance of each monitoring parameter, and further improves the accuracy of safety monitoring of the power system.

Description

Power system safety monitoring method and system based on multi-source data
Technical Field
The invention relates to the technical field of data processing. More particularly, the invention relates to a method and a system for monitoring safety of an electric power system based on multi-source data.
Background
Power system safety monitoring is a key step in maintaining healthy and reliable operation of a power system, helps to prevent faults, improves operating efficiency, and ensures high quality power service to customers. The multi-source data provides more accurate information than the data of a single source, so that safety monitoring of the power system can be realized by carrying out data analysis on the data of various monitoring parameters of the power system and judging whether the power system is abnormal according to the analysis result.
The isolated forest anomaly detection algorithm is a conventional data analysis method. When the data of the various monitoring parameters of the power system are analyzed with this algorithm, the anomaly score of the data is calculated through the constructed isolated forest, and whether the power system is abnormal is judged accordingly. Each isolated tree in the isolated forest is established through thresholds, and different thresholds have different dividing effects. The performance of an isolated tree established through thresholds randomly selected with equal probability is therefore unstable, so the abnormal score calculated through the isolated forest is inaccurate, whether the power system is abnormal cannot be accurately judged, and the safety monitoring of the power system is affected.
Disclosure of Invention
To solve one or more of the above-described technical problems, the present invention provides aspects as follows.
In a first aspect, the present invention provides a method for monitoring safety of an electric power system based on multi-source data, comprising:
collecting data of each monitoring parameter of the power system through a sensor, and taking all the data of each monitoring parameter collected in a preset sampling period as a sample of each monitoring parameter;
determining template monitoring parameters according to the correlation coefficients of the samples of each monitoring parameter and the samples of other monitoring parameters;
randomly selecting a threshold value in the value range of the template monitoring parameters at equal probability, and constructing an isolated forest of the template monitoring parameters according to the sample of the template monitoring parameters, wherein the isolated forest comprises a plurality of isolated trees; determining the dividing effect of each threshold value on the isolated tree of the template monitoring parameter according to the difference of the quantity, the numerical value and the abnormal score of all data contained by two nodes corresponding to each threshold value on the isolated tree of the template monitoring parameter;
fitting all the threshold dividing effects on the isolated tree of the template monitoring parameters to determine a dividing effect fitting function of the template monitoring parameters;
determining the probability of taking each datum as a threshold value in the value range of each monitoring parameter according to the correlation coefficient of each monitoring parameter and the template monitoring parameter and the dividing effect fitting function of the template monitoring parameter;
according to the probability that each data in the value range of each monitoring parameter is used as a threshold value, randomly selecting the threshold value according to the unequal probability in the value range of each monitoring parameter, and constructing an isolated forest of each monitoring parameter according to the sample of each monitoring parameter;
and determining an abnormal score of the data of each monitoring parameter at the moment to be monitored through the isolated forest of each monitoring parameter, and judging the safety of the power system through the abnormal score.
In one embodiment, the determining the template monitoring parameter according to the correlation coefficient between the sample of each monitoring parameter and the samples of other monitoring parameters includes:
taking any one monitoring parameter as a target monitoring parameter, and taking an average value of correlation coefficients of the target monitoring parameter and other monitoring parameters as a comprehensive correlation degree of the target monitoring parameter; and taking the monitoring parameter with the maximum comprehensive association degree as a template monitoring parameter.
In one embodiment, the selecting the threshold value randomly with equal probability in the value range of the template monitoring parameter, and constructing an isolated forest of the template monitoring parameter according to the sample of the template monitoring parameter includes:
the isolated forest of the template monitoring parameter contains N isolated trees, where N represents a preset number; for any one isolated tree, the sample of the template monitoring parameter is put into the root node of the isolated tree, a threshold is randomly selected with equal probability in the value range of the template monitoring parameter, all data in the root node are divided into two groups by the threshold, and each group is taken as a child node of the root node; the dividing operation is repeated in each child node until the data in a child node can no longer be subdivided or the height of the isolated tree reaches the preset height, at which point the dividing operation stops and the isolated tree is obtained.
In one embodiment, the division effect of each threshold on the isolated tree of the template monitoring parameter satisfies the relation:
where g represents the dividing effect of the threshold on the isolated tree of the template monitoring parameter; S1 and S2 represent the amount of data contained in the first node and in the second node, respectively, corresponding to the threshold on the isolated tree of the template monitoring parameter; S represents the amount of data contained in the sample of the template monitoring parameter; Z1 and Z2 represent the average value of the data contained in the first node and in the second node, respectively; Z represents the size of the value range of the template monitoring parameter; f1 and f2 represent the average value of the anomaly scores of the data contained in the first node and in the second node, respectively; | | denotes taking the absolute value; min() takes the minimum value; max() takes the maximum value; and exp() represents the exponential function with the natural constant as its base.
In one embodiment, the first node and the second node corresponding to the threshold value refer to a left child node and a right child node of two child nodes divided by the threshold value.
In one embodiment, the probability that each data is used as a threshold value in the value range of each monitoring parameter satisfies the relation:
wherein p(x) represents the probability that data x in the value range of the monitoring parameter is taken as the threshold; x represents a datum in the value range of the monitoring parameter; L represents the correlation coefficient of the monitoring parameter and the template monitoring parameter; w represents the size of the value range of the monitoring parameter; Z represents the size of the value range of the template monitoring parameter; H() represents the partitioning effect fitting function of the template monitoring parameter; and the remaining term represents, among all the data contained in the sample of the monitoring parameter, the ratio of the amount of data lying in the range specified in the relation to the amount of all data contained in the sample of the monitoring parameter.
In one embodiment, determining the safety of the power system by the anomaly score includes:
if the average value of the abnormal scores of the data of all the monitoring parameters at the moment to be monitored is larger than a preset threshold, the power system has a safety problem, and if the average value of the abnormal scores of the data of all the monitoring parameters at the moment to be monitored is smaller than or equal to the preset threshold, the power system is safe.
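For illustration only, the judgment described above can be sketched as follows. The function name, the parameter names and the numeric values are assumptions made for this sketch; the source does not specify the value of the preset threshold.

```python
def has_safety_problem(scores_by_parameter, preset_threshold):
    """Return True when the mean abnormal score exceeds the preset threshold.

    scores_by_parameter maps a monitoring parameter name to the abnormal score
    of its datum at the moment to be monitored (names are hypothetical).
    """
    mean_score = sum(scores_by_parameter.values()) / len(scores_by_parameter)
    # Strictly greater than the threshold: safety problem; otherwise safe.
    return mean_score > preset_threshold


# Hypothetical scores and threshold, chosen only to exercise both branches.
scores = {"voltage": 0.7, "current": 0.8, "frequency": 0.6}
print(has_safety_problem(scores, preset_threshold=0.65))
print(has_safety_problem({"voltage": 0.5, "current": 0.5}, preset_threshold=0.5))
```

A mean score exactly equal to the preset threshold is treated as safe, matching the "smaller than or equal to" wording above.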
In a second aspect, the present invention also provides a multi-source data-based power system safety monitoring system, comprising: a processor; a memory storing computer program instructions for implementing a multi-source data based power system safety monitoring method in accordance with one or more embodiments described above.
The invention has the beneficial effects that: the threshold is randomly selected with unequal probability within the value range of each monitoring parameter, according to the probability of each datum in that value range being taken as the threshold, so that the stability of the performance of the isolated forest of each monitoring parameter is improved, the accuracy of the abnormal score calculated by the isolated forest of each monitoring parameter is higher, and the accuracy of safety monitoring of the power system is improved.
Furthermore, the invention selects the monitoring parameters with large correlation coefficients with the samples of other monitoring parameters as the template monitoring parameters, and only establishes the isolated forest of the template monitoring parameters to determine the dividing effect of each threshold value on the isolated tree of the template monitoring parameters, thereby determining the probability of each data as the threshold value in the value range of each monitoring parameter according to the correlation coefficients of each monitoring parameter and the template monitoring parameters and the dividing effect fitting function of the template monitoring parameters, and improving the efficiency of the safety monitoring method of the electric power system.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. In the drawings, embodiments of the invention are illustrated by way of example and not by way of limitation, and like reference numerals refer to similar or corresponding parts and in which:
FIG. 1 is a flow chart schematically illustrating a method for multi-source data based power system safety monitoring in accordance with the present invention;
FIG. 2 is a block diagram schematically illustrating a multi-source data-based power system safety monitoring system according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The isolated forest anomaly detection algorithm is a conventional data analysis method. When the data of the various monitoring parameters of the power system are analyzed with this algorithm, the anomaly score of the data is calculated through the constructed isolated forest, and whether the power system is abnormal is judged accordingly. An isolated tree in the isolated forest is constructed by dividing all data in a node into two groups through a threshold, taking each group as a new node, and repeating the dividing step on the new nodes until the data in a node cannot be subdivided or the height of the isolated tree reaches a preset height. The structure of an isolated tree therefore depends on the selection of the thresholds. Different thresholds have different dividing effects, yet the conventional isolated forest anomaly detection algorithm selects the threshold randomly with equal probability in the value range of the data. As a result, the performance of the isolated tree so constructed is unstable, the anomaly score calculated by the isolated forest is inaccurate, real anomalies cannot be accurately identified, whether the power system is abnormal cannot be accurately judged, and the safety monitoring of the power system is affected.
In summary, taking the relevance of data of various monitoring parameters of a power system into consideration, the invention selects the monitoring parameters with large correlation coefficients with samples of other monitoring parameters as template monitoring parameters, randomly selects thresholds at equal probability within the value range of the template monitoring parameters, constructs an isolated forest of the template monitoring parameters, and determines the dividing effect of each threshold on the isolated tree of the template monitoring parameters according to the difference of the quantity, the numerical value and the abnormal score of all data contained in two corresponding nodes of each threshold on the isolated tree of the template monitoring parameters; fitting all the threshold dividing effects on the isolated tree of the template monitoring parameters to determine a dividing effect fitting function of the template monitoring parameters; determining the probability of taking each datum as a threshold value in the value range of each monitoring parameter according to the correlation coefficient of each monitoring parameter and the template monitoring parameter and the dividing effect fitting function of the template monitoring parameter; for each monitoring parameter, according to the probability that each data is used as a threshold value in the value range of each monitoring parameter, randomly selecting the threshold value in the value range of each monitoring parameter at unequal probability, and constructing an isolated forest of each monitoring parameter, so that the stability of the performance of the isolated forest of each monitoring parameter is improved, the accuracy of an abnormal score calculated by the isolated forest of each monitoring parameter is improved, and the accuracy of safety monitoring of a power system is further improved.
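As a minimal sketch of selecting a threshold with unequal probability, the value range can be discretized into candidate thresholds and one candidate drawn with weights. The discretization into bin midpoints, the function names and the probability values below are assumptions made for illustration; they do not reproduce the probability relation of the invention.

```python
import random


def candidate_thresholds(lo, hi, steps):
    """Midpoints of `steps` equal-width bins covering the value range [lo, hi]."""
    width = (hi - lo) / steps
    return [lo + (k + 0.5) * width for k in range(steps)]


def pick_threshold(candidates, probabilities, rng):
    """Draw one threshold; candidates with larger probability are drawn more often."""
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Hypothetical value range and probabilities, not taken from the source.
candidates = candidate_thresholds(219.0, 232.0, 4)
probabilities = [0.1, 0.2, 0.3, 0.4]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
print(pick_threshold(candidates, probabilities, rng))
```

Setting all probabilities equal reduces this to the equal-probability selection of the conventional algorithm.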
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The embodiment of the invention discloses a power system safety monitoring method based on multi-source data, which comprises the following steps of S1-S7 with reference to FIG. 1:
S1: Samples of each monitored parameter are collected.
It should be noted that, the multi-source data mainly refers to the diversity of data sources, the data from different sources presents different modes, the data from different sources can be supported, supplemented and corrected, and the multi-source data can provide more accurate information than the data from a single source; therefore, the embodiment realizes the safety monitoring of the power system by performing data analysis on the voltage data, the current data, the frequency data and the like of the power system.
Optionally, the data of each monitoring parameter of the power system is collected through a sensor, and all the data of each monitoring parameter collected in a preset sampling period are taken as samples of each monitoring parameter.
Preferably, the data of each monitoring parameter of the power system are collected by sensors at a preset sampling frequency; the monitoring parameters include, but are not limited to, voltage, current and frequency, which are collected by a voltage sensor, a current sensor and a frequency sensor respectively. All the data of each monitoring parameter collected within a preset sampling period are taken as the sample of that monitoring parameter.
Specific values of the preset sampling frequency and the preset sampling period can be set according to the actual application scenario and requirements; in this embodiment, the preset sampling frequency is set to one sample every 3 seconds, and the preset sampling period is set to 10 minutes.
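With these example settings, the amount of data in each sample follows from simple arithmetic, shown here as a short sketch (the constant and function names are illustrative, not from the source):

```python
SAMPLING_INTERVAL_S = 3      # one reading every 3 seconds
SAMPLING_PERIOD_S = 10 * 60  # preset sampling period of 10 minutes


def samples_per_period(interval_s, period_s):
    """Amount of data collected for one monitoring parameter in one period."""
    return period_s // interval_s


print(samples_per_period(SAMPLING_INTERVAL_S, SAMPLING_PERIOD_S))  # 200
```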
S2: template monitoring parameters are determined.
Specifically, for any two monitoring parameters, the correlation coefficient of the two monitoring parameters is calculated by Pearson correlation analysis according to the samples of the two monitoring parameters.
It should be noted that, the larger the correlation coefficient of the two monitoring parameters is, the stronger the correlation of the data of the two monitoring parameters is.
Determining the template monitoring parameters according to the correlation coefficients of the samples of each monitoring parameter and the samples of other monitoring parameters, including: taking any one monitoring parameter as a target monitoring parameter, and taking an average value of correlation coefficients of the target monitoring parameter and other monitoring parameters as a comprehensive correlation degree of the target monitoring parameter; and taking the monitoring parameter with the greatest comprehensive association degree as a template monitoring parameter.
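The selection of the template monitoring parameter can be sketched as follows, assuming that all samples have the same length and that a constant sample is given a correlation of 0 (both are assumptions of this sketch). The function names and the example readings are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant sample is treated as uncorrelated
    return cov / (norm_x * norm_y)


def select_template(samples):
    """Return the parameter with the largest comprehensive correlation degree."""
    degree = {}
    for target in samples:
        coefficients = [pearson(samples[target], samples[other])
                        for other in samples if other != target]
        degree[target] = sum(coefficients) / len(coefficients)
    return max(degree, key=degree.get), degree


# Hypothetical readings used only to exercise the functions.
samples = {
    "voltage": [1.0, 2.0, 3.0, 4.0, 5.0],
    "current": [1.0, 3.0, 2.0, 5.0, 4.0],
    "frequency": [2.0, 1.0, 4.0, 3.0, 5.0],
}
template, degree = select_template(samples)
print(template, degree)
```

The sketch averages the signed coefficients, as the text describes; whether absolute values should be used for negatively correlated parameters is not stated in the source.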
It should be noted that, since the performance of an isolated tree constructed with thresholds randomly selected with equal probability is unstable, suitable thresholds must be selected if the performance of the isolated forest is to be stable. Whether a threshold is suitable depends on its dividing effect, and the dividing effect of a threshold can only be determined by analyzing an isolated forest that has already been constructed. The power system has many monitoring parameters; constructing an isolated forest for every monitoring parameter and then judging the dividing effect of the thresholds for each would be inefficient. To improve efficiency, and considering that the data of the various monitoring parameters of a power system are correlated, the method selects from the monitoring parameters the one that is most strongly correlated with the other monitoring parameters as the template monitoring parameter. The dividing effect of the thresholds is then determined by analyzing only the isolated forest of the template monitoring parameter, and the probability of each datum in the value range of each monitoring parameter being taken as the threshold is determined according to the correlation coefficient of each monitoring parameter and the template monitoring parameter and the dividing effect fitting function of the template monitoring parameter.
S3: An isolated forest of the template monitoring parameter is constructed, and the dividing effect of each threshold on the isolated tree of the template monitoring parameter is determined.
Specifically, a threshold value is randomly selected in the value range of the template monitoring parameters at equal probability, an isolated forest of the template monitoring parameters is constructed according to the sample of the template monitoring parameters, and the isolated forest of the template monitoring parameters comprises a preset number of isolated trees; and determining the dividing effect of each threshold value on the isolated tree of the template monitoring parameters according to the difference of the quantity, the numerical value and the abnormal score of all data contained by the two nodes corresponding to each threshold value on the isolated tree of the template monitoring parameters.
The value range of the template monitoring parameter refers to the range formed by the minimum value and the maximum value in the sample of the template monitoring parameter.
Randomly selecting a threshold with equal probability in the value range of the template monitoring parameter and constructing an isolated forest of the template monitoring parameter according to the sample of the template monitoring parameter includes the following. The isolated forest of the template monitoring parameter contains N isolated trees, where N represents a preset number; each isolated tree is in essence a binary tree. For any one isolated tree, the sample of the template monitoring parameter is put into the root node of the isolated tree, and a threshold is randomly selected with equal probability in the value range of the template monitoring parameter, where equal probability means that every datum in the value range is equally likely to be selected. All data in the root node are divided into two groups by the threshold, and each group is taken as a child node of the root node. The dividing operation is repeated in each child node until the data in a child node can no longer be subdivided or the height of the isolated tree reaches the preset height, at which point the dividing operation stops and the isolated tree is obtained. The two child nodes divided by a threshold are taken as the nodes corresponding to that threshold: the left child node is denoted the first node corresponding to the threshold, and the right child node is denoted the second node corresponding to the threshold.
It should be noted that the specific process of constructing the isolated forest of the template monitoring parameter follows the isolated forest anomaly detection algorithm, which is a known technique and is not described here again; randomly selecting thresholds with equal probability to generate a plurality of isolated trees is a known step of that algorithm and is likewise not described in detail herein.
Specific values of the preset number and the preset height can be set according to the actual application scenario and requirements; in this embodiment, the preset number is set to 100, and the preset height is set to 30.
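The construction of the isolated forest of the template monitoring parameter can be sketched as below for one-dimensional data. This is a simplified illustration, not the authors' implementation: it draws each threshold uniformly from the value range of the data in the current node (the usual isolation forest practice, assumed here so that every split can separate data), and the readings are hypothetical.

```python
import random

PRESET_NUMBER = 100  # N, as set in this embodiment
PRESET_HEIGHT = 30   # height limit, as set in this embodiment


def build_isolated_tree(values, max_height, rng, height=0):
    """Build one isolated tree over 1-D data and return it as nested dicts."""
    lo, hi = min(values), max(values)
    # Stop when the node cannot be subdivided or the height limit is reached.
    if lo == hi or height >= max_height:
        return {"size": len(values)}
    threshold = rng.uniform(lo, hi)  # equal-probability selection in the range
    left = [v for v in values if v < threshold]    # first node
    right = [v for v in values if v >= threshold]  # second node
    if not left or not right:  # degenerate draw on the boundary: keep as leaf
        return {"size": len(values)}
    return {
        "threshold": threshold,
        "left": build_isolated_tree(left, max_height, rng, height + 1),
        "right": build_isolated_tree(right, max_height, rng, height + 1),
    }


def leaf_sizes(tree):
    """List the amount of data held by each leaf, from left to right."""
    if "size" in tree:
        return [tree["size"]]
    return leaf_sizes(tree["left"]) + leaf_sizes(tree["right"])


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
sample = [220.1, 219.8, 220.4, 220.0, 231.5]  # hypothetical voltage readings
forest = [build_isolated_tree(sample, PRESET_HEIGHT, rng)
          for _ in range(PRESET_NUMBER)]
print(len(forest), leaf_sizes(forest[0]))
```

Each internal node stores its threshold together with its two child nodes, which is the information later needed to evaluate the dividing effect of that threshold.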
The determining the dividing effect of each threshold on the isolated tree of the template monitoring parameter according to the difference of the quantity, the numerical value and the abnormal score of all data contained by two nodes corresponding to each threshold on the isolated tree of the template monitoring parameter comprises the following steps: for any threshold value on the isolated tree of the template monitoring parameter, the dividing effect of the threshold value meets the relation:
where g represents the dividing effect of the threshold on the isolated tree of the template monitoring parameter; S1 and S2 represent the amount of data contained in the first node and in the second node, respectively, corresponding to the threshold on the isolated tree of the template monitoring parameter; S represents the amount of data contained in the sample of the template monitoring parameter; Z1 and Z2 represent the average value of the data contained in the first node and in the second node, respectively; Z represents the size of the value range of the template monitoring parameter; f1 and f2 represent the average value of the anomaly scores of the data contained in the first node and in the second node, respectively; | | denotes taking the absolute value; min() takes the minimum value; max() takes the maximum value; and exp() represents the exponential function with the natural constant as its base.
The method for calculating the anomaly score of the data is a known technique in an isolated forest anomaly detection algorithm, and will not be described herein.
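For reference, the anomaly score commonly used by the isolation forest algorithm is s = 2^(-E[h(x)] / c(n)), where E[h(x)] is the mean path length of datum x over the trees and c(n) = 2H(n-1) - 2(n-1)/n normalizes by the sample size n, with H the harmonic number. The sketch below assumes this standard definition; the source does not state which variant it uses, and the function names are illustrative.

```python
def average_path_length(n):
    """c(n): normalizing path length for a sample of n data."""
    if n <= 1:
        return 0.0
    harmonic = sum(1.0 / k for k in range(1, n))  # H(n - 1), computed exactly
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, n):
    """s = 2 ** (-E[h(x)] / c(n)); requires n >= 2. Values near 1 suggest an anomaly."""
    return 2.0 ** (-mean_path_length / average_path_length(n))


n = 200  # amount of data in one sample under the example sampling settings
print(anomaly_score(2.0, n), anomaly_score(10.0, n))
```

Data that are isolated after few divisions have short paths and therefore scores closer to 1, which is why abnormal data receive larger scores than normal data.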
Compared with normal data, abnormal data are few in number and have large anomaly scores, so abnormal data and normal data differ in number, value, and anomaly score. When an isolated tree is constructed, all data in a node are split by a threshold, and the aim is to separate abnormal data from normal data as far as possible. The dividing effect of a threshold is therefore measured by combining the differences in number, value, and anomaly score between all data contained in the two nodes corresponding to the threshold with the minimum of the amounts of data contained in the two nodes and the maximum of their mean anomaly scores. In the relation, the three difference terms represent, respectively, the difference in number, the difference in value, and the difference in anomaly score of all data contained in the two nodes; the larger these three differences are, the more likely it is that the data contained in the two nodes are normal data and abnormal data respectively, and the better the dividing effect of the threshold corresponding to the two nodes. The minimum term represents the smaller of the amounts of data contained in the two nodes. Because abnormal data are fewer than normal data, this minimum is more likely to be the amount of abnormal data; the smaller it is, the more likely the data in the corresponding node are abnormal data, and the better the dividing effect of the threshold corresponding to the two nodes. The maximum term represents the larger of the mean anomaly scores of all data contained in the two nodes. Because the anomaly score of abnormal data is larger than that of normal data, this maximum is more likely to be the mean anomaly score of abnormal data; the larger it is, the more likely the data in the corresponding node are abnormal data, and the better the dividing effect of the threshold corresponding to the two nodes.
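The quantities named in the paragraph above can be computed for a single threshold as in the following minimal sketch. It assumes that the data values and anomaly scores of the two nodes are already available; the function name, the dictionary keys, and the sample readings are hypothetical. The sketch only produces the individual quantities and does not encode how the relation combines them into the dividing effect.

```python
from statistics import mean
from typing import Dict, Sequence


def split_ingredients(left_values: Sequence[float],
                      left_scores: Sequence[float],
                      right_values: Sequence[float],
                      right_scores: Sequence[float]) -> Dict[str, float]:
    """Return the five quantities described for one threshold's two nodes."""
    s1, s2 = len(left_values), len(right_values)    # amounts of data
    z1, z2 = mean(left_values), mean(right_values)  # mean data values
    f1, f2 = mean(left_scores), mean(right_scores)  # mean anomaly scores
    return {
        "count_diff": abs(s1 - s2),      # difference in number
        "value_diff": abs(z1 - z2),      # difference in value
        "score_diff": abs(f1 - f2),      # difference in anomaly score
        "min_count": min(s1, s2),        # smaller node size
        "max_mean_score": max(f1, f2),   # larger mean anomaly score
    }


# Hypothetical split: nine ordinary readings in one node, one outlier in the other.
parts = split_ingredients(
    left_values=[220.0] * 9, left_scores=[0.4] * 9,
    right_values=[260.0], right_scores=[0.9],
)
```

In this hypothetical split the small node holds the single high-scoring reading, which is the situation the paragraph describes as a good dividing effect.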
S4: Determining the dividing effect fitting function of the template monitoring parameter.
Specifically, the dividing effects of all thresholds on the isolated trees of the template monitoring parameter are fitted by the least squares method to determine the dividing effect fitting function of the template monitoring parameter. The dividing effect fitting function is used to determine the dividing effect of each datum in the value range of the template monitoring parameter when that datum is used as a threshold: each datum in the value range of the template monitoring parameter is input into the dividing effect fitting function, and the output is the dividing effect of that datum as a threshold.
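As an illustration of the fitting step, the following minimal sketch fits a polynomial to (threshold, dividing effect) pairs by least squares and returns a callable H. The text specifies only the least squares method, so the polynomial form, the default degree of 2, and all names and sample values here are assumptions.

```python
from typing import Callable, Sequence


def fit_dividing_effect(thresholds: Sequence[float],
                        effects: Sequence[float],
                        degree: int = 2) -> Callable[[float], float]:
    """Least-squares polynomial fit H(x) to (threshold, effect) pairs."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in thresholds) for j in range(n)]
         for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(thresholds, effects)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution for the polynomial coefficients c0, c1, ..., c_degree.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return lambda x: sum(c * x ** k for k, c in enumerate(coeffs))


# Hypothetical thresholds and dividing effects lying on 1 + 2x - 0.5x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x - 0.5 * x * x for x in xs]
H = fit_dividing_effect(xs, ys)
```

Calling `H(x)` for any x in the value range then gives the fitted dividing effect of x as a threshold. The sketch needs at least `degree + 1` distinct thresholds; with fewer, the normal equations are singular.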
S5: Determining the probability of each datum in the value range of each monitoring parameter being used as a threshold.
Specifically, the probability of each datum in the value range of each monitoring parameter being used as a threshold is determined according to the correlation coefficient between each monitoring parameter and the template monitoring parameter and the dividing effect fitting function of the template monitoring parameter. For any monitoring parameter, the probability of each datum in its value range being used as a threshold satisfies the relation:
wherein p(x) represents the probability that data x in the value range of the monitoring parameter is used as a threshold, x represents the data in the value range of the monitoring parameter, L represents the correlation coefficient between the monitoring parameter and the template monitoring parameter, w represents the size of the value range of the monitoring parameter, Z represents the size of the value range of the template monitoring parameter, H() represents the dividing effect fitting function of the template monitoring parameter, and the remaining term represents, among all the data contained in the sample of the monitoring parameter, the ratio of the amount of data lying within the neighboring value range of x to the amount of all data contained in the sample of the monitoring parameter.
It should be noted that data x in the value range of the monitoring parameter is mapped to the corresponding data in the value range of the template monitoring parameter, and the dividing effect fitting function gives the dividing effect of that corresponding data as a threshold. The larger this value, the better the dividing effect of the corresponding data as a threshold, the better the dividing effect of data x as a threshold, and the more data x should be used as a threshold. The correlation coefficient between the monitoring parameter and the template monitoring parameter serves as the confidence with which the dividing effect of the corresponding data in the value range of the template monitoring parameter is taken as the dividing effect of data x in the value range of the monitoring parameter; the greater the correlation, the greater the confidence. The confidence and the dividing effect together determine the probability of data x being used as a threshold, and the larger the resulting value, the larger that probability. In addition, because abnormal data are few in number, the smaller the amount of data within the neighboring value range of data x, the more likely the data neighboring x are abnormal data, the more data x should be used as a threshold, and the larger the corresponding probability of data x being used as a threshold.
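The neighborhood ratio described above can be illustrated with the following minimal sketch. It covers only that ratio and a normalization of non-negative weights into probabilities. The half-width `radius` of the neighboring value range, the normalization step, and the sample readings are assumptions made for illustration.

```python
from typing import List, Sequence


def neighborhood_ratio(sample: Sequence[float], x: float, radius: float) -> float:
    """Fraction of sample data lying within [x - radius, x + radius]."""
    inside = sum(1 for v in sample if abs(v - x) <= radius)
    return inside / len(sample)


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical voltage readings with one isolated value.
sample = [220.0, 220.5, 221.0, 219.5, 220.2, 260.0]
dense = neighborhood_ratio(sample, 220.0, radius=1.0)   # many neighbors
sparse = neighborhood_ratio(sample, 240.0, radius=1.0)  # no neighbors
```

A candidate threshold such as 240.0, which sits in a sparsely populated part of the value range, has a small ratio and would therefore receive a larger probability under the reasoning above.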
S6: Constructing the isolated forest of each monitoring parameter.
Specifically, according to the probability of each datum in the value range of each monitoring parameter being used as a threshold, thresholds are randomly selected with unequal probability within the value range of each monitoring parameter, and the isolated forest of each monitoring parameter is constructed from the sample of that monitoring parameter, wherein the isolated forest of each monitoring parameter comprises a preset number of isolated trees.
It should be noted that, because thresholds are randomly selected with unequal probability according to the probability of each datum in the value range of each monitoring parameter being used as a threshold, a threshold with a good dividing effect is more likely to be selected and a threshold with a poor dividing effect is less likely to be selected. This improves the stability of the performance of the isolated forest of each monitoring parameter, makes the anomaly scores calculated by the isolated forest of each monitoring parameter more accurate, and in turn improves the accuracy of the safety monitoring of the power system.
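The unequal-probability construction can be sketched as follows, using only the Python standard library. The candidate thresholds, their probabilities, and all function names are hypothetical. Restricting each node to the candidates that actually split its data is an assumption of this sketch, as is the dictionary representation of a tree.

```python
import random
from typing import Any, Dict, List, Sequence


def build_tree(data: Sequence[float], candidates: Sequence[float],
               probs: Sequence[float], max_height: int,
               rng: random.Random, height: int = 0) -> Dict[str, Any]:
    """Build one isolated tree, drawing thresholds with unequal probability."""
    lo, hi = min(data), max(data)
    # Assumption: only candidates that split this node's data are eligible.
    usable = [(c, p) for c, p in zip(candidates, probs) if lo < c <= hi and p > 0]
    if height >= max_height or not usable:
        return {"size": len(data)}  # leaf node
    values, weights = zip(*usable)
    threshold = rng.choices(values, weights=weights, k=1)[0]
    left = [v for v in data if v < threshold]
    right = [v for v in data if v >= threshold]
    return {
        "threshold": threshold,
        "left": build_tree(left, candidates, probs, max_height, rng, height + 1),
        "right": build_tree(right, candidates, probs, max_height, rng, height + 1),
    }


def path_length(tree: Dict[str, Any], x: float) -> int:
    """Number of splits passed before x reaches a leaf."""
    depth = 0
    while "threshold" in tree:
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
        depth += 1
    return depth


def build_forest(data: Sequence[float], candidates: Sequence[float],
                 probs: Sequence[float], n_trees: int, max_height: int,
                 seed: int = 0) -> List[Dict[str, Any]]:
    """Build a preset number of isolated trees from one sample."""
    rng = random.Random(seed)
    return [build_tree(data, candidates, probs, max_height, rng)
            for _ in range(n_trees)]


# Hypothetical sample and threshold probabilities.
forest = build_forest([1.0, 2.0, 3.0, 10.0], [1.5, 2.5, 6.0], [0.1, 0.1, 0.8],
                      n_trees=5, max_height=8)
```

Here the candidate 6.0 carries most of the probability, so most trees separate the value 10.0 from the rest at the root, which gives it a short path.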
S7: Judging the safety of the power system from the anomaly scores of the data of each monitoring parameter at the moment to be monitored.
Specifically, for the data of each monitoring parameter at the moment to be monitored, the anomaly score is determined through the isolated forest of that monitoring parameter, and the safety of the power system is judged from these anomaly scores: if the mean of the anomaly scores of the data of all monitoring parameters at the moment to be monitored is greater than a preset threshold, the power system has a safety problem; if the mean is less than or equal to the preset threshold, the power system is safe.
The specific value of the preset threshold can be set according to the actual application scenario and requirements; here the preset threshold is set to 0.8.
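The decision rule of step S7 can be written as a short sketch. The function name, the parameter names in the example, and the score values are hypothetical; the default threshold of 0.8 is the value given above.

```python
from statistics import mean
from typing import Mapping


def has_safety_problem(scores: Mapping[str, float], threshold: float = 0.8) -> bool:
    """Return True when the mean anomaly score exceeds the preset threshold."""
    return mean(list(scores.values())) > threshold


# Hypothetical anomaly scores of three monitoring parameters at one moment.
alarm = has_safety_problem({"voltage": 0.9, "current": 0.85, "temperature": 0.7})
quiet = has_safety_problem({"voltage": 0.8, "current": 0.8})
```

The comparison is strict, so a mean exactly equal to the threshold is treated as safe, matching the "less than or equal to" branch of the rule.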
Fig. 2 is a block diagram schematically illustrating a multi-source data-based power system safety monitoring system according to the present invention.
The invention also provides a power system safety monitoring system based on multi-source data. As shown in Fig. 2, the system includes a processor and a memory storing computer program instructions that, when executed by the processor, implement the multi-source data-based power system safety monitoring method described above.
The system further comprises other components known to those skilled in the art, such as a communication bus and a communication interface, the arrangement and function of which are known in the art and are therefore not described in detail herein.
In the context of this patent, the foregoing memory may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, the computer-readable storage medium may be any suitable magnetic or magneto-optical storage medium, such as resistive random access memory (RRAM), dynamic random access memory (DRAM), static random access memory (SRAM), enhanced dynamic random access memory (EDRAM), high-bandwidth memory (HBM), or hybrid memory cube (HMC), or any other medium that can be used to store the desired information and that can be accessed by an application, a module, or both. Any such computer storage medium may be part of the device, or accessible by or connectable to it. Any of the applications or modules described herein may be implemented using computer-readable/executable instructions that are stored or otherwise maintained by such computer-readable media.
In the description of the present specification, "a plurality" or "a number" means at least two, for example two, three, or more, unless explicitly defined otherwise.
While various embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Many modifications, changes, and substitutions will now occur to those skilled in the art without departing from the spirit and scope of the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.

Claims (8)

1. A multi-source data based power system safety monitoring method, comprising:
collecting data of each monitoring parameter of the power system through a sensor, and taking all the data of each monitoring parameter collected in a preset sampling period as a sample of each monitoring parameter;
determining template monitoring parameters according to the correlation coefficients of the samples of each monitoring parameter and the samples of other monitoring parameters;
randomly selecting a threshold value in the value range of the template monitoring parameters at equal probability, and constructing an isolated forest of the template monitoring parameters according to the sample of the template monitoring parameters, wherein the isolated forest comprises a plurality of isolated trees; determining the dividing effect of each threshold value on the isolated tree of the template monitoring parameter according to the difference of the quantity, the numerical value and the abnormal score of all data contained by two nodes corresponding to each threshold value on the isolated tree of the template monitoring parameter;
fitting the dividing effects of all the thresholds on the isolated tree of the template monitoring parameters to determine a dividing effect fitting function of the template monitoring parameters;
determining the probability of taking each datum as a threshold value in the value range of each monitoring parameter according to the correlation coefficient of each monitoring parameter and the template monitoring parameter and the dividing effect fitting function of the template monitoring parameter;
according to the probability that each data in the value range of each monitoring parameter is used as a threshold value, randomly selecting the threshold value according to the unequal probability in the value range of each monitoring parameter, and constructing an isolated forest of each monitoring parameter according to the sample of each monitoring parameter;
and determining an abnormal score of the data of each monitoring parameter at the moment to be monitored through the isolated forest of each monitoring parameter, and judging the safety of the power system through the abnormal score.
2. The method for monitoring the safety of the power system based on the multi-source data according to claim 1, wherein the determining the template monitoring parameter according to the correlation coefficient between the sample of each monitoring parameter and the samples of other monitoring parameters comprises:
taking any one monitoring parameter as a target monitoring parameter, and taking an average value of correlation coefficients of the target monitoring parameter and other monitoring parameters as a comprehensive correlation degree of the target monitoring parameter; and taking the monitoring parameter with the maximum comprehensive association degree as a template monitoring parameter.
3. The method for monitoring the safety of the power system based on the multi-source data according to claim 1, wherein the method for constructing the isolated forest of the template monitoring parameters according to the samples of the template monitoring parameters comprises the following steps:
the isolated forest of the template monitoring parameter contains N isolated trees, wherein N represents a preset number; for any one isolated tree, the sample of the template monitoring parameter is put into the root node of the isolated tree, a threshold is randomly selected with equal probability within the value range of the template monitoring parameter, all data in the root node are divided into two groups by the threshold, and each group is taken as a child node of the root node; the dividing operation is repeated in each child node until the data in the child node can no longer be subdivided or the height of the isolated tree reaches a preset height, whereupon the dividing operation stops and the isolated tree is obtained.
4. The method for monitoring the safety of the power system based on the multi-source data according to claim 1, wherein the dividing effect of each threshold value on the isolated tree of the template monitoring parameter satisfies the relation:
where g represents the dividing effect of the threshold on the isolated tree of the template monitoring parameter; S1 and S2 represent the amounts of all data contained in the first node and the second node, respectively, corresponding to the threshold on the isolated tree of the template monitoring parameter; S represents the amount of all data contained in the sample of the template monitoring parameter; Z1 and Z2 represent the average values of all data contained in the first node and the second node, respectively, corresponding to the threshold on the isolated tree of the template monitoring parameter; Z represents the size of the value range of the template monitoring parameter; f1 and f2 represent the average values of the anomaly scores of all data contained in the first node and the second node, respectively, corresponding to the threshold on the isolated tree of the template monitoring parameter; | | denotes taking the absolute value, min() denotes taking the minimum value, max() denotes taking the maximum value, and exp() denotes the exponential function with the natural constant as its base.
5. The method for monitoring the safety of the power system based on the multi-source data according to claim 4, wherein the first node and the second node corresponding to the threshold value refer to a left child node and a right child node of two child nodes divided by the threshold value.
6. The method for monitoring the safety of the power system based on the multi-source data according to claim 1, wherein the probability of each data as a threshold value in the value range of each monitoring parameter satisfies the relation:
wherein p(x) represents the probability that data x in the value range of the monitoring parameter is used as a threshold, x represents the data in the value range of the monitoring parameter, L represents the correlation coefficient between the monitoring parameter and the template monitoring parameter, w represents the size of the value range of the monitoring parameter, Z represents the size of the value range of the template monitoring parameter, H() represents the dividing effect fitting function of the template monitoring parameter, and the remaining term represents, among all the data contained in the sample of the monitoring parameter, the ratio of the amount of data lying within the neighboring value range of x to the amount of all data contained in the sample of the monitoring parameter.
7. The method for monitoring the safety of the power system based on the multi-source data according to claim 1, wherein the judging the safety of the power system by the abnormality score comprises:
if the average value of the abnormal scores of the data of all the monitoring parameters at the moment to be monitored is larger than a preset threshold, the power system has a safety problem, and if the average value of the abnormal scores of the data of all the monitoring parameters at the moment to be monitored is smaller than or equal to the preset threshold, the power system is safe.
8. A multi-source data based power system safety monitoring system, comprising:
a processor;
a memory storing computer program instructions which, when executed by the processor, implement the multi-source data based power system safety monitoring method according to any one of claims 1-7.
CN202410272154.XA 2024-03-11 2024-03-11 Power system safety monitoring method and system based on multi-source data Active CN117874653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410272154.XA CN117874653B (en) 2024-03-11 2024-03-11 Power system safety monitoring method and system based on multi-source data

Publications (2)

Publication Number Publication Date
CN117874653A (en) 2024-04-12
CN117874653B (en) 2024-05-31

Family

ID=90594983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410272154.XA Active CN117874653B (en) 2024-03-11 2024-03-11 Power system safety monitoring method and system based on multi-source data

Country Status (1)

Country Link
CN (1) CN117874653B (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846806A (en) * 2017-03-07 2017-06-13 北京工业大学 Urban highway traffic method for detecting abnormality based on Isolation Forest
JP2019020124A (en) * 2017-07-11 2019-02-07 富士通株式会社 Abnormality detection program, abnormality detection method, and information processing apparatus
CN108776683A (en) * 2018-06-01 2018-11-09 广东电网有限责任公司 A kind of electric power operation/maintenance data cleaning method based on isolated forest algorithm and neural network
CN110149258A (en) * 2019-04-12 2019-08-20 北京航空航天大学 A kind of automobile CAN-bus network data method for detecting abnormality based on isolated forest
WO2021109314A1 (en) * 2019-12-06 2021-06-10 网宿科技股份有限公司 Method, system and device for detecting abnormal data
WO2021114821A1 (en) * 2019-12-12 2021-06-17 支付宝(杭州)信息技术有限公司 Isolation forest model construction and prediction method and device based on federated learning
CN111666169A (en) * 2020-05-13 2020-09-15 云南电网有限责任公司信息中心 Improved isolated forest algorithm and Gaussian distribution-based combined data anomaly detection method
CN111951116A (en) * 2020-08-26 2020-11-17 江苏云脑数据科技有限公司 Medical insurance anti-fraud monitoring and analyzing method and system based on unsupervised isolated point detection
CN113688870A (en) * 2021-07-22 2021-11-23 国网江苏省电力有限公司营销服务中心 Group renting house identification method based on user electricity utilization behavior by adopting hybrid algorithm
CN113435547A (en) * 2021-08-27 2021-09-24 中国环境监测总站 Water quality index fusion data anomaly detection method and system
CN114707571A (en) * 2022-02-24 2022-07-05 南京审计大学 Credit data anomaly detection method based on enhanced isolation forest
WO2023169098A1 (en) * 2022-03-10 2023-09-14 东南大学 Isolation forest-based method for diagnosing open-circuit fault of modular multilevel converter
CN115061838A (en) * 2022-03-28 2022-09-16 京东科技信息技术有限公司 Fault detection method and system
CN115099335A (en) * 2022-06-23 2022-09-23 广东电网有限责任公司广州供电局 Abnormal identification and feature screening method and system for multi-source heterogeneous data
CN117031509A (en) * 2023-08-01 2023-11-10 南京林业大学 Soil humidity inversion method and device integrating isolated forest and deep learning
CN116776258A (en) * 2023-08-24 2023-09-19 北京天恒安科集团有限公司 Power equipment monitoring data processing method and system
CN117411811A (en) * 2023-12-15 2024-01-16 山西思极科技有限公司 Intelligent fault monitoring method for power communication equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
E. KHALEDIAN ET AL.: "Real-Time Synchrophasor Data Anomaly Detection and Classification Using Isolation Forest, KMeans, and LoOP", 《IEEE TRANSACTIONS ON SMART GRID》, vol. 12, no. 3, 31 December 2020 (2020-12-31), pages 2378 - 2388, XP011850451, DOI: 10.1109/TSG.2020.3046602 *
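祝诚勇 et al.: "Generalized isolation forest anomaly detection algorithm based on expert feedback", 《计算机应用研究》 (Application Research of Computers), vol. 41, no. 1, 31 January 2024 (2024-01-31), pages 88 - 93 *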

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118070200A (en) * 2024-04-19 2024-05-24 天津市第五中心医院 Big data-based organoid abnormality monitoring system
CN118070200B (en) * 2024-04-19 2024-07-05 天津市第五中心医院 Big data-based organoid abnormality monitoring system

Also Published As

Publication number Publication date
CN117874653B (en) 2024-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant