CN117332322A - Boundary region importance sampling method - Google Patents

Boundary region importance sampling method

Info

Publication number
CN117332322A
Authority
CN
China
Prior art keywords
boundary region
sampling
sample
equation
sphere
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311072657.4A
Other languages
Chinese (zh)
Inventor
刘颂凯
陈浩
刘峻良
晏光辉
张涛
李文武
李欣
郭攀锋
刁良涛
江进波
曹成
王丰
李丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges University CTGU
Priority to CN202311072657.4A
Publication of CN117332322A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Supply And Distribution Of Alternating Current (AREA)
  • Complex Calculations (AREA)

Abstract

A boundary region importance sampling method, comprising the steps of: step (1) determining a boundary region using information entropy; step (2) constructing an efficient sampling scheme using a sampling method based on the Monte Carlo variance reduction (MCVR) technique, in which a bias is introduced into the sampling process so that rare events are better represented in the evaluation stage. Through these steps, an offline training sample set is generated efficiently.

Description

Boundary region importance sampling method
Technical Field
The invention relates to the field of transient security assessment of power systems, and in particular to a boundary region importance sampling method. It is a divisional application of the invention patent entitled "Transient security assessment method based on boundary region importance sampling and a core vector machine" (application number 2020103010104).
Background
On the one hand, as power systems have become interconnected, a large number of devices, such as smart meters and new energy sources, have been connected to the grid. As the grid continues to expand, its complexity also increases, which poses a significant challenge to its safe operation. Meanwhile, with the continuous development of the national economy and society, the requirements for safe and stable operation and for power supply reliability keep rising. The reform of China's electric power system continues to deepen, and the system is developing toward long-distance, extra-high-voltage transmission. New loads such as large-scale energy storage elements and electric vehicle charging piles are continuously being connected, and cross-regional high-capacity tie-line transmission systems are gradually being put into operation, so the stability and dispatching of the power system face serious challenges. To avoid the huge economic losses and social impact caused by nationwide power outages, transient stability assessment plays an important role in analyzing and judging the dynamic behavior of the system.
On the other hand, time-domain simulation, direct methods (including the Lyapunov method and the transient energy function method) and the extended equal-area criterion are the mainstream methods for transient stability assessment of power systems. These methods can provide real-time or near-real-time assessment, but they leave room for improvement in computational accuracy, speed and capacity. Time-domain simulation is slow, cannot provide a stability margin, and is difficult to apply to real-time online analysis. The direct method and the extended equal-area criterion can provide the stability margin of the system, but they are limited to simple models and cannot fully meet the requirements of online calculation. Although synchronized phasor measurement units are increasingly deployed in power systems, these methods cannot exploit the large volume of phasor measurement data for real-time online calculation. Meanwhile, with the maturing of wide-area measurement technology and the development of big data theory, machine learning has become one of the main approaches to online stability assessment of power systems. However, conventional machine learning methods still have a number of drawbacks, for example: training sample sets are generated inefficiently; the evaluation results are hard to assess; transient security information cannot be visualized; and training takes too long to suit large-scale data. In addition, contingencies often occur in power systems in actual operation, and conventional transient security assessment models have difficulty evaluating them.
In summary, the conventional method is difficult to adapt to the practical requirement of the modern power grid with high-speed development on real-time transient security evaluation, and a real-time evaluation method capable of meeting high adaptability and high precision is urgently needed.
The patent document with granted publication number CN104881741A discloses a power system transient stability judging method based on a support vector machine (SVM), in which the input feature set of the SVM is determined by round-by-round optimization and a transient stability assessment rule is then established through the SVM. The method first determines the candidate set of input vectors, the number of input vector elements, the kernel function of the SVM and the training parameters; it then generates training and test samples, adds the candidate features to the input feature set one at a time, trains the SVM, and identifies the feature that yields the highest SVM classification accuracy; it next judges whether feature selection is finished and outputs the input feature set; finally, it trains the SVM and obtains the stability rule. However, this method has the following drawbacks:
(1) A power system security assessment method requires a large sample set to train or test its performance, and generating such a sample set is very difficult, even for small-scale power systems. Determining the input feature set of the SVM by round-by-round optimization is therefore quite time consuming.
(2) Compared with the SVM, the core vector machine (CVM) has higher accuracy and lower time and space complexity, and thus higher efficiency.
Disclosure of Invention
The invention aims to provide a method that improves assessment speed and accuracy, so that it is highly applicable to transient security assessment of power systems, helps system operators take preventive control measures in time, and improves the stability of safe grid operation.
The aim of the invention is achieved as follows:
A transient security assessment method based on boundary region importance sampling and a core vector machine, comprising the following steps:
step one): obtaining system operation samples from historical operation data of the power system and from simulation of a series of power system faults, constructing a dynamic security index, and establishing a corresponding sample database;
step two): sampling the sample database with a boundary region importance sampling method to efficiently generate an offline training sample set, and normalizing the sample set;
step three): based on the sample set and in combination with the CVM, constructing a transient security assessment model of the power system, and training and updating the model offline using the sample set;
step four): based on real-time operation data of the power system, assessing the real-time transient security state of the power system with the continuously updated assessment model to obtain a transient security assessment result.
In step one), detailed power flow analysis and time-domain simulation are carried out based on historical operation data and an anticipated contingency set of the power system to obtain system operation samples, and a corresponding sample database is established.
Time-domain simulation is performed with PSS/E software to obtain the critical clearing time (CCT) of each fault location in each operating state. Typically, when the CCT is greater than the actual clearing time (ACT), the operating state of the system is judged to be safe. A transient security index, the transient stability margin (TSM), is therefore constructed as shown in equation (1):
where: $CCT_i$ is the critical clearing time at a given location of the power system under contingency $i$; $ACT_i$ is the actual clearing time of the fault under contingency $i$; and $TSM_i$ is the transient stability margin for that location. The definition of TSM is shown in equation (2):
In step two), the boundary region importance sampling method applied to the established sample database is divided into the following two steps:
1) Information entropy is used to determine the boundary region, as shown in equation (3):
$$E(S) = -\sum_{i=1}^{C} p_i \log_2 p_i \qquad (3)$$
where: $S$ is the sample data set; $C$ is the number of classes; $p_i$ is the proportion of $S$ classified as class $i$. Entropy provides a measure of the purity of the sample database. The larger the value of $E(S)$, the lower the purity, i.e., the richer the information content; a region where the entropy is relatively large is therefore defined as the boundary region. The boundary region is approximately determined in this way.
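The entropy criterion of step 1) can be illustrated with a minimal Python sketch. It assumes the standard base-2 Shannon entropy over class labels and, as a purely illustrative choice not taken from the patent, locates the boundary region by splitting a single operating variable into equal-width bins; the names `entropy`, `high_entropy_bins`, `n_bins` and `threshold` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def high_entropy_bins(values, labels, n_bins=10, threshold=0.5):
    """Split one operating variable into equal-width bins and return the
    (low, high) edges of the bins whose label entropy exceeds `threshold`.
    Binning and the threshold are illustrative assumptions."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    bins = [[] for _ in range(n_bins)]
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), n_bins - 1)
        bins[idx].append(y)
    return [(lo + i * width, lo + (i + 1) * width)
            for i, b in enumerate(bins) if entropy(b) > threshold]
```

Bins that mix safe and unsafe samples have entropy close to 1 and would be treated as the approximate boundary region; pure bins have entropy 0 and are left out.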
2) An efficient sampling scheme is constructed using a sampling method based on the Monte Carlo variance reduction (MCVR) technique. A bias is introduced into the sampling process so that rare events are better represented in the evaluation stage. With the boundary region approximately determined in step 1), the MCVR-based sampling method biases the sampling process toward the boundary region. An offline training sample set is thereby obtained.
The efficiently generated training sample set is normalized to reduce the computational burden, as shown in equation (4):
$$x_i^{*} = \frac{x_i - x_{i\_min}}{x_{i\_max} - x_{i\_min}} \qquad (4)$$
where: $x_i^{*}$ is the normalized value of an operating variable; $x_i$ is the original value of that variable; $x_{i\_min}$ is the minimum value of the variable in the acquired samples; $x_{i\_max}$ is the maximum value of the variable in the acquired samples. In this way the values of all variables range from 0 to 1.
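The normalization described above corresponds to ordinary min-max scaling. The following Python sketch shows one way to apply it column by column; the function name `min_max_normalize` and the handling of a constant column (mapped to 0) are assumptions made for illustration.

```python
def min_max_normalize(samples):
    """Scale every column of `samples` (a list of equal-length rows) to [0, 1]."""
    columns = list(zip(*samples))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    normalized = []
    for row in samples:
        normalized.append([
            # A constant column has no range; mapping it to 0 is an assumption.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return normalized
```

For example, `min_max_normalize([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])` maps each variable's minimum to 0 and maximum to 1.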
A power system security assessment method requires a large sample set to train or test its performance. Because historical data usually contain only a limited number of abnormal events and information near the boundary region is often missing, simulation data are needed. Generating such a sample set is very difficult, even for small-scale power systems. With the boundary region importance sampling method, sampling is biased mainly toward the boundary region, so that the safe and unsafe regions can be mapped. The resulting sample set is rich in information yet small in volume, which makes training faster and prediction more accurate.
In step three), the efficiently generated sample set is input into the training model. Through a feature mapping, the CVM projects the sample set $S$ into a high-dimensional space, constructs a minimum enclosing ball (MEB), and solves the MEB problem using the CVM algorithm. $S_t$, $c_t$ and $R_t$ denote the core set, the ball center and the radius at iteration $t$, respectively. The center and radius of a ball $B$ are denoted $c_B$ and $r_B$. Given a positive number $\varepsilon$, the offline training process is as follows:
1) Initialize $S_0$, $c_0$ and $R_0$:
Select an arbitrary point $z \in S$ to initialize $S_0 = \{z\}$. In the feature space, find the point $z_a \in S$ furthest from $z$, and then find another point $z_b \in S$ furthest from $z_a$. The initial core set is $S_0 = \{z_a, z_b\}$, from which the initial ball center $c_0$ and the initial radius $R_0$ are determined.
2) If no point falls outside the $(1+\varepsilon)$-ball, the algorithm ends. Otherwise, the core set is updated to $S_{t+1} = S_t \cup \{z\}$, where $z$ is the point furthest from $c_t$;
3) Search for the new MEB:
The new $MEB(S_{t+1})$ is found for the core set given by step 2), and its center and radius can be obtained from the solution of the corresponding optimization problem, where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]'$ is the vector of Lagrange multipliers and $K$ is the kernel matrix. Then return to step 2) for the next iteration.
Through the steps, an offline training model can be obtained.
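The core-set iteration in steps 1) to 3) can be sketched in Python as follows. This is a simplified illustration and not the patent's CVM implementation: it assumes an identity feature map (plain Euclidean distance instead of a kernel), and it replaces the quadratic program that yields the Lagrange multipliers with Badoiu-Clarkson center updates. All function names are hypothetical.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def furthest(points, ref):
    """Return the point in `points` furthest from `ref`."""
    return max(points, key=lambda p: dist(p, ref))


def approx_ball(core, n_iter=200):
    """Approximate the minimum enclosing ball of the core set with
    Badoiu-Clarkson updates (a stand-in for the QP used by the CVM)."""
    c = list(core[0])
    for k in range(1, n_iter + 1):
        p = furthest(core, c)
        c = [ci + (pi - ci) / (k + 1) for ci, pi in zip(c, p)]
    radius = max(dist(p, c) for p in core)  # ball encloses every core point
    return c, radius


def core_set_meb(points, eps=0.05):
    """Core-set loop following steps 1)-3), with an identity feature map."""
    z = points[0]                   # arbitrary starting point
    z_a = furthest(points, z)       # point furthest from z
    z_b = furthest(points, z_a)     # point furthest from z_a
    core = [z_a, z_b]
    c, r = approx_ball(core)
    while True:
        z = furthest(points, c)
        if dist(z, c) <= (1 + eps) * r:  # nothing outside the (1+eps)-ball
            return core, c, r
        core.append(z)                   # S_{t+1} = S_t U {z}
        c, r = approx_ball(core)
```

The loop stops as soon as every sample lies inside the $(1+\varepsilon)$-ball, so the returned core set is typically much smaller than the full sample set, which is the source of the time and space savings attributed to the CVM.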
Various factors that may affect the transient security state of the power system are considered comprehensively, including topology changes, generator power changes, load power changes and other changes in operating conditions. To account for them, a sample set updated in near real time is obtained and used to update the offline-trained model, yielding an updated transient security assessment model.
In step four), the synchronized phasor measurement units and the wide-area monitoring system collect the operating variables of the power system in real time; based on these real-time data, the updated transient security assessment model predicts the transient security state of the power system, giving an online transient security assessment result.
A boundary region importance sampling method, comprising the steps of:
step (1): determining a boundary region using information entropy;
step (2): constructing an efficient sampling scheme using a sampling method based on the MCVR technique, in which a bias is introduced into the sampling process so that rare events are better represented in the evaluation stage.
In step (1), information entropy is used to determine the boundary region, as shown in equation (5):
$$E(S) = -\sum_{i=1}^{C} p_i \log_2 p_i \qquad (5)$$
where: $S$ is the sample data set, $C$ is the number of classes, and $p_i$ is the proportion of $S$ classified as class $i$. Entropy provides a measure of the purity of the sample database; the larger the value of $E(S)$, the lower the purity, i.e., the richer the information content. A region where the entropy is relatively large is therefore defined as the boundary region, and the boundary region is approximately determined in this way.
In step (2), an efficient sampling scheme is constructed using the MCVR-based sampling method, and a bias is introduced into the sampling process so that rare events are better represented in the evaluation stage. With the boundary region approximately determined in step (1), the MCVR-based sampling method biases the sampling process toward the boundary region. The method comprises the following two steps:
1) Variance reduction by importance sampling:
The probability of an unacceptable event, i.e., P(Y is an unacceptable event), is defined as shown in equation (6):
where: $Y = t$ is the threshold and $Y < t$ indicates unacceptable performance. The indicator function $I(Y)$ can then be defined as shown in equation (7):
Equation (6) can thus be rewritten as shown in equation (8):
This expectation gives a crude Monte Carlo estimate, where $y_i$ is a Monte Carlo sample drawn from the distribution $f(y)$. The estimate has a variance associated with it because the value of $h(y_i)$ varies with $y_i$. The variance of the estimate is reduced by reformulating the expectation as shown in equation (9):
where: $y_i$ is a Monte Carlo sample drawn from the distribution $g(y)$, which is chosen so that the value of each term in the estimate is almost the same for every $y_i$;
2) Efficient generation of training samples:
The first-stage operation provides the boundary region in which $X$ is most likely to occur, which determines the region of the $X$-space where the biased samples should be generated. In terms of the indicator function, the sampling region is as shown in equation (10):
where: $S$ is the boundary region; for example, in the univariate case, $S = \{x : x_1 \le x \le x_2\}$. The sampling distribution function $g_X(x)$ can be constructed as $|h(x)| f(x)$, with $f_X(x)$ taken over $S$, and the importance sampling density is expressed as shown in equation (11):
where: $k_1$ and $k_2$ are bias factors satisfying the probability condition $k_1 + k_2 = 1$; $f_{1X}(x)$ is the probability density function of the boundary region; and $f_{2X}(x)$ is the probability distribution function outside the boundary region. When $k_1 = 1$, the sampling distribution function $g_X(x)$ is completely biased toward the boundary region, and the state-space probability distribution conditional on the boundary region is as shown in equations (12) and (13):
$$a = \int_{S} f_X(x)\,dx \qquad (13)$$
where: $a$ is a scaling factor satisfying $0 \le a \le 1$. These equations change the probability distribution so that more data come from the boundary region, yielding the offline training sample set.
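The biased sampling of stage 2) can be sketched as drawing from a two-component mixture: with probability $k_1$ a record is taken from the boundary region and with probability $k_2 = 1 - k_1$ from outside it. The Python sketch below assumes that the sample database is a list of records already drawn from the original distribution, so that a uniform draw within a region approximates the conditional density of that region; `biased_sample`, `in_boundary` and the default `k1=0.8` are hypothetical choices for illustration.

```python
import random


def biased_sample(database, in_boundary, n_samples, k1=0.8, seed=0):
    """Draw `n_samples` records; with probability k1 a draw comes from the
    boundary region S, otherwise from outside it (k2 = 1 - k1)."""
    rng = random.Random(seed)
    inside = [rec for rec in database if in_boundary(rec)]
    outside = [rec for rec in database if not in_boundary(rec)]
    if not inside or not outside:
        raise ValueError("both regions must contain at least one record")
    picked = []
    for _ in range(n_samples):
        pool = inside if rng.random() < k1 else outside
        picked.append(rng.choice(pool))  # sampling with replacement
    return picked
```

Setting `k1=1.0` reproduces the fully biased case described above, in which every training sample comes from the boundary region.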
The above technical scheme brings the following technical effects:
(1) Using the concept of information entropy, information-rich regions can be identified, so that the boundary region can be approximately determined;
(2) Using the MCVR-based sampling method, sampling can be biased toward the boundary region, producing an efficient sample set that is rich in information yet small in volume, which speeds up offline training;
(3) Based on the efficiently generated sample set, a transient stability assessment model of the power system is built in combination with the CVM; its assessment results are more accurate and take less time.
Drawings
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of a sampling method according to the present invention;
FIG. 3 is a diagram of an IEEE 39 node system topology in accordance with an embodiment of the invention;
FIG. 4 is a graph comparing data processing speeds for four different models tested by an embodiment of the present invention;
FIG. 5 is a graph comparing accuracy of model evaluation using three different sampling methods tested by an embodiment of the present invention.
Detailed Description
The transient security assessment method based on boundary region importance sampling and a core vector machine, as shown in fig. 1, specifically comprises the following steps:
step one): obtaining system operation samples from historical operation data of the power system and from simulation of a series of power system faults, constructing a dynamic security index, and establishing a corresponding sample database;
step two): sampling the sample database with a boundary region importance sampling method to efficiently generate an offline training sample set, and normalizing the sample set;
step three): based on the sample set and in combination with the CVM, constructing a transient security assessment model of the power system, and training and updating the model offline using the sample set;
step four): based on real-time operation data of the power system, assessing the real-time transient security state of the power system with the continuously updated assessment model to obtain a transient security assessment result.
In step one), detailed power flow analysis and time-domain simulation are carried out based on historical operation data and an anticipated contingency set of the power system to obtain system operation samples, and a corresponding sample database is established.
Time-domain simulation is performed with PSS/E software to obtain the CCT of each fault location in each operating state. Typically, when the CCT is greater than the ACT, the operating state of the system is judged to be safe. A transient security index, the TSM, is therefore constructed as shown in equation (1):
where: $CCT_i$ is the critical clearing time at a given location of the power system under contingency $i$; $ACT_i$ is the actual clearing time of the fault under contingency $i$; and $TSM_i$ is the transient stability margin for that location. The definition of TSM is shown in equation (2):
In step two), a boundary region importance sampling method is applied to the established sample database, as shown in fig. 2, specifically including the following steps:
1) Information entropy is used to determine the boundary region, as shown in equation (3):
$$E(S) = -\sum_{i=1}^{C} p_i \log_2 p_i \qquad (3)$$
where: $S$ is the sample data set, $C$ is the number of classes, and $p_i$ is the proportion of $S$ classified as class $i$. Entropy provides a measure of the purity of the sample database. The larger the value of $E(S)$, the lower the purity, i.e., the richer the information content; a region where the entropy is relatively large is therefore defined as the boundary region. The boundary region is approximately determined in this way.
2) An efficient sampling scheme is constructed using a sampling method based on the MCVR technique. A bias is introduced into the sampling process so that rare events are better represented in the evaluation stage. With the boundary region approximately determined in step 1), the MCVR-based sampling method biases the sampling process toward the boundary region. The method comprises the following two steps:
1) Variance reduction by importance sampling:
The probability of an unacceptable event, i.e., P(Y is an unacceptable event), is defined as shown in equation (6):
where: $Y = t$ is the threshold and $Y < t$ indicates unacceptable performance. The indicator function $I(Y)$ can then be defined as shown in equation (7):
Equation (6) can thus be rewritten as shown in equation (8):
This expectation gives a crude Monte Carlo estimate, where $y_i$ is a Monte Carlo sample drawn from the distribution $f(y)$. The estimate has a variance associated with it because the value of $h(y_i)$ varies with $y_i$. The variance of the estimate is reduced by reformulating the expectation as shown in equation (9):
where: $y_i$ is a Monte Carlo sample drawn from the distribution $g(y)$, which is chosen so that the value of each term in the estimate is almost the same for every $y_i$.
2) Efficient generation of training samples:
The first-stage operation provides the boundary region in which $X$ is most likely to occur, which determines the region of the $X$-space where the biased samples should be generated. In terms of the indicator function, the sampling region is as shown in equation (10):
where: $S$ is the boundary region; for example, in the univariate case, $S = \{x : x_1 \le x \le x_2\}$. The sampling distribution function $g_X(x)$ can be constructed as $|h(x)| f(x)$, with $f_X(x)$ taken over $S$, and the importance sampling density is expressed as shown in equation (11):
where: $k_1$ and $k_2$ are bias factors satisfying the probability condition $k_1 + k_2 = 1$; $f_{1X}(x)$ is the probability density function of the boundary region; and $f_{2X}(x)$ is the probability distribution function outside the boundary region. When $k_1 = 1$, the sampling distribution function $g_X(x)$ is completely biased toward the boundary region, and the state-space probability distribution conditional on the boundary region is as shown in equations (12) and (13):
$$a = \int_{S} f_X(x)\,dx \qquad (13)$$
where: $a$ is a scaling factor satisfying $0 \le a \le 1$. These equations change the probability distribution so that more data come from the boundary region, yielding the offline training sample set.
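The variance reduction idea of stage 1) can be illustrated numerically. The Python sketch below is not taken from the patent: it assumes a standard normal $f(y)$, a threshold $t = -3$, and a proposal $g(y)$ that is a unit-variance normal centered at $t$. It compares the crude Monte Carlo estimate of $P(Y < t)$ with the importance sampling estimate, in which each sample drawn from $g$ is weighted by $f(y_i)/g(y_i)$.

```python
import math
import random


def normal_pdf(y, mu=0.0, sigma=1.0):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def crude_estimate(t, n, rng):
    """Crude Monte Carlo: average of the indicator h(y) with y ~ f = N(0, 1)."""
    return sum(1.0 for _ in range(n) if rng.gauss(0.0, 1.0) < t) / n


def importance_estimate(t, n, rng):
    """Importance sampling: y ~ g = N(t, 1), each hit weighted by f(y)/g(y)."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(t, 1.0)
        if y < t:
            total += normal_pdf(y) / normal_pdf(y, mu=t)
    return total / n


rng = random.Random(42)
t = -3.0
exact = 0.5 * math.erfc(-t / math.sqrt(2))  # P(Y < t) for a standard normal
print(exact, crude_estimate(t, 10_000, rng), importance_estimate(t, 10_000, rng))
```

Because roughly half of the samples drawn from $g$ fall below the threshold, the weighted estimate uses far more informative samples than the crude estimate, in which only about one sample in a thousand hits the rare event.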
The efficiently generated training sample set is normalized to reduce the computational burden, as shown in equation (4):
$$x_i^{*} = \frac{x_i - x_{i\_min}}{x_{i\_max} - x_{i\_min}} \qquad (4)$$
where: $x_i^{*}$ is the normalized value of an operating variable; $x_i$ is the original value of that variable; $x_{i\_min}$ is the minimum value of the variable in the acquired samples; $x_{i\_max}$ is the maximum value of the variable in the acquired samples. In this way the values of all variables range from 0 to 1.
In step three), the efficiently generated sample set is input into the training model. Through a feature mapping, the CVM projects the sample set $S$ into a high-dimensional space, constructs $MEB(S)$, and solves the MEB problem using the CVM algorithm. $S_t$, $c_t$ and $R_t$ denote the core set, the ball center and the radius at iteration $t$, respectively. The center and radius of a ball $B$ are denoted $c_B$ and $r_B$. Given a positive number $\varepsilon$, the offline training process is as follows:
1) Initialize $S_0$, $c_0$ and $R_0$:
Select an arbitrary point $z \in S$ to initialize $S_0 = \{z\}$. In the feature space, find the point $z_a \in S$ furthest from $z$, and then find another point $z_b \in S$ furthest from $z_a$. The initial core set is $S_0 = \{z_a, z_b\}$, from which the initial ball center $c_0$ and the initial radius $R_0$ are determined.
2) If no point falls outside the $(1+\varepsilon)$-ball, the algorithm ends. Otherwise, the core set is updated to $S_{t+1} = S_t \cup \{z\}$, where $z$ is the point furthest from $c_t$;
3) Search for the new MEB:
The new $MEB(S_{t+1})$ is found for the core set given by step 2), and its center and radius can be obtained from the solution of the corresponding optimization problem, where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]'$ is the vector of Lagrange multipliers and $K$ is the kernel matrix. Then return to step 2) for the next iteration.
Through the steps, an offline training model can be obtained.
Various factors that may affect the transient security state of the power system are considered comprehensively, including topology changes, generator power changes, load power changes and other changes in operating conditions. To account for them, a sample set updated in near real time is obtained and used to update the offline-trained model, yielding an updated transient security assessment model.
In step four), the synchronized phasor measurement units and the wide-area monitoring system collect the operating variables of the power system in real time; based on these real-time data, the updated transient security assessment model predicts the transient security state of the power system, giving an online transient security assessment result.
Examples:
The embodiment uses the IEEE 39-node system. As shown in fig. 3, the test system comprises 39 nodes, 10 generators and 46 transmission lines. The base power is 100 MVA and the base voltage is 345 kV. Phasor measurement units are assumed to be installed on all buses so that a large data set can be collected. To generate a reasonable data set, the operating conditions of the test system are varied randomly. Ten load levels are considered (80%, 85%, 90%, 95%, 100%, 105%, 110%, 115%, 120%, 125%), with generator output changed correspondingly. On this basis, the power flow of the system is solved with load and generation varied accordingly. The contingencies considered are mainly three-phase ground faults on each bus and at three locations on each transmission line (25%, 50% and 75% of the line length). The simulation assumes that each fault occurs at 0.1 s and is cleared at 0.3 s (or 0.35 s or 0.4 s). The generators use a fourth-order model and the loads a constant-impedance model. A total of 6310 samples were obtained, from which 1890 samples were drawn for testing with the boundary region importance sampling method. 10-fold cross-validation was applied to the 1890 samples, and each validation was repeated 10 times.
Four different models were trained and tested: the SVM, core vector data description (CVDD), the ball vector machine (BVM) and the CVM. The four assessment models were evaluated comprehensively using the confusion matrix shown in table 1. In the table, class = 1 and class = 0 denote stable and unstable, respectively. $f_{11}$ indicates that the actual and predicted states agree and the system is stable. $f_{00}$ indicates that the actual and predicted states agree and the system is unstable. $f_{10}$ indicates that the prediction is unstable but the system is actually stable. $f_{01}$ indicates that the prediction is stable but the system is actually unstable.
The accuracy AC, the missed alarm rate FD and the false alarm rate FA are used as evaluation indices of classification performance.
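TABLE 1
Actual state | Predicted class = 1 | Predicted class = 0
Class = 1 | f_11 | f_10
Class = 0 | f_01 | f_00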
The performance test results for the four models are given in table 2 and fig. 4. As table 2 shows, the CVM model has the highest accuracy AC and the lowest missed alarm rate FD and false alarm rate FA. Fig. 4 gives the data processing time of the four models; the CVM model takes the least time. The CVM model therefore offers higher accuracy than the other three models, with lower time and space complexity and higher efficiency.
TABLE 2
Model AC(%) FD(%) FA(%)
SVM 77.85 13.29 8.86
CVDD 79.11 12.53 8.36
BVM 83.54 9.88 6.58
CVM 93.04 3.83 3.13
Fig. 5 shows the results of a further study comparing the evaluation accuracy of the model under three sampling methods: sampling from the entire state space according to its probability distribution, uniform sampling, and boundary region importance sampling. The boundary region importance sampling method maintains high accuracy even when the amount of data is reduced.
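The difference between the three sampling methods can be illustrated with a small one-dimensional sketch. Everything in it is an assumption made for illustration: the state variable follows a standard normal density, the boundary region is the hypothetical interval [1.5, 2.5], and the bias factor toward the boundary region is 0.8. The sketch only counts how many samples each method places inside the boundary region; it does not reproduce the accuracy results of Fig. 5.

```python
import random

X1, X2 = 1.5, 2.5  # hypothetical boundary region S = [X1, X2]


def in_boundary(x):
    return X1 <= x <= X2


def sample_natural(rng):
    # Draw from the assumed state-space density f_X (standard normal).
    return rng.gauss(0.0, 1.0)


def sample_uniform(rng, lo=-4.0, hi=4.0):
    # Uniform sampling over an assumed bounded state space.
    return rng.uniform(lo, hi)


def sample_biased(rng, k1=0.8):
    # Mixture g_X = k1*f_1X + k2*f_2X with k2 = 1 - k1, realised by
    # rejection: redraw from f_X until the draw lies on the wanted side.
    want_inside = rng.random() < k1
    while True:
        x = rng.gauss(0.0, 1.0)
        if in_boundary(x) == want_inside:
            return x


def boundary_fraction(sampler, n, rng):
    # Share of n samples that fall inside the boundary region.
    return sum(in_boundary(sampler(rng)) for _ in range(n)) / n


rng = random.Random(0)
for name, sampler in [("natural", sample_natural),
                      ("uniform", sample_uniform),
                      ("boundary-biased", sample_biased)]:
    print(name, round(boundary_fraction(sampler, 5000, rng), 3))
```

With these settings the biased sampler places roughly a fraction k1 of its samples in the boundary region, whereas the other two place only the share implied by their own densities, so for a fixed sample budget far more training points lie near the stability boundary.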
These results demonstrate the effectiveness of the transient security assessment model based on boundary region importance sampling and the kernel vector machine. The CVM algorithm achieves the best performance among the tested models, and with a smaller data volume the boundary region importance sampling method yields more information content, which improves the performance of the assessment model. The training sample set generation method provided by the invention can be applied to other data mining techniques, and the proposed assessment model can also address other power system security problems and be applied to actual power system operation.

Claims (5)

1. A boundary region importance sampling method, comprising the following steps:
step 1: determining a boundary region using information entropy;
step 2: constructing an efficient sampling using a sampling method based on the Monte Carlo variance reduction technique MCVR, wherein a bias is introduced in the sampling process so that the representation of rare events in the evaluation stage is increased;
through the above steps, an offline training sample set is generated efficiently.
2. The boundary region importance sampling method according to claim 1, wherein in step 1 the boundary region is determined using the information entropy shown in formula (5):
$E(S) = -\sum_{i=1}^{C} p_i \log_2 p_i \quad (5)$
wherein: S is the sample data set, C is the number of classes, and p_i is the proportion of S classified as class i; the entropy gives a measure of the purity of the sample database; the larger the value of E(S), the lower the purity, i.e. the richer the information content; the boundary region is therefore defined as the region where the entropy is relatively large, and is determined in this way.
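A minimal sketch of the entropy criterion of claim 2, assuming the usual Shannon entropy with a base-2 logarithm and a handful of hypothetical regions of the operating space, each holding stable (1) and unstable (0) labels. The region names and labels are invented for illustration; the region with the largest E(S) is taken as the boundary region.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy E(S) = -sum_i p_i * log2(p_i) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


# Hypothetical regions with stable (1) / unstable (0) labels.
regions = {
    "deep stable":   [1, 1, 1, 1, 1, 1, 1, 1],
    "near boundary": [1, 0, 1, 0, 0, 1, 1, 0],
    "deep unstable": [0, 0, 0, 0, 0, 0, 0, 1],
}
scores = {name: entropy(labels) for name, labels in regions.items()}
boundary = max(scores, key=scores.get)  # region with the largest entropy
print(scores, boundary)
```

A pure region has zero entropy and carries little information about where the stability limit lies, while an evenly mixed two-class region reaches the maximum of one bit.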
3. The boundary region importance sampling method according to claim 1 or 2, characterized in that in step 2 an efficient sampling is constructed using a sampling method based on the Monte Carlo variance reduction technique MCVR, wherein a bias is introduced in the sampling process so that the representation of rare events in the evaluation stage is increased; the boundary region having been determined in step 1, the sampling method based on MCVR biases the sampling process toward the boundary region.
4. The method according to claim 3, characterized in that step 2 specifically comprises the following steps:
1) Variance reduction by importance sampling:
the probability of an unacceptable event, i.e. P(Y is an unacceptable event), is defined as shown in equation (6):
wherein: y = t is a threshold and Y < t indicates an unacceptable event; the indicator function I(Y) is defined as shown in equation (7):
equation (6) can thus be rewritten as shown in equation (8):
the expectation in equation (8) gives a crude Monte Carlo estimate, where y_i is a Monte Carlo sample drawn from the distribution f(y); this estimate has a variance associated with it, since the value of h(y_i) varies with y_i; the variance of the estimate is reduced by reformulating the expectation as shown in equation (9):
wherein: y_i is a Monte Carlo sample drawn from the distribution g(y), which ensures that the value of h(y_i) f(y_i)/g(y_i) is almost constant with respect to y_i;
2) Efficient generation of training samples:
the first stage provides the boundary region in which X is most likely to occur, and thus determines the region of the X-space toward which the samples are to be biased; in terms of the indicator function, the sampling region is as shown in equation (10):
wherein: S is the boundary region; in the univariate case, S = {x : x_1 ≤ x ≤ x_2}; the sampling distribution function g_X(x) can be constructed as |h(x)| f_X(x), i.e. f_X(x) restricted to S, and the importance sampling density is expressed as shown in formula (11):
wherein: k_1 and k_2 are bias factors satisfying the probability condition k_1 + k_2 = 1, f_1X(x) is the probability density function of the boundary region, and f_2X(x) is the probability density function outside the boundary region; when k_1 = 1, the sampling distribution function g_X(x) is completely biased toward the boundary region, and the state-space probability distribution conditioned on the boundary region is as shown in formulas (12) and (13):
$a = \int_S f_X(x)\,dx \quad (13)$
wherein: a is a scaling factor satisfying 0 ≤ a ≤ 1; changing the probability distribution as described by formulas (12) and (13) causes more data to come from the boundary region, whereby the offline training sample set is obtained.
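The relations referred to in this claim follow the standard importance sampling construction. As a sketch under that assumption, and using only the symbols defined above, equations (6) to (12) could be written as follows. Taking h equal to the indicator function and writing the density conditioned on S as f_{1X} are choices made here for illustration and are not confirmed by the source; a is the scaling factor of formula (13).

```latex
\begin{align}
P(Y < t) &= \int_{-\infty}^{t} f(y)\,dy \tag{6}\\
I(Y) &= \begin{cases} 1, & Y < t \\ 0, & \text{otherwise} \end{cases} \tag{7}\\
P(Y < t) &= \int_{-\infty}^{\infty} I(y)\,f(y)\,dy = E_f[I(Y)]
  \approx \frac{1}{n}\sum_{i=1}^{n} h(y_i),
  \quad y_i \sim f(y),\ h \equiv I \tag{8}\\
E_f[h(Y)] &= \int h(y)\,\frac{f(y)}{g(y)}\,g(y)\,dy
  \approx \frac{1}{n}\sum_{i=1}^{n} \frac{h(y_i)\,f(y_i)}{g(y_i)},
  \quad y_i \sim g(y) \tag{9}\\
h(x) &= \begin{cases} 1, & x \in S \\ 0, & \text{otherwise} \end{cases} \tag{10}\\
g_X(x) &= k_1 f_{1X}(x) + k_2 f_{2X}(x), \quad k_1 + k_2 = 1 \tag{11}\\
f_{1X}(x) &= \begin{cases} f_X(x)/a, & x \in S \\ 0, & \text{otherwise} \end{cases} \tag{12}
\end{align}
```

In this form the variance reduction is visible in (9): when g(y) is roughly proportional to h(y) f(y), every term of the sum is nearly the same, so the estimate fluctuates little from one sample set to the next.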
5. The method according to claim 1, 2 or 4, characterized in that an offline training model is obtained from the sample set by the following steps:
the efficiently generated sample set is input into the training model; the kernel vector machine CVM projects the sample set S into a high-dimensional space through a feature map and establishes the minimum enclosing ball (MEB) problem, which is solved with the CVM algorithm; S_t, c_t and R_t denote the core set, the sphere center and the radius after t iterations, respectively, and the center and radius of a sphere B are denoted by c_B and r_B; given a positive number ε, the offline training process is as follows:
1) Initialization of S_0, c_0 and R_0:
an arbitrary point z ∈ S is selected to initialize S_0 = {z}; in the feature space, the point z_a ∈ S farthest from z is found, and then the point z_b ∈ S farthest from z_a is found; the initial core set is S_0 = {z_a, z_b}, and the initial sphere center c_0 and the initial radius R_0 are determined from z_a and z_b;
2) If no mapped sample point falls outside the (1+ε)-sphere, i.e. the sphere with center c_t and radius (1+ε)R_t, the algorithm ends; otherwise the core set is updated to S_{t+1} = S_t ∪ {z}, where z is the point whose image is farthest from c_t;
3) Searching for the new minimum enclosing ball MEB:
the new MEB(S_{t+1}) is computed for the core set given by step 2), and c_{t+1} = c_{MEB(S_{t+1})} and R_{t+1} = r_{MEB(S_{t+1})} are obtained from the Lagrange multipliers α = [α_1, α_2, ..., α_m]' and the kernel matrix K; the procedure then returns to step 2) for the next iteration;
through the above steps, the offline training model is obtained.
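The core-set iteration of claim 5 can be sketched as follows. This is not the patented implementation: the Gaussian kernel, the function names, and the inner tolerance ε/2 are assumptions, and the quadratic program that yields the Lagrange multipliers is replaced here by Frank-Wolfe updates with exact line search. The center is represented implicitly as c = Σ α_i φ(x_i), and the radius is taken as R² = α' diag(K) − α' K α, as is common in the Core Vector Machine literature.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    # Gaussian kernel; the kernel and gamma are assumptions of this sketch.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dists2_and_r2(K, core, alpha, idxs):
    # Squared feature-space distances from points idxs to the center
    # c = sum_i alpha_i * phi(x_i), and R^2 = alpha'diag(K) - alpha'K alpha.
    pairs = list(zip(core, alpha))
    aKa = sum(a * b * K[p][q] for p, a in pairs for q, b in pairs)
    r2 = sum(a * K[p][p] for p, a in pairs) - aKa
    d2 = [K[j][j] - 2.0 * sum(a * K[p][j] for p, a in pairs) + aKa for j in idxs]
    return d2, r2


def solve_core_meb(K, core, alpha, eps, max_iter=5000):
    # Approximate MEB of the core set by Frank-Wolfe steps with exact line
    # search (used here in place of a QP solver).
    for _ in range(max_iter):
        d2, r2 = dists2_and_r2(K, core, alpha, core)
        worst = max(range(len(core)), key=d2.__getitem__)
        if d2[worst] <= (1.0 + eps / 2.0) ** 2 * max(r2, 0.0):
            break
        eta = (d2[worst] - r2) / (2.0 * d2[worst])
        alpha = [(1.0 - eta) * a for a in alpha]
        alpha[worst] += eta
    return alpha


def cvm_meb(points, eps=0.1, kernel=rbf_kernel, seed=0):
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    everyone = range(n)

    def farthest_from(i):
        return max(everyone, key=lambda j: K[i][i] - 2.0 * K[i][j] + K[j][j])

    z = random.Random(seed).randrange(n)   # step 1: arbitrary start point
    za = farthest_from(z)
    zb = farthest_from(za)
    core, alpha = [za, zb], [0.5, 0.5]     # c_0 is the midpoint of the two images
    while True:
        alpha = solve_core_meb(K, core, alpha, eps)      # step 3
        d2, r2 = dists2_and_r2(K, core, alpha, everyone)
        far = max(everyone, key=d2.__getitem__)
        # Step 2: stop when every point lies inside the (1+eps)-sphere.
        if d2[far] <= (1.0 + eps) ** 2 * r2 or far in core:
            return core, alpha, math.sqrt(max(r2, 0.0)), math.sqrt(max(d2[far], 0.0))
        core.append(far)
        alpha.append(0.0)


rng = random.Random(1)
pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]  # hypothetical data
core, alpha, R, far_dist = cvm_meb(pts, eps=0.1)
print(len(core), round(R, 3), round(far_dist, 3))
```

The point of the construction is that only the core set enters the inner optimisation, so its cost depends on the size of the core set rather than on the number of training samples.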
CN202311072657.4A 2020-04-16 2020-04-16 Boundary region importance sampling method Pending CN117332322A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311072657.4A CN117332322A (en) 2020-04-16 2020-04-16 Boundary region importance sampling method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010301010.4A CN111401476B (en) 2020-04-16 2020-04-16 Transient state safety evaluation method based on boundary region importance sampling and kernel vector machine
CN202311072657.4A CN117332322A (en) 2020-04-16 2020-04-16 Boundary region importance sampling method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202010301010.4A Division CN111401476B (en) 2020-04-16 2020-04-16 Transient state safety evaluation method based on boundary region importance sampling and kernel vector machine

Publications (1)

Publication Number Publication Date
CN117332322A true CN117332322A (en) 2024-01-02

Family

ID=71431587

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202311072657.4A Pending CN117332322A (en) 2020-04-16 2020-04-16 Boundary region importance sampling method
CN202010301010.4A Active CN111401476B (en) 2020-04-16 2020-04-16 Transient state safety evaluation method based on boundary region importance sampling and kernel vector machine

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010301010.4A Active CN111401476B (en) 2020-04-16 2020-04-16 Transient state safety evaluation method based on boundary region importance sampling and kernel vector machine

Country Status (1)

Country Link
CN (2) CN117332322A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508324B (en) * 2020-10-14 2024-02-23 浙江大学 Power system characteristic value evaluation method based on complex planar regionalization
CN112583414A (en) * 2020-12-11 2021-03-30 北京百度网讯科技有限公司 Scene processing method, device, equipment, storage medium and product
CN113313406B (en) * 2021-06-16 2023-11-21 吉林大学 Power battery safety risk assessment method for electric automobile operation big data
CN113484573B (en) * 2021-07-14 2023-03-07 国家电网有限公司 Abnormal electricity utilization monitoring method based on energy data analysis

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289842A (en) * 2011-06-13 2011-12-21 天津大学 Monte Carlo integrated illumination adaptive method
US10452793B2 (en) * 2014-08-26 2019-10-22 International Business Machines Corporation Multi-dimension variable predictive modeling for analysis acceleration
CN108183499B (en) * 2016-12-08 2021-05-28 南京理工大学 Static security analysis method based on Latin hypercube sampling probability trend
US11403554B2 (en) * 2018-01-31 2022-08-02 The Johns Hopkins University Method and apparatus for providing efficient testing of systems by using artificial intelligence tools
CN109102146B (en) * 2018-06-29 2021-10-29 清华大学 Electric power system risk assessment acceleration method based on multi-parameter linear programming
CN109492851B (en) * 2018-09-06 2021-11-30 国网浙江省电力有限公司经济技术研究院 Grid frame margin evaluation method based on load growth uncertainty of different regions
CN110442941B (en) * 2019-07-25 2022-04-29 桂林电子科技大学 Battery state and RUL prediction method based on particle filtering and process noise fusion
CN110264116A (en) * 2019-07-31 2019-09-20 三峡大学 A kind of Electrical Power System Dynamic safety evaluation method explored based on relationship with regression tree

Also Published As

Publication number Publication date
CN111401476B (en) 2023-11-03
CN111401476A (en) 2020-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination