CN109871860B - Daily load curve dimension reduction clustering method based on kernel principal component analysis - Google Patents


Info

Publication number
CN109871860B
CN109871860B · Application CN201811302328.3A
Authority
CN
China
Prior art keywords
clustering
matrix
daily load
kernel
principal component
Prior art date
Legal status
Active
Application number
CN201811302328.3A
Other languages
Chinese (zh)
Other versions
CN109871860A (en)
Inventor
李欣然
何聪
钟伟
宋军英
汤杰
徐飘
Current Assignee
Hunan University
Original Assignee
Hunan University
Priority date
Filing date
Publication date
Application filed by Hunan University
Priority to CN201811302328.3A (filed 2018-11-02)
Publication of application: CN109871860A (2019-06-11)
Application granted
Publication of granted patent: CN109871860B (2022-12-13)
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a daily load curve dimensionality reduction clustering method based on kernel principal component analysis. First, a kernel function maps the preprocessed daily load data matrix nonlinearly into a high-dimensional space to obtain a high-dimensional kernel matrix. The kernel matrix is then corrected and eigendecomposed to obtain the corresponding eigenvalues and unitized eigenvectors; an extraction efficiency is set according to the descending trend of the eigenvalues, which yields the principal components and the number of dimension reduction indexes. Next, the eigenvalues are taken as weights and normalized so that they sum to 1, giving a weight vector, and the projection of the corrected Gaussian kernel matrix on the principal components is taken as the dimension-reduced data matrix. Finally, the dimension-reduced data matrix and the weight vector are used as the input of a weighted clustering algorithm to cluster the daily load curves, and the optimal clustering number and clustering result are determined based on the Silhouette index. The method improves clustering quality while improving computational efficiency; the clustering results accord with reality and the method has engineering value.

Description

Daily load curve dimension reduction clustering method based on kernel principal component analysis
Technical Field
The invention belongs to the technical field of analysis and control of power systems, and mainly relates to a daily load curve dimension reduction clustering method based on kernel principal component analysis.
Background
Daily load curve clustering is the basis of big data mining for power distribution and utilization, and provides guidance for load forecasting, power grid planning and demand-side response. With the continuous advance of the smart grid, the informatization of the power system keeps improving: electricity consumption information acquisition systems, distribution network GIS systems, distribution automation systems and the like are gradually being completed, and distribution and utilization data exhibit big-data characteristics such as large volume, many types and rapid growth. Adopting effective data mining techniques to finely segment massive users of different types under this big-data background, so as to uncover the internal relations among different load types and the corresponding electricity consumption behaviors and characteristics, is of great significance to grid companies and power users.
In the traditional daily load curve clustering method, the power values at the sampling time points of each daily load curve are normalized by their maximum value, and the curves are then clustered with algorithms such as K-means and fuzzy C-means using the Euclidean distance as the similarity criterion. This approach has two disadvantages: 1) for time-series load curves, the similarity between curves is influenced by many factors such as temperature and climate, income and electricity price policy, and cannot be fully reflected by distance alone; moreover, in high dimensions the distances between curves with distinct load shapes, such as daily load curves, tend to become nearly equal, so that load curves are easily mis-assigned; 2) as the volume of load data keeps growing, the method faces a huge challenge in computational efficiency.
The indirect clustering method can effectively overcome both shortcomings of the traditional method: the daily load curves are first reduced in dimension, and the extracted dimension reduction indexes are used as the input of the clustering algorithm. Two important problems remain: 1) determining the dimension reduction indexes and their number; 2) determining the weights of the dimension reduction indexes. Selecting an appropriate number of dimension reduction indexes and configuring the weight of each index reasonably can greatly improve the accuracy and efficiency of daily load curve clustering.
Disclosure of Invention
The technical problem solved by the invention is to provide a daily load curve dimension reduction clustering method based on kernel principal component analysis, aimed at the problems of existing daily load curve clustering methods.
The technical scheme adopted by the invention is as follows:
a daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps:
step 1), preprocessing daily load power curve data: firstly, identifying and correcting data in a daily load power curve, identifying the data with the load power change rate exceeding a preset threshold value as abnormal data, correcting the curve with abnormal and missing data through a unitary three-point parabolic interpolation algorithm, then performing per-unit processing on the corrected daily load power curve data by taking the maximum power value of the daily load power curve as a reference value, and obtaining a normalized daily load power curve active power per-unit value matrix as an original data matrix;
step 2), based on the original data matrix, first selecting a Gaussian radial basis kernel function and using it to map the original data matrix nonlinearly into a high-dimensional feature space to obtain a kernel matrix, and then correcting the kernel matrix;
step 3), performing principal component analysis on the corrected kernel matrix and determining the principal components and the number of dimension reduction indexes: first performing eigenvalue decomposition to obtain the eigenvalues and eigenvectors, then obtaining the unitized eigenvectors by the Gram-Schmidt orthogonalization method and calculating the cumulative contribution rate corresponding to the eigenvalues; different fitted curves are then formed by fitting successively larger numbers of leading eigenvalues, the number of eigenvalues corresponding to the fitted curve enclosing the smallest area with the x and y axes is selected as the number of characteristic indexes and taken as the number of dimension reduction indexes, and the corresponding unitized eigenvectors are taken as the principal components;
step 4), a projection matrix obtained by projection of the corrected kernel matrix on the principal component is used as a dimensionality reduction data matrix obtained after dimensionality reduction through kernel principal component analysis, and then the eigenvalue corresponding to each principal component is used as a dimensionality reduction index weight, and meanwhile, a corresponding dimensionality reduction index weight vector is obtained;
and step 5), combining the dimension-reduced data matrix and the dimension reduction index weight vector and clustering with a weighted k-means algorithm: firstly, randomly selecting a plurality of samples in the dimension-reduced data matrix as initial clustering centers; then, combining the dimension reduction indexes and the index weight vector, dividing each sample into the cluster of the initial clustering center with the smallest weighted Euclidean distance, and updating the clustering centers according to the samples in each resulting cluster; then calculating the squared error from each sample to its clustering center, returning to the above steps for a new iteration if the squared error is smaller than that obtained after the previous iteration, and otherwise finishing the clustering; after clustering is finished, calculating the Silhouette indexes of the clustering results under different clustering numbers, and taking the clustering number with the maximum Silhouette index value and its clustering result as the best.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of 1):
step 1-1), identifying and correcting abnormal data in the daily load power curve:
firstly, abnormal data are identified: let $P_k=[p_{k,1},p_{k,2},\cdots,p_{k,m}]^T$ be the power values of a certain load curve at its sampling time points, and identify abnormal data according to formula (1):
[formula (1): load power change rate $\delta_{k,i}$ of the load curve at the i-th point; equation image not reproduced]
in the formula, $\delta_{k,i}$ is the load power change rate of the load curve at the i-th point; data whose change rate exceeds the preset threshold $\varepsilon$ are regarded as abnormal;
then, correcting, and if the data loss and abnormal quantity of a certain load curve are not less than 10%, directly deleting the load curve;
if the data missing amount and the abnormal amount of a certain load curve are lower than 10%, interpolation fitting is carried out through a unitary three-point parabolic interpolation algorithm to obtain the values of the abnormal amount and the missing amount again, and the unitary three-point parabolic interpolation algorithm comprises the following steps:
let the n nodes $x_i$ ($i=0,1,\cdots,n-1$) have function values $y_i=f(x_i)$, with $x_0<x_1<\cdots<x_{n-1}$ and corresponding function values $y_0<y_1<\cdots<y_{n-1}$; to calculate the approximate function value $z=f(t)$ at a specified interpolation point $t$, the 3 nodes closest to $t$ are selected, $x_{k-1}$, $x_k$, $x_{k+1}$ ($x_k<t<x_{k+1}$), and the value of $z$ is then calculated according to the parabolic interpolation formula (2), i.e.
$$z=\sum_{i=m}^{m+2}\Big(\prod_{j=m,\,j\neq i}^{m+2}\frac{t-x_j}{x_i-x_j}\Big)\,y_i \qquad (2)$$
in the formula, $m=k-1$ when $|x_k-t|<|t-x_{k+1}|$, and $m=k$ when $|x_k-t|>|t-x_{k+1}|$;
if the interpolation point t is not in the interval containing n nodes, only 2 nodes at one end of the interval are selected for linear interpolation;
step 1-2), per-unit processing is carried out on the corrected daily load power curve data:
let $P_k=[p_{k1},\ldots,p_{ki},\ldots,p_{km}]\in R^{1\times m}$ be the m-point original active power matrix of the corrected k-th daily load power curve, where $k=1,2,\ldots,N$, N is the total number of daily load power curves, $p_{ki}$ is the original active power at the i-th point of the k-th daily load power curve, $i=1,2,\ldots,m$, and m is the number of sampling points; then $P=[P_1,\ldots,P_k,\ldots,P_N]^T\in R^{N\times m}$ is the m-point original active power matrix of the N daily load power curves, where $R^{1\times m}$ denotes a real matrix with 1 row and m columns and $R^{N\times m}$ a real matrix with N rows and m columns;
taking the maximum power value of the daily load power curve, $p_{k\cdot\max}=\max\{p_{k1},p_{k2},\ldots,p_{ki},\ldots,p_{km}\}$, as the reference value, the original data samples are converted to per-unit values according to formula (3),
$$p'_{ki}=p_{ki}/p_{k\cdot\max} \qquad (3)$$
obtaining the normalized active power per-unit value matrix of each daily load power curve:
$P'_k=[p'_{k1},p'_{k2},\ldots,p'_{ki},\ldots,p'_{km}]\in R^{1\times m}$; the resulting matrix is denoted $A\in R^{N\times m}$, where N is the number of daily load curves.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of 2):
step 2-1), carrying out nonlinear mapping on the original data matrix A to a high-dimensional space to obtain a kernel matrix K:
let vectors $x_i\in R^{1\times m}$ and $y_j\in R^{1\times m}$ ($i\neq j$) be the per-unit data of any two daily load curves in the original data matrix A; the matrix A is mapped nonlinearly by the Gaussian radial basis kernel function according to formula (4) to obtain the kernel matrix K;
$$K(x_i,y_j)=\exp\!\Big(-\frac{\lVert x_i-y_j\rVert^2}{2\sigma^2}\Big) \qquad (4)$$
where $\sigma$ is the width parameter of the Gaussian kernel;
step 2-2), correcting the kernel matrix K:
the kernel matrix K is corrected according to the following formula (5):
$$K'=K-l_N K-K\,l_N+l_N K\,l_N \qquad (5)$$
in formula (5), $l_N$ is the $N\times N$ matrix $\frac{1}{N}(l_{i,j})$ with $l_{i,j}=1$ for all $i,j$, i.e. every element of $l_N$ equals $1/N$; m is the daily load curve dimension, i.e. the number of sampling points, and N is the number of daily load curves.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of 3):
the method for performing principal component analysis on the corrected kernel matrix K comprises the following steps:
a) Performing eigenvalue decomposition on the corrected kernel matrix K to obtain the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and the corresponding eigenvectors $v_1,v_2,\cdots,v_N$;
b) Sorting the eigenvalues in descending order and reordering the corresponding eigenvectors accordingly;
c) Unitizing the eigenvectors by the Gram-Schmidt orthogonalization method to obtain $\alpha_1,\alpha_2,\cdots,\alpha_N$.
The method for determining the principal component components and the number of the dimensionality reduction indexes comprises the following steps:
calculating the cumulative contribution rates $B_1,B_2,\cdots,B_N$ corresponding to the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and setting the extraction efficiency $\eta$; if the cumulative contribution rate corresponding to the t-th eigenvalue is not smaller than the set extraction efficiency, i.e. $B_t\geq\eta$, then t is taken as the number of dimension reduction indexes, and the unitized eigenvectors $\alpha_1,\alpha_2,\cdots,\alpha_t$ corresponding to $\lambda_1,\lambda_2,\cdots,\lambda_t$ are the principal components; the extraction efficiency $\eta$ is set as follows:
a) Set an initial value of $\eta$ such that t = 3, fit a curve y = ax + b to the first three eigenvalues $\lambda_1,\lambda_2,\lambda_3$, and calculate the area S1 of the figure enclosed by the fitted curve and the x and y axes;
b) Change the value of $\eta$ so that t = 4, 5, 6, ..., and calculate the areas S2, S3, S4, ... of the figures enclosed by the corresponding fitted curves and the x and y axes;
c) Compare the enclosed areas for the different values of t; the $\eta$ corresponding to the minimum area is the extraction efficiency, and the t satisfying $B_t\geq\eta$ at that point is the finally determined number of characteristic indexes.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of 4):
the method for obtaining the dimension reduction data matrix B comprises the following steps:
calculating the projection $Y=K\cdot\alpha$ of the corrected kernel matrix K on the principal components, where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_t)$; the obtained projection matrix Y is the dimension-reduced data matrix B obtained after the data are reduced in dimension by kernel principal component analysis;
the method for determining the weight vector of the dimensionality reduction index comprises the following steps:
let the eigenvalues corresponding to the principal components be $\lambda_1,\lambda_2,\cdots,\lambda_t$; the weight of each eigenvalue is determined according to formula (6),
$$w_i=\frac{\lambda_i}{\sum_{j=1}^{t}\lambda_j} \qquad (6)$$
the weight coefficient $w_i$ of each index calculated from formula (6) determines the dimension reduction index weight vector $W=[w_1,w_2,\cdots,w_t]$.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of 5):
the clustering step by the weighted k-means algorithm is as follows:
a) Initialization: the initialization comprises initialization of an initial clustering number k, an iteration number n and an initial clustering center;
b) Sample classification: k samples are selected at random from the dimension-reduced data matrix B as the initial clustering centers $c_1,c_2,\cdots,c_k$, $c_j=[c_{j,1},c_{j,2},\cdots,c_{j,t}]$;
let $y_i=[y_{i,1},y_{i,2},\cdots,y_{i,t}]$ be the i-th sample in the dimension-reduced data matrix B; its weighted Euclidean distance to the j-th clustering center $c_j$ is calculated according to formula (7), and every sample i is assigned to the clustering center $c_j$ with the smallest weighted Euclidean distance,
$$d_w(y_i,c_j)=\sqrt{\sum_{l=1}^{t}w_l\,(y_{i,l}-c_{j,l})^2} \qquad (7)$$
c) Updating the clustering centers: let $n_j$ be the number of samples of class j obtained in step b) and $y_{i,j}$ be the i-th sample in the j-th class; the clustering center of every class is updated according to formula (8);
$$c_j=\frac{1}{n_j}\sum_{i=1}^{n_j}y_{i,j} \qquad (8)$$
d) Iterative calculation: let the current iteration number be t; the squared error J(t) from all samples in matrix B to their clustering centers is calculated according to formula (9) and compared with the previous error J(t-1); if J(t) - J(t-1) < 0, jump to step b), otherwise the algorithm ends and the corresponding clustering result is output;
$$J(t)=\sum_{j=1}^{k}\sum_{y_i\in C_j}d_w(y_i,c_j)^2 \qquad (9)$$
where $C_j$ denotes the set of samples currently assigned to the j-th cluster;
the method for determining the optimal clustering number and the clustering result based on the Silhouette index comprises the following steps:
e) Setting the maximum cluster number $k_{\max}=\lfloor\sqrt{N}\rfloor$ and the minimum cluster number $k_{\min}=2$, where N is the number of samples;
f) For each clustering number k with $k_{\min}\leq k\leq k_{\max}$, obtaining the clustering result by the weighted k-means algorithm and then calculating the Silhouette index of each clustering result;
g) Comparing the Silhouette indexes under the different clustering numbers; the k value corresponding to the maximum Silhouette index is the optimal clustering number k, and the corresponding result is the optimal clustering result.
The daily load curve dimension reduction clustering method based on kernel principal component analysis comprises the following steps of:
let the N daily load curves be divided into k classes; for the i-th sample in the j-th class, define $d_a(i)$ as the average distance between sample i and the other samples of its class, where a smaller value indicates a more compact class; define $d_b(i)$ as the minimum average distance from sample i to all samples of each of the other classes, where a larger value indicates better separation between classes; the Silhouette index $I_{Sil}(i)$ of sample i is defined by formula (10), and the Silhouette index $I_{Silhouette}$ of all samples by formula (11):
$$I_{Sil}(i)=\frac{d_b(i)-d_a(i)}{\max\{d_a(i),d_b(i)\}} \qquad (10)$$
$$I_{Silhouette}=\frac{1}{N}\sum_{i=1}^{N}I_{Sil}(i) \qquad (11)$$
in formula (11), the larger $I_{Silhouette}$ is, the better the clustering quality; a negative value indicates clustering failure; the k corresponding to the maximum $I_{Silhouette}$ is the optimal clustering number, and the clustering result corresponding to that k is the optimal clustering result.
The technical effect of the invention is that the efficiency and quality of daily load curve clustering can be improved to a great extent under the background of big data. On the other hand, the method can effectively solve the problems that the number and the weight of dimension reduction indexes are difficult to determine in the dimension reduction processing process in the indirect clustering method. The clustering result is in accordance with the actual engineering, and powerful support can be provided for a power grid company to analyze the power utilization behavior of users and formulate a reasonable power utilization plan. Has good application prospect.
Drawings
FIG. 1 is a block diagram of the general concept of the method of the present invention.
FIG. 2 is a diagram showing the result of descending order of eigenvalues.
FIG. 3 is a flow chart of the weighted k-means algorithm.
Fig. 4 is a flowchart of determining the optimal cluster number and cluster result based on the Silhouette index.
Detailed Description
The daily load curve dimension reduction clustering method based on kernel principal component analysis provided by the invention is explained in detail by combining the accompanying drawings as follows:
the general idea block diagram of the invention is shown in fig. 1, and comprises the following steps:
1) Preprocessing the daily load power curve data to obtain the original data matrix $A\in R^{N\times m}$, where N is the number of daily load curves and m is the dimension, i.e. the number of sampling points;
2) Using the original data matrix A obtained in step 1), a Gaussian Radial Basis Function (RBF) kernel is selected and the matrix A is mapped nonlinearly into a high-dimensional feature space to obtain the kernel matrix K, which is then corrected;
3) Performing principal component analysis on the corrected kernel matrix K, and determining principal component components and the number of dimension reduction indexes;
4) Combining the kernel matrix K obtained in the step 2) and the principal component components determined in the step 3) to obtain a dimension reduction data matrix B, taking the characteristic values corresponding to the principal component components as dimension reduction index weights, and determining a dimension reduction index weight vector W;
5) Combining the dimension-reduced data matrix B and the dimension reduction index weight vector W obtained in step 4), clustering is performed with the weighted k-means algorithm, and the optimal clustering number and clustering result are determined based on the Silhouette index.
The step 1) comprises the following steps:
1-1) identifying and correcting abnormal data in a daily load power curve;
1-2) per unit processing is carried out on the corrected daily load power curve data;
the relevant explanation for the above steps is as follows:
the method for identifying the abnormal data in the step 1-1) specifically comprises the following steps:
Let $P_k=[p_{k,1},p_{k,2},\cdots,p_{k,m}]^T$ be the power values of a certain load curve at its sampling time points; abnormal data are identified by formula (1).
[formula (1): load power change rate $\delta_{k,i}$ of the load curve at the i-th point; equation image not reproduced]
In the formula, $\delta_{k,i}$ is the load power change rate of the load curve at the i-th point; data whose change rate exceeds the preset threshold $\varepsilon$ are regarded as abnormal. Without loss of generality, $\varepsilon$ may be taken as 0.5 to 0.8.
The method for correcting the abnormal data in the step 1-1) comprises the following specific steps:
if the data loss amount and the abnormal amount of a certain load curve reach 10% or more, the curve is determined to be invalid and the load curve is directly deleted.
If the data missing amount and the abnormal amount of a certain load curve are lower than 10%, setting the values of the missing amount and the abnormal amount to be 0, and then carrying out interpolation fitting on the missing amount and the abnormal amount which are set to be 0 by using a unitary three-point parabolic interpolation algorithm to obtain a normal value again. The principle of the unitary three-point parabolic interpolation algorithm is as follows:
Let the n nodes $x_i$ ($i=0,1,\cdots,n-1$) have function values $y_i=f(x_i)$, with $x_0<x_1<\cdots<x_{n-1}$ and corresponding function values $y_0<y_1<\cdots<y_{n-1}$. To calculate the approximate function value $z=f(t)$ at a specified interpolation point $t$, the 3 nodes closest to $t$ are selected, $x_{k-1}$, $x_k$, $x_{k+1}$ ($x_k<t<x_{k+1}$), and the value of $z$ is then calculated according to the parabolic interpolation formula (2), i.e.
$$z=\sum_{i=m}^{m+2}\Big(\prod_{j=m,\,j\neq i}^{m+2}\frac{t-x_j}{x_i-x_j}\Big)\,y_i \qquad (2)$$
In the formula, $m=k-1$ when $|x_k-t|<|t-x_{k+1}|$, and $m=k$ when $|x_k-t|>|t-x_{k+1}|$.
If the interpolation point t is not in the interval containing n nodes, only 2 nodes at one end of the interval are selected for linear interpolation.
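For illustration, a Python sketch of the unitary three-point parabolic interpolation described above is given below; the NumPy dependency, the function name and the handling of the node window index are assumptions of this sketch, not part of the patent's specification.

```python
import numpy as np

def three_point_parabolic_interp(x, y, t):
    """Approximate f(t) by a Lagrange quadratic through the three nodes
    nearest to t; outside the node interval, fall back to linear
    interpolation on the two end nodes, as described above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if t <= x[0] or t >= x[-1]:
        # interpolation point outside the node interval: linear interpolation
        i = 0 if t <= x[0] else n - 2
        return y[i] + (y[i + 1] - y[i]) * (t - x[i]) / (x[i + 1] - x[i])
    k = int(np.searchsorted(x, t))                 # x[k-1] < t <= x[k]
    # shift the 3-node window toward whichever neighbouring node is closer to t
    m = k - 2 if abs(t - x[k - 1]) < abs(x[k] - t) else k - 1
    m = min(max(m, 0), n - 3)
    z = 0.0
    for i in range(m, m + 3):                      # Lagrange quadratic over x[m..m+2]
        num = den = 1.0
        for j in range(m, m + 3):
            if j != i:
                num *= t - x[j]
                den *= x[i] - x[j]
        z += y[i] * num / den
    return z
```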
The method for performing per unit processing on the corrected daily load power curve data in the step 1-2) specifically includes:
Let $P_k=[p_{k1},\ldots,p_{ki},\ldots,p_{km}]\in R^{1\times m}$ be the m-point original active power matrix of the corrected k-th daily load power curve, where $k=1,2,\ldots,N$, N is the total number of daily load power curves, $p_{ki}$ is the original active power at the i-th point of the k-th daily load power curve, $i=1,2,\ldots,m$, and m is the number of sampling points, i.e. the dimension, generally 48; then $P=[P_1,\ldots,P_k,\ldots,P_N]^T\in R^{N\times m}$ is the m-point original active power matrix of the N daily load power curves, where $R^{1\times m}$ denotes a real matrix with 1 row and m columns and $R^{N\times m}$ a real matrix with N rows and m columns;
taking the maximum power value of the daily load power curve, $p_{k\cdot\max}=\max\{p_{k1},p_{k2},\ldots,p_{ki},\ldots,p_{km}\}$, as the reference value, the original data samples are converted to per-unit values according to formula (3),
$$p'_{ki}=p_{ki}/p_{k\cdot\max} \qquad (3)$$
obtaining the normalized active power per-unit value matrix of each daily load power curve, $P'_k=[p'_{k1},p'_{k2},\ldots,p'_{ki},\ldots,p'_{km}]\in R^{1\times m}$; the resulting matrix is denoted $A\in R^{N\times m}$.
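A minimal Python sketch of the whole preprocessing of step 1) follows (anomaly flagging by change rate, repair of bad points, and per-unit normalization by the daily maximum). The default threshold value, the function name and the use of np.interp as a stand-in for the parabolic interpolation sketched above are assumptions of this sketch.

```python
import numpy as np

def preprocess_daily_loads(P_raw, eps=0.5, max_bad_ratio=0.10):
    """P_raw: (N, m) array of raw active power, NaN marking missing points.
    Returns the per-unit original data matrix A, dropping curves whose
    missing/abnormal points reach max_bad_ratio of the samples."""
    cleaned = []
    for row in P_raw:
        bad = np.isnan(row)
        # flag points whose relative change rate versus the previous point exceeds eps
        rate = np.abs(np.diff(row)) / np.maximum(np.abs(row[:-1]), 1e-9)
        bad[1:] |= rate > eps
        if bad.mean() >= max_bad_ratio:
            continue                                  # too many bad points: discard the curve
        idx = np.arange(len(row))
        repaired = row.copy()
        # np.interp stands in here for the parabolic interpolation sketched earlier
        repaired[bad] = np.interp(idx[bad], idx[~bad], row[~bad])
        cleaned.append(repaired / repaired.max())     # per-unit by the daily maximum, formula (3)
    return np.vstack(cleaned)                         # original data matrix A
```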
2) Using the original data matrix A obtained in step 1), a Gaussian Radial Basis Function (RBF) kernel is selected and the matrix A is mapped nonlinearly into a high-dimensional feature space to obtain the kernel matrix K, which is then corrected; step 2) comprises the following steps:
2-1) carrying out nonlinear mapping on the original data matrix A to a high-dimensional space to obtain a kernel matrix K;
2-2) correcting the kernel matrix K;
the above steps are explained in relation to the following:
the method for performing nonlinear mapping on the original data matrix A to the high-dimensional space in the step 2-1) to obtain the kernel matrix K specifically comprises the following steps:
Let vectors $x_i\in R^{1\times m}$ and $y_j\in R^{1\times m}$ ($i\neq j$) be the per-unit data of any two daily load curves in the original data matrix A; the matrix A is mapped nonlinearly by the Gaussian Radial Basis (RBF) kernel function according to formula (4) to obtain the kernel matrix K.
$$K(x_i,y_j)=\exp\!\Big(-\frac{\lVert x_i-y_j\rVert^2}{2\sigma^2}\Big) \qquad (4)$$
where $\sigma$ is the width parameter of the Gaussian kernel.
The method for correcting the kernel matrix K in the step 2-2) comprises the following steps:
the kernel matrix K is corrected in accordance with the following formula (5).
$$K'=K-l_N K-K\,l_N+l_N K\,l_N \qquad (5)$$
In formula (5), $l_N$ is the $N\times N$ matrix $\frac{1}{N}(l_{i,j})$ with $l_{i,j}=1$ for all $i,j$, i.e. every element of $l_N$ equals $1/N$; m is the daily load curve dimension and N is the number of daily load curves.
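The kernel construction and correction of step 2) can be sketched in Python as follows; the kernel width sigma is left as a free parameter because its value is not fixed here, and the reading of correction formula (5) as the standard KPCA centering is an assumption of this sketch.

```python
import numpy as np

def rbf_kernel_matrix(A, sigma=1.0):
    """Gaussian radial basis kernel matrix of the per-unit data matrix A (N, m)."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T    # pairwise squared Euclidean distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def center_kernel_matrix(K):
    """Correction of the kernel matrix: K' = K - LK - KL + LKL,
    with L the N x N matrix whose every entry is 1/N."""
    N = K.shape[0]
    L = np.full((N, N), 1.0 / N)
    return K - L @ K - K @ L + L @ K @ L
```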
3) Performing principal component analysis on the corrected kernel matrix K, and determining principal component components and the number of dimension reduction indexes;
the method for performing principal component analysis on the corrected kernel matrix K in the step 3) comprises the following steps:
a) Performing eigenvalue decomposition on the corrected kernel matrix K to obtain the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and the corresponding eigenvectors $v_1,v_2,\cdots,v_N$;
b) Sorting the eigenvalues in descending order (FIG. 2 shows the result of sorting the eigenvalues in descending order) and reordering the corresponding eigenvectors accordingly;
c) Unitizing the eigenvectors by the Gram-Schmidt orthogonalization method to obtain $\alpha_1,\alpha_2,\cdots,\alpha_N$.
The method for determining the number of the principal component components and the dimensionality reduction indexes in the step 3) comprises the following steps:
calculating the cumulative contribution rates $B_1,B_2,\cdots,B_N$ corresponding to the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and setting the extraction efficiency $\eta$; if $B_t\geq\eta$, the number of dimension reduction indexes is set to t, and the unitized eigenvectors $\alpha_1,\alpha_2,\cdots,\alpha_t$ corresponding to $\lambda_1,\lambda_2,\cdots,\lambda_t$ are the principal components. The extraction efficiency $\eta$ is set as follows:
a) Set an initial value of $\eta$ such that t = 3, fit a curve y = ax + b to the first three eigenvalues $\lambda_1,\lambda_2,\lambda_3$, and calculate the area S1 of the figure enclosed by the fitted curve and the x and y axes;
b) Change the value of $\eta$ so that t = 4, 5, 6, ..., and calculate the areas S2, S3, S4, ... of the figures enclosed by the corresponding fitted curves and the x and y axes;
c) Compare the enclosed areas for the different values of t; the t corresponding to the minimum area is the finally determined number of characteristic indexes.
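A Python sketch of step 3) is given below: eigendecomposition of the corrected kernel matrix, cumulative contribution rates, and selection of the number of retained components t by the fitted-line area rule described above. The exact fitting details (a straight line fitted by least squares, triangle area with the axes) and the fallback value are assumptions of this sketch.

```python
import numpy as np

def select_principal_components(K_centered):
    """Return the leading eigenvalues, their eigenvectors and the cumulative
    contribution rate reached by the chosen number of components t."""
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1]                  # descending eigenvalue order
    eigvals, eigvecs = np.clip(eigvals[order], 0.0, None), eigvecs[:, order]
    contrib = np.cumsum(eigvals) / eigvals.sum()       # cumulative contribution rates B_t
    best_t, best_area = 3, np.inf
    for t in range(3, len(eigvals) + 1):
        x = np.arange(1, t + 1, dtype=float)
        a, b = np.polyfit(x, eigvals[:t], 1)           # fit y = a*x + b to the first t eigenvalues
        if a >= 0 or b <= 0:
            continue                                   # line must slope down and cross both axes
        area = -b * b / (2.0 * a)                      # triangle enclosed with the x and y axes
        if area < best_area:
            best_t, best_area = t, area
    # np.linalg.eigh already returns orthonormal (unitized) eigenvectors
    return eigvals[:best_t], eigvecs[:, :best_t], contrib[best_t - 1]
```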
4) Combining the kernel matrix K obtained in the step 2) and the principal component components determined in the step 3) to obtain a dimension reduction data matrix B, taking the characteristic values corresponding to the principal component components as dimension reduction index weights, and determining a dimension reduction index weight vector W;
the method for obtaining the dimensionality reduction data matrix B in the step 4) comprises the following steps:
Calculating the projection $Y=K\cdot\alpha$ of the corrected kernel matrix K on the extracted principal components, where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_t)$. The obtained projection matrix Y is the dimension-reduced data matrix B obtained after the data are reduced in dimension by kernel principal component analysis.
The method for determining the dimensionality reduction index weight vector in the step 4) comprises the following steps:
Let the eigenvalues corresponding to the principal components be $\lambda_1,\lambda_2,\cdots,\lambda_t$; the weight of each eigenvalue is determined according to formula (6).
$$w_i=\frac{\lambda_i}{\sum_{j=1}^{t}\lambda_j} \qquad (6)$$
The weight coefficient $w_i$ of each index can be obtained from formula (6), and thus the dimension reduction index weight vector $W=[w_1,w_2,\cdots,w_t]$ is determined.
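Step 4) then amounts to a projection and a normalization, sketched below in Python (the function name is an assumption of this sketch):

```python
import numpy as np

def reduce_and_weight(K_centered, eigvals_t, alphas_t):
    """Project the corrected kernel matrix on the retained principal components
    (Y = K · alpha) and build the index weight vector of formula (6)."""
    B = K_centered @ alphas_t          # dimension-reduced data matrix B, shape (N, t)
    W = eigvals_t / eigvals_t.sum()    # eigenvalue weights normalized to sum to 1
    return B, W
```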
5) Combining the dimensionality reduction data matrix B obtained in the step 4) and the dimensionality reduction index weight vector W, clustering by using a weighted k-means algorithm, and determining the optimal clustering number and a clustering result based on a Silhouette index;
as shown in fig. 3, the step of performing cluster analysis by using the weighted k-means algorithm in step 5) includes:
a) Initialization. The initialization comprises the initial clustering number k, the iteration number n and the initial clustering centers; the iteration number n is generally taken as 100.
b) Sample classification. k samples are selected at random from the dimension-reduced data matrix B as the initial clustering centers $c_1,c_2,\cdots,c_k$, $c_j=[c_{j,1},c_{j,2},\cdots,c_{j,t}]$.
Let $y_i=[y_{i,1},y_{i,2},\cdots,y_{i,t}]$ be the i-th sample in the dimension-reduced data matrix B; its weighted Euclidean distance to the j-th clustering center $c_j$ is calculated by formula (7), and every sample i is assigned to the clustering center $c_j$ with the smallest weighted Euclidean distance,
$$d_w(y_i,c_j)=\sqrt{\sum_{l=1}^{t}w_l\,(y_{i,l}-c_{j,l})^2} \qquad (7)$$
c) Updating the clustering centers. Let $n_j$ be the number of samples of class j obtained in step b) and $y_{i,j}$ be the i-th sample in the j-th class; the clustering center of every class is updated according to formula (8);
$$c_j=\frac{1}{n_j}\sum_{i=1}^{n_j}y_{i,j} \qquad (8)$$
d) Iterative calculation. Let the current iteration number be t; the squared error J(t) from all samples in matrix B to their clustering centers is calculated according to formula (9) and compared with the previous error J(t-1); if J(t) - J(t-1) < 0, jump back to step b), otherwise the algorithm ends and the corresponding clustering result is output.
$$J(t)=\sum_{j=1}^{k}\sum_{y_i\in C_j}d_w(y_i,c_j)^2 \qquad (9)$$
where $C_j$ denotes the set of samples currently assigned to the j-th cluster.
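The weighted k-means loop of step 5) can be sketched in Python as follows; the random initialization, the seed and the iteration cap are assumptions of this sketch, while the distance, update and stopping rules follow formulas (7) to (9) as read above.

```python
import numpy as np

def weighted_kmeans(B, W, k, n_iter=100, seed=0):
    """Cluster the reduced data B (N, t) with index weights W (t,) into k classes."""
    rng = np.random.default_rng(seed)
    N = B.shape[0]
    centers = B[rng.choice(N, size=k, replace=False)].copy()   # random initial centers
    labels = np.zeros(N, dtype=int)
    prev_J = np.inf
    for _ in range(n_iter):
        # weighted squared Euclidean distance of every sample to every center, formula (7)
        d2 = ((B[:, None, :] - centers[None, :, :]) ** 2 * W).sum(axis=2)
        labels = d2.argmin(axis=1)
        J = d2[np.arange(N), labels].sum()                      # squared error, formula (9)
        if J >= prev_J:                                         # error no longer decreases: stop
            break
        prev_J = J
        for j in range(k):                                      # center update, formula (8)
            members = B[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers, prev_J
```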
As shown in fig. 4, the method for determining the optimal clustering number and the clustering result based on the Silhouette index in step 5) includes the steps of:
a) Set the maximum and minimum cluster numbers. Without loss of generality, set $k_{\max}=\lfloor\sqrt{N}\rfloor$ and $k_{\min}=2$, where N is the number of samples.
b) For each clustering number k with $k_{\min}\leq k\leq k_{\max}$, obtain the clustering result with the weighted k-means algorithm and calculate its Silhouette index. For example, if the preceding step gives $k_{\max}=10$ and $k_{\min}=2$, then k is varied over this range, i.e. the clustering results for cluster numbers between 2 and 10 are computed and the Silhouette index of each result is calculated.
c) Compare the Silhouette indexes under the different clustering numbers. The k value for which the Silhouette index is maximum is the optimal clustering number k, and the corresponding result is the optimal clustering result.
The relevant explanation for the above steps is as follows:
the method for defining the Silhouette index in the step b) specifically comprises the following steps:
Let the N daily load curves be divided into k classes. For the i-th sample in the j-th class, define $d_a(i)$ as the average distance between sample i and the other samples of its class, where a smaller value indicates a more compact class; define $d_b(i)$ as the minimum average distance from sample i to all samples of each of the other classes, where a larger value indicates better separation between classes. The Silhouette index $I_{Sil}(i)$ of sample i is defined by formula (10), and the Silhouette index $I_{Silhouette}$ of all samples by formula (11).
$$I_{Sil}(i)=\frac{d_b(i)-d_a(i)}{\max\{d_a(i),d_b(i)\}} \qquad (10)$$
$$I_{Silhouette}=\frac{1}{N}\sum_{i=1}^{N}I_{Sil}(i) \qquad (11)$$
In formula (11), the larger $I_{Silhouette}$ is, the better the clustering quality; a negative value indicates clustering failure; the k corresponding to the maximum $I_{Silhouette}$ is the optimal clustering number.
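Finally, a Python sketch of the Silhouette-based selection of the clustering number (formulas (10)-(11) together with the sweep of k from 2 to about the square root of N); it reuses the weighted_kmeans sketch above, and the handling of empty classes and ties is an assumption of this sketch.

```python
import numpy as np

def silhouette_index(B, labels):
    """Mean Silhouette value over all samples, formulas (10)-(11)."""
    dist = np.linalg.norm(B[:, None, :] - B[None, :, :], axis=2)   # pairwise distances
    s = np.zeros(len(B))
    for i in range(len(B)):
        same = labels == labels[i]
        same[i] = False
        d_a = dist[i, same].mean() if same.any() else 0.0          # intra-class average distance
        others = [c for c in np.unique(labels) if c != labels[i]]
        d_b = min(dist[i, labels == c].mean() for c in others) if others else 0.0
        denom = max(d_a, d_b)
        s[i] = (d_b - d_a) / denom if denom > 0 else 0.0
    return s.mean()

def best_clustering(B, W):
    """Sweep the cluster number and keep the result with the largest Silhouette index."""
    k_max = max(int(np.sqrt(len(B))), 2)
    results = {k: weighted_kmeans(B, W, k) for k in range(2, k_max + 1)}
    scores = {k: silhouette_index(B, labels) for k, (labels, _, _) in results.items()}
    k_best = max(scores, key=scores.get)
    return k_best, results[k_best], scores[k_best]
```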
In the daily load curve dimensionality reduction clustering method based on kernel principal component analysis, the preprocessed daily load data matrix is first mapped nonlinearly into a high-dimensional space by a Gaussian Radial Basis Function (RBF) kernel to obtain the kernel matrix K; the kernel matrix K is then eigendecomposed to obtain the corresponding eigenvalues and unitized eigenvectors, the extraction efficiency is set according to the descending trend of the eigenvalues, and the principal components and the number of dimension reduction indexes are obtained; next, the eigenvalues are taken as the dimension reduction index weights and normalized so that they sum to 1 to obtain the weight vector W, and the projection of the kernel matrix on the principal components is taken as the dimension-reduced data matrix B; finally, the dimension-reduced data matrix B and the weight vector W are used as the input of the weighted k-means algorithm to cluster the daily load curves, and the optimal clustering number and the clustering result are determined based on the Silhouette index.

Claims (7)

1. A daily load curve dimension reduction clustering method based on kernel principal component analysis is characterized by comprising the following steps:
step 1), preprocessing daily load power curve data: firstly, identifying and correcting data in a daily load power curve, identifying the data with the load power change rate exceeding a preset threshold value as abnormal data, correcting the curve with abnormal and missing data through a unitary three-point parabolic interpolation algorithm, then performing per-unit processing on the corrected daily load power curve data by taking the maximum power value of the daily load power curve as a reference value, and obtaining a normalized daily load power curve active power per-unit value matrix as an original data matrix;
step 2), based on the original data matrix, first selecting a Gaussian radial basis kernel function and using it to map the original data matrix nonlinearly into a high-dimensional feature space to obtain a kernel matrix, and then correcting the kernel matrix;
step 3), performing principal component analysis on the corrected kernel matrix and determining the principal components and the number of dimension reduction indexes: first performing eigenvalue decomposition to obtain the eigenvalues and eigenvectors, then obtaining the unitized eigenvectors by the Gram-Schmidt orthogonalization method and calculating the cumulative contribution rate corresponding to the eigenvalues; different fitted curves are then formed by fitting successively larger numbers of leading eigenvalues, the number of eigenvalues corresponding to the fitted curve enclosing the smallest area with the x and y axes is selected as the number of characteristic indexes and taken as the number of dimension reduction indexes, and the corresponding unitized eigenvectors are taken as the principal components;
step 4), taking a projection matrix obtained by projection of the corrected kernel matrix on the principal component components as a dimensionality reduction data matrix obtained after kernel principal component analysis dimensionality reduction, then taking a characteristic value corresponding to each principal component as a dimensionality reduction index weight, and simultaneously obtaining a corresponding dimensionality reduction index weight vector;
and step 5), combining the dimension-reduced data matrix and the dimension reduction index weight vector and clustering with a weighted k-means algorithm: firstly, randomly selecting a plurality of samples in the dimension-reduced data matrix as initial clustering centers; then, combining the dimension reduction indexes and the index weight vector, dividing each sample into the cluster of the initial clustering center with the smallest weighted Euclidean distance, and updating the clustering centers according to the samples in each resulting cluster; then calculating the squared error from each sample to its clustering center, returning to the above steps for a new iteration if the squared error is smaller than that obtained after the previous iteration, and otherwise finishing the clustering; after clustering is finished, calculating the Silhouette indexes of the clustering results under different clustering numbers, and taking the clustering number with the maximum Silhouette index value and its clustering result as the best.
2. The daily load curve dimension reduction clustering method based on kernel principal component analysis as claimed in claim 1, wherein said step 1) comprises the following processes:
step 1-1), identifying and correcting abnormal data in the daily load power curve:
firstly, abnormal data are identified: let $P_k=[p_{k,1},p_{k,2},\cdots,p_{k,m}]^T$ be the power values of a certain load curve at its sampling time points, and identify abnormal data according to formula (1):
[formula (1): load power change rate $\delta_{k,i}$ of the load curve at the i-th point; equation image not reproduced]
in the formula, $\delta_{k,i}$ is the load power change rate of the load curve at the i-th point; data whose change rate exceeds the preset threshold $\varepsilon$ are regarded as abnormal;
then, correcting, and if the data loss and abnormal quantity of a certain load curve are not less than 10%, directly deleting the load curve;
if the data missing amount and the abnormal amount of a certain load curve are lower than 10%, interpolation fitting is carried out through a unitary three-point parabolic interpolation algorithm to obtain the values of the abnormal amount and the missing amount again, and the unitary three-point parabolic interpolation algorithm comprises the following steps:
let the n nodes $x_i$ have function values $y_i=f(x_i)$, where $i=0,1,\cdots,n-1$, with $x_0<x_1<\cdots<x_{n-1}$ and corresponding function values $y_0<y_1<\cdots<y_{n-1}$; to calculate the approximate function value $z=f(t)$ at a specified interpolation point $t$, the 3 nodes closest to $t$ are selected, $x_{k-1}$, $x_k$, $x_{k+1}$, where $x_k<t<x_{k+1}$, and the value of $z$ is then calculated according to the parabolic interpolation formula (2), i.e.
$$z=\sum_{i=m}^{m+2}\Big(\prod_{j=m,\,j\neq i}^{m+2}\frac{t-x_j}{x_i-x_j}\Big)\,y_i \qquad (2)$$
in the formula, $m=k-1$ when $|x_k-t|<|t-x_{k+1}|$, and $m=k$ when $|x_k-t|>|t-x_{k+1}|$;
if the interpolation point t is not in the interval containing n nodes, only 2 nodes at one end of the interval are selected for linear interpolation;
step 1-2), per unit processing is carried out on the corrected daily load power curve data:
let $P_k=[p_{k1},\ldots,p_{ki},\ldots,p_{km}]\in R^{1\times m}$ be the m-point original active power matrix of the corrected k-th daily load power curve, where $k=1,2,\ldots,N$, N is the total number of daily load power curves, $p_{ki}$ is the original active power at the i-th point of the k-th daily load power curve, $i=1,2,\ldots,m$, and m is the number of sampling points; then $P=[P_1,\ldots,P_k,\ldots,P_N]^T\in R^{N\times m}$ is the m-point original active power matrix of the N daily load power curves, where $R^{1\times m}$ denotes a real matrix with 1 row and m columns and $R^{N\times m}$ a real matrix with N rows and m columns;
taking the maximum power value of the daily load power curve, $p_{k\cdot\max}=\max\{p_{k1},p_{k2},\ldots,p_{ki},\ldots,p_{km}\}$, as the reference value, the original data samples are converted to per-unit values according to formula (3),
$$p'_{ki}=p_{ki}/p_{k\cdot\max} \qquad (3)$$
obtaining the normalized active power per-unit value matrix of each daily load power curve:
$P'_k=[p'_{k1},p'_{k2},\ldots,p'_{ki},\ldots,p'_{km}]\in R^{1\times m}$; the resulting matrix is denoted $A\in R^{N\times m}$, where N is the number of daily load curves.
3. The daily load curve dimension reduction clustering method based on kernel principal component analysis as claimed in claim 1, wherein said step 2) comprises the following processes:
step 2-1), carrying out nonlinear mapping on the original data matrix A to a high-dimensional space to obtain a kernel matrix K:
let vectors $x_i\in R^{1\times m}$ and $y_j\in R^{1\times m}$ be the per-unit data of any two daily load curves in the original data matrix A, where $i\neq j$; the matrix A is mapped nonlinearly by the Gaussian radial basis kernel function according to formula (4) to obtain the kernel matrix K;
$$K(x_i,y_j)=\exp\!\Big(-\frac{\lVert x_i-y_j\rVert^2}{2\sigma^2}\Big) \qquad (4)$$
wherein $i=1,2,\cdots,N$, $j=1,2,\cdots,N$, $i\neq j$, and $\sigma$ is the width parameter of the Gaussian kernel;
step 2-2), correcting the kernel matrix K:
the kernel matrix K is corrected according to the following formula (5):
$$K'=K-l_N K-K\,l_N+l_N K\,l_N \qquad (5)$$
in formula (5), $l_N$ is the $N\times N$ matrix $\frac{1}{N}(l_{i,j})$ with $l_{i,j}=1$ for all $i,j$, i.e. every element of $l_N$ equals $1/N$; m is the daily load curve dimension, i.e. the number of sampling points, and N is the number of daily load curves.
4. The daily load curve dimension reduction clustering method based on kernel principal component analysis as claimed in claim 1, wherein said step 3) comprises the following processes:
the method for performing principal component analysis on the corrected kernel matrix K comprises the following steps:
a) Performing eigenvalue decomposition on the corrected kernel matrix K to obtain the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and the corresponding eigenvectors $v_1,v_2,\cdots,v_N$;
b) Sorting the eigenvalues in descending order and reordering the corresponding eigenvectors accordingly;
c) Unitizing the eigenvectors by the Gram-Schmidt orthogonalization method to obtain $\alpha_1,\alpha_2,\cdots,\alpha_N$.
The method for determining the principal component components and the number of the dimensionality reduction indexes comprises the following steps:
calculating the cumulative contribution rates $B_1,B_2,\cdots,B_N$ corresponding to the eigenvalues $\lambda_1,\lambda_2,\cdots,\lambda_N$ and setting the extraction efficiency $\eta$; if the cumulative contribution rate corresponding to the t-th eigenvalue is not smaller than the set extraction efficiency, i.e. $B_t\geq\eta$, then t is taken as the number of dimension reduction indexes, and the unitized eigenvectors $\alpha_1,\alpha_2,\cdots,\alpha_t$ corresponding to $\lambda_1,\lambda_2,\cdots,\lambda_t$ are the principal components; the extraction efficiency $\eta$ is set as follows:
a) Set an initial value of $\eta$ such that t = 3, fit a curve y = ax + b to the first three eigenvalues $\lambda_1,\lambda_2,\lambda_3$, and calculate the area S1 of the figure enclosed by the fitted curve and the x and y axes;
b) Change the value of $\eta$ so that t = 4, 5, 6, ..., and calculate the areas S2, S3, S4, ... of the figures enclosed by the corresponding fitted curves and the x and y axes;
c) Compare the enclosed areas for the different values of t; the $\eta$ corresponding to the minimum area is the extraction efficiency, and the t satisfying $B_t\geq\eta$ at that point is the finally determined number of characteristic indexes.
5. The daily load curve dimension reduction clustering method based on kernel principal component analysis as claimed in claim 1, wherein said step 4) comprises the following processes:
the method for obtaining the dimension reduction data matrix B comprises the following steps:
calculating the projection $Y=K\cdot\alpha$ of the corrected kernel matrix K on the principal components, where $\alpha=(\alpha_1,\alpha_2,\cdots,\alpha_t)$; the obtained projection matrix Y is the dimension-reduced data matrix B obtained after the data are reduced in dimension by kernel principal component analysis;
the method for determining the weight vector of the dimensionality reduction index comprises the following steps:
let the eigenvalues corresponding to the principal components be $\lambda_1,\lambda_2,\cdots,\lambda_t$; the weight of each eigenvalue is determined according to formula (6),
$$w_i=\frac{\lambda_i}{\sum_{j=1}^{t}\lambda_j} \qquad (6)$$
the weight coefficient $w_i$ of each index calculated from formula (6) determines the dimension reduction index weight vector $W=[w_1,w_2,\cdots,w_t]$.
6. The daily load curve dimension reduction clustering method based on kernel principal component analysis as claimed in claim 1, wherein said step 5) comprises the following processes:
the clustering step by the weighted k-means algorithm is as follows:
a) Initialization: the initialization comprises initialization of an initial clustering number k, an iteration number n and an initial clustering center;
b) Sample classification: k samples are selected at random from the dimension-reduced data matrix B as the initial clustering centers $c_j=[c_{j,1},c_{j,2},\cdots,c_{j,t}]$, where $j=1,2,\cdots,k$;
let $y_i=[y_{i,1},y_{i,2},\cdots,y_{i,t}]$ be the i-th sample in the dimension-reduced data matrix B; its weighted Euclidean distance to the j-th clustering center $c_j$ is calculated according to formula (7), and every sample i is assigned to the clustering center $c_j$ with the smallest weighted Euclidean distance,
$$d_w(y_i,c_j)=\sqrt{\sum_{l=1}^{t}w_l\,(y_{i,l}-c_{j,l})^2} \qquad (7)$$
c) Updating the clustering centers: let $n_j$ be the number of samples of class j obtained in step b) and $y_{i,j}$ be the i-th sample in the j-th class; the clustering center of every class is updated according to formula (8),
$$c_j=\frac{1}{n_j}\sum_{i=1}^{n_j}y_{i,j}, \quad j=1,2,\cdots,k \qquad (8)$$
d) Iterative calculation: let the current iteration number be t; the squared error J(t) from all samples in matrix B to their clustering centers is calculated according to formula (9) and compared with the previous error J(t-1); if J(t) - J(t-1) < 0, jump to step b), otherwise the algorithm ends and the corresponding clustering result is output;
$$J(t)=\sum_{j=1}^{k}\sum_{y_i\in C_j}d_w(y_i,c_j)^2 \qquad (9)$$
where $C_j$ denotes the set of samples currently assigned to the j-th cluster;
the method for determining the optimal clustering number and the clustering result based on the Silhouette index comprises the following steps:
e) Setting the maximum cluster number $k_{\max}=\lfloor\sqrt{N}\rfloor$ and the minimum cluster number $k_{\min}=2$, where N is the number of samples;
f) For each clustering number k with $k_{\min}\leq k\leq k_{\max}$, obtaining the clustering result by the weighted k-means algorithm and then calculating the Silhouette index of each clustering result;
g) Comparing the Silhouette indexes under the different clustering numbers; the k value corresponding to the maximum Silhouette index is the optimal clustering number k, and the corresponding result is the optimal clustering result.
7. The daily load curve dimension reduction clustering method based on kernel principal component analysis according to claim 6, wherein the method for defining the Silhouette index in the step f) is as follows:
let the N daily load curves be divided into k classes; for the i-th sample in the j-th class, define $d_a(i)$ as the average distance between sample i and the other samples of its class, where a smaller value indicates a more compact class; define $d_b(i)$ as the minimum average distance from sample i to all samples of each of the other classes, where a larger value indicates better separation between classes; the Silhouette index $I_{Sil}(i)$ of sample i is defined by formula (10), and the Silhouette index $I_{Silhouette}$ of all samples by formula (11):
$$I_{Sil}(i)=\frac{d_b(i)-d_a(i)}{\max\{d_a(i),d_b(i)\}} \qquad (10)$$
$$I_{Silhouette}=\frac{1}{N}\sum_{i=1}^{N}I_{Sil}(i) \qquad (11)$$
in formula (11), the larger $I_{Silhouette}$ is, the better the clustering quality; a negative value indicates clustering failure; the k corresponding to the maximum $I_{Silhouette}$ is the optimal clustering number, and the clustering result corresponding to that k is the optimal clustering result.
CN201811302328.3A 2018-11-02 Active CN109871860B, Daily load curve dimension reduction clustering method based on kernel principal component analysis, Hunan University.