CN115618261A - Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM - Google Patents

Publication number
CN115618261A
Authority
CN
China
Prior art keywords
kelm
ssa
sample
nca
distribution network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211156808.XA
Other languages
Chinese (zh)
Inventor
汪自虎
栾宁
刘苑红
刘政生
许洪华
李蕊
瞿洪庆
陈晓勇
乐平富
邓鹏�
闫涛
韩晓慧
夏越
杜松怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Electric Power Research Institute Co Ltd CEPRI
Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
China Electric Power Research Institute Co Ltd CEPRI
Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Electric Power Research Institute Co Ltd CEPRI, Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd filed Critical China Electric Power Research Institute Co Ltd CEPRI
Priority to CN202211156808.XA priority Critical patent/CN115618261A/en
Publication of CN115618261A publication Critical patent/CN115618261A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of fault identification, and in particular to a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM. The method comprises the following steps: collecting residual current data of the photovoltaic access power distribution network in different states; extracting feature quantities from the collected residual current data; preprocessing the extracted features; screening the preprocessed features using neighborhood component analysis (NCA); training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model; and inputting the residual current feature sample to be tested into the SSA-KELM model, whose output class gives the electric leakage type of the sample. The method can accurately identify the type of electric leakage fault in a photovoltaic access power distribution network.

Description

Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM
Technical Field
The invention relates to the technical field of fault identification, in particular to a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM.
Background
In recent years, as installed photovoltaic capacity has grown, large numbers of distributed photovoltaic systems have been connected to power distribution networks. The grid-connected inverter is the key link between a photovoltaic system and the power grid, and a power-frequency transformer is usually used to provide voltage matching and electrical isolation from the grid. However, the power-frequency transformer increases the weight, volume and cost of the photovoltaic system and reduces system efficiency. A transformerless, non-isolated grid-connected inverter is small, inexpensive and structurally simple, and its most notable advantage is that it improves the efficiency of the whole system, so more and more grid-connected photovoltaic systems adopt a non-isolated access scheme.
In a power distribution network with non-isolated photovoltaic access, there is no transformer to provide isolation, so the photovoltaic cells are electrically connected to the power grid and a common-mode leakage current is generated through the parasitic capacitance between the photovoltaic panels and ground. Conventional leakage fault detection cannot distinguish this common-mode leakage current from the leakage current produced by a biological electric shock fault, so residual current protection devices frequently operate falsely and the power distribution network cannot run safely and reliably. In practice, the operating threshold of the leakage protection device is usually raised, or the device is removed altogether, which leaves potential safety hazards in the power distribution network.
At present, existing electric leakage identification methods have mainly been studied for conventional power distribution networks, and no research on electric leakage identification for photovoltaic access power distribution networks has been found.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM.
In order to achieve the purpose, the invention provides the following technical scheme:
A photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM comprises the following steps:
step 1: collecting residual current data of the photovoltaic access power distribution network in different states;
step 2: extracting feature quantities from the collected residual current data;
step 3: preprocessing the extracted features;
step 4: screening the preprocessed features using neighborhood component analysis (NCA);
step 5: training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model;
step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested.
Further, in step 1, the different states of the power distribution network include: normal operation, biological electric shock, and non-biological electric shock; in step 6, the output classes include normal operation, biological electric shock and non-biological electric shock, where non-biological electric shock denotes electric leakage of the photovoltaic equipment; the electric leakage types include biological electric shock and photovoltaic equipment electric leakage.
Further, in step 2, the extracted features span 9 dimensions: square-root amplitude p_1, absolute mean value p_2, root mean square value p_3, kurtosis index p_4, skewness index p_5, peak index p_6, waveform factor p_7, margin factor p_8 and impulse index p_9.
Further, in step 3, the preprocessing uses min-max normalization.
Further, the step 4 specifically includes:
let the preprocessed original feature set be S = {(x_i, y_i), i = 1, 2, …, n}, where x_i denotes the feature vector of the i-th sample and y_i is the label of the i-th sample; a sample is taken at random from the feature set S, and the distance between it and a neighboring sample is calculated:
Figure BDA0003859122250000021
where w_m denotes the feature weight of the m-th feature dimension; m is the feature dimension index, and r denotes the total number of feature dimensions; x_j is a neighbor sample of x_i; x_im denotes the m-th feature of the i-th sample, and x_jm denotes the m-th feature of the j-th sample;
the probability that x_j is selected as the nearest neighbor of x_i is:
Figure BDA0003859122250000022
where n is the number of samples; x_j denotes the j-th sample taken as a neighbor of x_i; p_ij = 0 when i = j; x_t denotes any sample other than x_i; k is a kernel function defined as:
Figure BDA0003859122250000023
where σ is the kernel width; the probability that x_i is correctly classified is then:
Figure BDA0003859122250000024
where y_j is the label of sample x_j; y_ij is the label-agreement indicator of samples x_i and x_j, with y_ij = 1 when y_i = y_j and y_ij = 0 otherwise;
the average value of p_i is calculated:
Figure BDA0003859122250000025
introducing a regularization term gives the following objective function:
Figure BDA0003859122250000031
where n is the number of samples; λ is the regularization parameter; r denotes the total number of feature dimensions;
the goal of neighborhood component analysis is to find the weight w that maximizes F(w); the equivalent minimization problem is expressed as:
Figure BDA0003859122250000032
Figure BDA0003859122250000033
denotes the objective function to be minimized, and F(w) denotes the objective function to be maximized;
the weight of each feature in the sample set is thereby obtained; the larger F(w) is, or equivalently the smaller
Figure BDA0003859122250000034
is, the more relevant the current feature is to the electric leakage identification model, and vice versa.
Features with large weight values are then selected according to the magnitude of the feature weights.
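The formulas of this step survive only as figure placeholders. For orientation, one common formulation of NCA-based feature weighting that agrees with the symbol definitions above can be written as follows. This is an assumed reconstruction, not the patent's own typesetting: the squared weights in the distance, the exponential form of the kernel k, and the name f(w) for the minimization objective are all assumptions.

```latex
\begin{align}
d_w(x_i, x_j) &= \sum_{m=1}^{r} w_m^{2}\,\lvert x_{im} - x_{jm} \rvert \\
p_{ij} &=
  \begin{cases}
    \dfrac{k\bigl(d_w(x_i, x_j)\bigr)}{\sum_{t \neq i} k\bigl(d_w(x_i, x_t)\bigr)}, & i \neq j \\[2ex]
    0, & i = j
  \end{cases} \\
k(z) &= \exp\!\left(-\frac{z}{\sigma}\right) \\
p_i &= \sum_{j=1}^{n} y_{ij}\, p_{ij} \\
\bar{p} &= \frac{1}{n} \sum_{i=1}^{n} p_i \\
F(w) &= \frac{1}{n} \sum_{i=1}^{n} p_i - \lambda \sum_{m=1}^{r} w_m^{2} \\
\hat{w} &= \arg\max_{w} F(w) = \arg\min_{w} f(w), \qquad f(w) = -F(w)
\end{align}
```

Under this reading, a feature whose weight w_m is driven toward zero contributes little to the distance and is a candidate for removal.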
Further, the step 5 specifically includes:
step 501: dividing the NCA-screened feature data into training data and test data;
step 502: training the kernel extreme learning machine (KELM) model with the training data;
step 503: optimizing the regularization coefficient C and the kernel function parameter σ of the KELM model with the sparrow search algorithm (SSA) to obtain the optimal parameters, and thereby establishing the SSA-KELM electric leakage identification model;
step 504: testing the constructed SSA-KELM model with the test data, taking the identification accuracy as the evaluation index.
Further, the step 502 specifically includes:
suppose the KELM model is trained with N input samples x_k = [x_k1, x_k2, …, x_km]^T and N output samples t_k = [t_k1, t_k2, t_k3]^T, where k = 1, …, N; x_k denotes the k-th input sample, x_km denotes the m-th feature of the k-th input sample, t_k denotes the k-th output sample, and t_k1 denotes the first component of the k-th output sample;
training drives the error between the predicted output and the actual output of the network toward zero, so that:
Figure BDA0003859122250000035
where β_i is the weight between the i-th hidden-layer node and the output layer; g(x_i, w_i, b_i) is the activation function; w_i is the weight between the hidden-layer node and the input layer; b_i is the threshold of the hidden-layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:
HB=T
Figure BDA0003859122250000036
where H is the hidden-layer output matrix; B = [β_1, …, β_L]^T is the matrix formed by the output weight vectors between the hidden-layer nodes and the output nodes, and β_L is the output weight vector between the L-th hidden-layer node and the output nodes; T = [t_1, …, t_N]^T is the matrix formed by the true output values of the samples;
solving for the least-squares solution gives:
Figure BDA0003859122250000041
solving with the Lagrange multiplier method gives the network output weights:
Figure BDA0003859122250000042
where C denotes the regularization coefficient, namely the penalty coefficient, and I is the identity (diagonal) matrix; the output of the ELM then follows as:
Figure BDA0003859122250000043
the kernel matrix is defined according to Mercer's condition as:
Figure BDA0003859122250000044
where K(x_i, x_k) denotes the kernel function of the i-th and k-th input samples; a radial basis function kernel is adopted:
Figure BDA0003859122250000045
where σ denotes the kernel function parameter;
the output of the KELM model then follows as:
Figure BDA0003859122250000046
where f(x) is the actual output value of the KELM model.
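Because the formulas of step 502 are present only as figure placeholders, the standard ELM-to-KELM derivation that matches the symbol definitions above is sketched here for orientation. It is an assumed reconstruction rather than the patent's own typesetting: the symbol h(x) for the hidden-layer output row vector of a sample x, the pseudo-inverse notation, and the 2σ² scaling of the radial basis kernel are assumptions. Note that a superscript T denotes transposition, while a plain T is the target matrix.

```latex
\begin{align}
\sum_{i=1}^{L} \beta_i\, g(x_k, w_i, b_i) &= t_k, \qquad k = 1, \dots, N \\
H &=
  \begin{bmatrix}
    g(x_1, w_1, b_1) & \cdots & g(x_1, w_L, b_L) \\
    \vdots & \ddots & \vdots \\
    g(x_N, w_1, b_1) & \cdots & g(x_N, w_L, b_L)
  \end{bmatrix}_{N \times L} \\
\hat{B} &= H^{\dagger}\, T \\
B &= H^{T} \left( \frac{I}{C} + H H^{T} \right)^{-1} T \\
f(x) &= h(x)\, H^{T} \left( \frac{I}{C} + H H^{T} \right)^{-1} T \\
\Omega_{ELM} &= H H^{T}, \qquad \Omega_{i,k} = h(x_i) \cdot h(x_k) = K(x_i, x_k) \\
K(x_i, x_k) &= \exp\!\left( -\frac{\lVert x_i - x_k \rVert^{2}}{2\sigma^{2}} \right) \\
f(x) &= \begin{bmatrix} K(x, x_1) & \cdots & K(x, x_N) \end{bmatrix}
        \left( \frac{I}{C} + \Omega_{ELM} \right)^{-1} T
\end{align}
```

In the kernel form the hidden-layer mapping never has to be evaluated explicitly, which is why only C and σ remain as tunable parameters.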
Further, the step 503 specifically includes:
step 5031: initializing the parameters of the sparrow search algorithm (SSA), including the sparrow population size Pop, the search space dimension d, the maximum number of iterations iter_max, the proportion SD of sparrows that are aware of danger, the proportion PD of discoverers, and the early-warning value ST;
step 5032: calculating the fitness value of each sparrow, and finding the position X_best corresponding to the current best fitness and the position X_worst corresponding to the worst fitness;
step 5033: updating the positions of the discoverers, the joiners and the danger-aware individuals in the sparrow population according to the update rules of the sparrow search algorithm;
step 5034: recalculating the fitness value of each sparrow after the position update and comparing it with the fitness value from the previous iteration; if the new fitness is better than the original, taking the new position as the best position X_best; otherwise, keeping the original value unchanged;
step 5035: judging whether the maximum number of iterations or the required solution accuracy has been reached; if not, returning to step 5033; if so, stopping the iteration and returning the position of the sparrow with the best fitness, namely the optimal (C, σ) combination.
Further, the update rules of the sparrow search algorithm in step 5033 include:
the position update rule for the discoverers is as follows:
Figure BDA0003859122250000051
where X_{i,j}^{t+1}, shown as
Figure BDA0003859122250000052
denotes the position of the i-th sparrow in the j-th dimension at iteration t+1; X_{i,j}^{t}, shown as
Figure BDA0003859122250000053
denotes the position of the i-th sparrow in the j-th dimension at iteration t; t denotes the current iteration number; iter_max denotes the maximum number of iterations of the algorithm; α is a uniform random number in (0, 1]; ST ∈ [0.5, 1] and R_2 ∈ [0, 1] denote the safety value and the early-warning value, respectively; Q is a random number that follows a normal distribution; L denotes a 1 × d matrix whose elements are all 1;
the position update rule for the joiners is as follows:
Figure BDA0003859122250000054
where X_worst denotes the global worst position at iteration t; the symbol shown as
Figure BDA0003859122250000055
denotes the best position occupied by the discoverers at iteration t+1; i is the index of the sparrow in the population; A denotes a 1 × d matrix in which each element is randomly assigned the value 1 or −1, and A^+ = A^T(AA^T)^{-1}; num is the number of sparrows;
the position update rule for the danger-aware individuals is as follows:
Figure BDA0003859122250000056
where X_best, shown as
Figure BDA0003859122250000057
is the current global best position; β denotes a step-size control parameter and is a random number that follows a normal distribution with mean 0 and variance 1; K is a random number in [−1, 1] that indicates the direction in which the sparrow moves; f_i denotes the fitness value of the i-th sparrow; f_g and f_w denote the best and worst fitness values of the current sparrow population, respectively; ε is a very small constant that prevents the denominator from being zero.
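Steps 5031 to 5035 and the three update rules can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the update formulas exist only as figures in the source, so the code follows the commonly published form of the sparrow search algorithm, and the function name ssa_minimize, the default parameter values, the box clipping and the sphere test function are all assumptions. Fitness is minimized here; in the patent, each sparrow position would encode a (C, σ) pair and the fitness would be derived from KELM identification accuracy.

```python
import math
import random


def ssa_minimize(fitness, dim, bounds, pop=20, iter_max=50,
                 pd=0.2, sd=0.1, st=0.8, seed=0):
    """Minimize `fitness` inside a box with a basic sparrow search loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    eps = 1e-50  # tiny constant that keeps the denominator non-zero

    def clip(v):
        return min(max(v, lo), hi)

    # Step 5031: initialize the population and the bookkeeping values.
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    fit = [fitness(x) for x in X]
    i_best = min(range(pop), key=fit.__getitem__)
    best_x, best_f = X[i_best][:], fit[i_best]
    history = [best_f]
    n_disc = max(1, int(pd * pop))    # number of discoverers
    n_danger = max(1, int(sd * pop))  # number of danger-aware sparrows

    for _ in range(iter_max):
        # Step 5032: rank sparrows, best (lowest) fitness first.
        order = sorted(range(pop), key=fit.__getitem__)
        X = [X[i] for i in order]
        fit = [fit[i] for i in order]
        x_worst, f_g, f_w = X[-1][:], fit[0], fit[-1]

        # Step 5033, discoverers: shrink when safe, random jump when alarmed.
        r2 = rng.random()
        for i in range(n_disc):
            if r2 < st:
                alpha = 1.0 - rng.random()  # uniform in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * iter_max))
                X[i] = [clip(v * scale) for v in X[i]]
            else:
                q = rng.gauss(0.0, 1.0)
                X[i] = [clip(v + q) for v in X[i]]
        disc_fit = [fitness(X[i]) for i in range(n_disc)]
        x_p = X[min(range(n_disc), key=disc_fit.__getitem__)][:]

        # Step 5033, joiners: the worse half flies elsewhere, the rest
        # forage around the best discoverer position x_p.
        for i in range(n_disc, pop):
            if i + 1 > pop / 2:
                q = rng.gauss(0.0, 1.0)
                X[i] = [clip(q * math.exp((w - v) / (i + 1) ** 2))
                        for v, w in zip(X[i], x_worst)]
            else:
                a = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
                step = sum(abs(v - p) * s
                           for v, p, s in zip(X[i], x_p, a)) / dim
                X[i] = [clip(p + step) for p in x_p]

        # Step 5033, danger-aware sparrows (chosen at random).
        for i in rng.sample(range(pop), n_danger):
            if fit[i] > f_g:
                beta = rng.gauss(0.0, 1.0)
                X[i] = [clip(b + beta * abs(v - b))
                        for v, b in zip(X[i], best_x)]
            else:
                k = rng.uniform(-1.0, 1.0)
                X[i] = [clip(v + k * abs(v - w) / (fit[i] - f_w + eps))
                        for v, w in zip(X[i], x_worst)]

        # Step 5034: re-evaluate and keep the best position found so far.
        fit = [fitness(x) for x in X]
        i_best = min(range(pop), key=fit.__getitem__)
        if fit[i_best] < best_f:
            best_x, best_f = X[i_best][:], fit[i_best]
        history.append(best_f)

    # Step 5035: stop after iter_max iterations.
    return best_x, best_f, history


def sphere(x):
    """Toy objective standing in for the KELM-based fitness."""
    return sum(v * v for v in x)


best_x, best_f, history = ssa_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because the best position is only replaced when a strictly better one is found, the recorded best fitness never gets worse from one iteration to the next.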
Further, in step 501, the NCA-screened feature data are randomly divided into training data and test data at a ratio of 7:3.
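A random 7:3 split of this kind can be sketched as follows. The helper name split_7_3, the fixed seed and the ten placeholder samples are assumptions made for illustration only; integer arithmetic is used for the cut point to avoid floating-point rounding.

```python
import random


def split_7_3(samples, seed=0):
    """Shuffle a copy of `samples` and split it 70 % / 30 %."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer cut point
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (sample name, class label) pairs standing in for feature data.
data = [(f"sample_{i}", i % 3) for i in range(10)]
train, test = split_7_3(data)
```

Every sample ends up in exactly one of the two subsets, so the test data stay unseen during training.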
Compared with the prior art, the invention has the following beneficial effects. In the photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM, the extracted residual current signal features are screened by neighborhood component analysis (NCA); the screened features are input into a kernel extreme learning machine (KELM) model for training; the parameters of the KELM model are optimized by the sparrow search algorithm (SSA) to obtain the optimal parameters, thereby establishing the SSA-optimized KELM electric leakage identification model; and the residual current data to be tested are input into the trained model to obtain the electric leakage fault type. The method can accurately identify the electric leakage fault type of the photovoltaic access power distribution network and avoid false operation of the residual current protection device, thereby improving the power supply reliability of the photovoltaic access power distribution network and ensuring its safe and reliable operation.
Drawings
FIG. 1 is a flow chart of a method for identifying leakage of electricity in a photovoltaic access distribution network based on NCA and SSA-KELM according to the present invention;
FIG. 2 is a schematic diagram of the kernel extreme learning machine (KELM) model according to the present invention;
FIG. 3 is a schematic diagram of residual current signals under different states according to an embodiment of the present invention; wherein:
FIG. 3 (a) is a schematic diagram of the residual current signal in the normal state in this embodiment;
FIG. 3 (b) is a schematic diagram of the residual current signal during a biological electric shock in this embodiment;
FIG. 3 (c) is a schematic diagram of the residual current signal during a non-biological electric shock in this embodiment;
FIG. 4 is a schematic diagram of residual current sample feature quantities according to an embodiment of the present invention; wherein:
FIG. 4 (a) is a diagram of the feature quantity square-root amplitude p_1 in this embodiment;
FIG. 4 (b) is a diagram of the feature quantity absolute mean value p_2 in this embodiment;
FIG. 4 (c) is a diagram of the feature quantity root mean square value p_3 in this embodiment;
FIG. 4 (d) is a diagram of the feature quantity kurtosis index p_4 in this embodiment;
FIG. 4 (e) is a diagram of the feature quantity skewness index p_5 in this embodiment;
FIG. 4 (f) is a diagram of the feature quantity peak index p_6 in this embodiment;
FIG. 4 (g) is a diagram of the feature quantity waveform factor p_7 in this embodiment;
FIG. 4 (h) is a diagram of the feature quantity margin factor p_8 in this embodiment;
FIG. 4 (i) is a diagram of the feature quantity impulse index p_9 in this embodiment;
FIG. 5 is a schematic diagram of feature weights calculated by the NCA algorithm in an embodiment of the present invention;
FIG. 6 is a diagram illustrating identification results of SSA-KELM models according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, the present invention is a method for identifying leakage of a photovoltaic access distribution network based on NCA and SSA-KELM, comprising the following steps:
step 1: collecting residual current data of the photovoltaic access power distribution network in different states;
the different states of the power distribution network include: normal operation, biological electric shock, and non-biological electric shock, where non-biological electric shock means electric leakage of the photovoltaic equipment.
In this embodiment, a simulation model of the photovoltaic access power distribution network is used to obtain 390 sets of residual current data under normal operation, biological electric shock, and non-biological electric shock (photovoltaic equipment leakage). FIG. 3 shows the simulated residual current signals in the three states: FIG. 3 (a) shows the residual current signal in the normal state, FIG. 3 (b) the signal during a biological electric shock, and FIG. 3 (c) the signal during a non-biological electric shock.
Step 2: calculating and extracting the feature quantities of the collected residual current data;
the obtained residual current signal is converted into the following feature quantities: square-root amplitude p_1, absolute mean value p_2, root mean square value p_3, kurtosis index p_4, skewness index p_5, peak index p_6, waveform factor p_7, margin factor p_8 and impulse index p_9, 9 dimensions in total, as shown in FIG. 4. The calculation formula of each feature quantity is as follows:
Figure BDA0003859122250000071
Figure BDA0003859122250000072
Figure BDA0003859122250000073
Figure BDA0003859122250000074
Figure BDA0003859122250000075
Figure BDA0003859122250000076
Figure BDA0003859122250000077
Figure BDA0003859122250000078
Figure BDA0003859122250000079
In the above formulas, x denotes the sampled data of each collected residual current signal, and N is the total number of collected residual current sample points.
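The nine feature formulas are available only as figures, so the sketch below uses the definitions of these time-domain indicators that are common in the signal-analysis literature. These definitions are assumptions and may differ in detail from the patent's figures, for example in whether the mean is subtracted for the kurtosis and skewness indices. The function name time_domain_features and the synthetic sine signal are likewise illustrative.

```python
import math


def time_domain_features(x):
    """Return the nine time-domain features p1..p9 of a sampled signal."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    sqrt_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    rms = math.sqrt(sum(v * v for v in x) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    peak = max(abs(v) for v in x)
    return {
        "p1_sqrt_amplitude": sqrt_amp,
        "p2_abs_mean": abs_mean,
        "p3_rms": rms,
        "p4_kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "p5_skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "p6_peak": peak / rms,           # peak (crest) index
        "p7_waveform": rms / abs_mean,   # waveform (shape) factor
        "p8_margin": peak / sqrt_amp,    # margin (clearance) factor
        "p9_impulse": peak / abs_mean,   # impulse index
    }


# Synthetic stand-in for one residual current record: one full sine cycle.
signal = [math.sin(2 * math.pi * k / 200) for k in range(200)]
features = time_domain_features(signal)
```

For a pure sine wave these definitions give a peak index of about 1.414 and a kurtosis index of 1.5, which makes the sine a convenient sanity check before applying the function to measured data.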
Step 3: preprocessing the extracted features;
because the values of different features differ widely in scale, min-max normalization is applied to all features of the samples.
The normalization formula is as follows:
Figure BDA0003859122250000081
where x_imin and x_imax are the minimum and maximum values of the i-th sample feature, respectively, x_i is the i-th sample feature, and
Figure BDA0003859122250000082
denotes the normalized i-th sample feature.
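Min-max normalization maps each feature to the range [0, 1] by subtracting the feature's minimum and dividing by its range. A minimal sketch, assuming the samples are stored as rows of a list and each feature is a column; the function name min_max_normalize and the guard that maps a constant column to 0.0 are assumptions not stated in the source.

```python
def min_max_normalize(rows):
    """Scale every feature column of `rows` to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0  # guard constant columns
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Two features on very different scales.
raw = [[1.0, 200.0], [2.0, 400.0], [3.0, 300.0]]
scaled = min_max_normalize(raw)
# scaled == [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

After scaling, both columns contribute comparably to distance-based steps such as NCA and the radial basis kernel.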
Step 4: screening the preprocessed features using neighborhood component analysis (NCA) to eliminate redundant features that degrade identification accuracy;
from the perspective of correlation among features, NCA is used to select a highly relevant optimal feature subset from the nine-dimensional feature set, so as to further improve the identification model. NCA is a distance metric learning algorithm that selects the optimal feature subset for the electric leakage identification model by maximizing leave-one-out classification accuracy. The NCA feature selection procedure is as follows:
let the preprocessed original feature set be S = {(x_i, y_i), i = 1, 2, …, n}, where x_i denotes the feature vector of the i-th sample and y_i is the label of the i-th sample; a sample is taken at random from the feature set S, and the distance between it and a neighboring sample is calculated:
Figure BDA0003859122250000083
where w_m denotes the feature weight of the m-th feature dimension; m is the feature dimension index, and r denotes the total number of feature dimensions; x_j is a neighbor sample of x_i; x_im denotes the m-th feature of the i-th sample, and x_jm denotes the m-th feature of the j-th sample;
the probability that x_j is selected as the nearest neighbor of x_i is:
Figure BDA0003859122250000084
where n is the number of samples; x_j denotes the j-th sample taken as a neighbor of x_i; p_ij = 0 when i = j; x_t denotes any sample other than x_i; k is a kernel function defined as:
Figure BDA0003859122250000085
where σ is the kernel width; the probability that x_i is correctly classified is then:
Figure BDA0003859122250000086
where y_j is the label of sample x_j; y_ij is the label-agreement indicator of samples x_i and x_j, with y_ij = 1 when y_i = y_j and y_ij = 0 otherwise;
the average value of p_i is calculated:
Figure BDA0003859122250000091
introducing a regularization term gives the following objective function:
Figure BDA0003859122250000092
where n is the number of samples; λ is the regularization parameter; r denotes the total number of feature dimensions;
the goal of neighborhood component analysis is to find the weight w that maximizes F(w); the equivalent minimization problem is expressed as:
Figure BDA0003859122250000093
Figure BDA0003859122250000094
denotes the objective function to be minimized. The notation
Figure BDA0003859122250000095
is used to distinguish it from F(w): F(w) is the objective function to be maximized, so the weight sought is the w at which F(w) is largest. Equivalently, it is the w that minimizes the following function:
Figure BDA0003859122250000096
The weight of each feature in the sample set is thereby obtained. The larger F(w) is, or equivalently the smaller
Figure BDA0003859122250000097
is, the more relevant the current feature is to the electric leakage identification model, and vice versa.
Features with large weight values are then selected according to the magnitude of the feature weights. In this embodiment, FIG. 5 shows the calculated feature weights of all feature quantities; according to the magnitude of the weights, p_1, p_2, p_3 and p_5 are selected as the optimal feature quantities.
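To make the role of the weights concrete, the sketch below evaluates an NCA objective F(w) for two candidate weight vectors on a toy data set. It assumes the common formulation with squared weights in an L1 distance and the kernel k(z) = exp(−z/σ), since the patent's formulas are available only as figures; the function name nca_objective and the toy data are illustrative. The optimizer that searches for w (typically gradient-based) is omitted.

```python
import math


def nca_objective(X, y, w, sigma=1.0, lam=0.0):
    """Regularized leave-one-out accuracy F(w) under the assumed NCA form."""
    n = len(X)
    total = 0.0
    for i in range(n):
        same_class = all_others = 0.0
        for j in range(n):
            if j == i:
                continue  # p_ii = 0: a sample is never its own neighbor
            d = sum(wm * wm * abs(a - b) for wm, a, b in zip(w, X[i], X[j]))
            k = math.exp(-d / sigma)
            all_others += k
            if y[i] == y[j]:
                same_class += k
        total += same_class / all_others  # p_i for sample i
    return total / n - lam * sum(wm * wm for wm in w)


# Toy data: feature 0 separates the classes, feature 1 is uninformative.
X = [[0.0, 0.9], [0.1, 0.1], [0.2, 0.8], [1.0, 0.2], [0.9, 0.85], [0.8, 0.15]]
y = [0, 0, 0, 1, 1, 1]
f_informative = nca_objective(X, y, w=[3.0, 0.0])
f_noise = nca_objective(X, y, w=[0.0, 3.0])
```

Putting the weight on the discriminative feature yields the larger objective value, which is the behavior the weight ranking in FIG. 5 relies on.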
Step 5: training the kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain the SSA-KELM electric leakage identification model. Step 5 specifically comprises the following steps:
step 501: dividing the NCA-screened feature data into training data and test data; in this embodiment, the NCA-screened feature data are randomly divided into training data and test data at a ratio of 7:3;
step 502: training the kernel extreme learning machine (KELM) model with the training data; step 502 specifically includes:
the structure of the kernel extreme learning machine model is shown in FIG. 2, and its identification principle is as follows:
suppose the KELM model shown in FIG. 2 is trained with N input samples x_k = [x_k1, x_k2, …, x_km]^T and N output samples t_k = [t_k1, t_k2, t_k3]^T, where k = 1, …, N; x_k denotes the k-th input sample, x_km denotes the m-th feature of the k-th input sample, t_k denotes the k-th output sample, and t_k1 denotes the first component of the k-th output sample;
training drives the error between the predicted output and the actual output of the network toward zero, so that:
Figure BDA0003859122250000101
where β_i is the weight between the i-th hidden-layer node and the output layer; g(x_i, w_i, b_i) is the activation function; w_i is the weight between the hidden-layer node and the input layer; b_i is the threshold of the hidden-layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:
HB=T
Figure BDA0003859122250000102
where H is the hidden-layer output matrix; B = [β_1, …, β_L]^T is the matrix formed by the output weight vectors between the hidden-layer nodes and the output nodes, and β_L is the output weight vector between the L-th hidden-layer node and the output nodes; T = [t_1, …, t_N]^T is the matrix formed by the true output values of the samples;
solving for the least-squares solution gives:
Figure BDA0003859122250000103
solving with the Lagrange multiplier method gives the network output weights:
Figure BDA0003859122250000104
where C denotes the regularization coefficient, namely the penalty coefficient, and I is the identity (diagonal) matrix; the output of the ELM (extreme learning machine) then follows as:
Figure BDA0003859122250000105
the kernel matrix is defined according to Mercer's condition as:
Figure BDA0003859122250000106
where K(x_i, x_k) denotes the kernel function of the i-th and k-th input samples; the present invention adopts a radial basis function (RBF) kernel:
Figure BDA0003859122250000107
where σ denotes the kernel function parameter;
the output of the KELM model then follows as:
Figure BDA0003859122250000111
where f(x) is the actual output value of the KELM model.
The KELM model is obtained by applying a kernel function on the basis of the ELM: the ELM expression is derived first, and the kernel function is then introduced.
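The KELM training and prediction steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the closed form f(x) = [K(x, x_1), …, K(x, x_N)] (I/C + Ω)^{-1} T is the standard KELM expression, the 2σ² scaling of the RBF kernel is assumed, and the function names, the toy data and the values C = 100 and σ = 0.5 are hypothetical (in the patent, C and σ are chosen by the sparrow search algorithm).

```python
import math


def rbf(a, b, sigma):
    """Radial basis kernel, assumed form exp(-||a - b||^2 / (2 sigma^2))."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row_a[:] + row_b[:] for row_a, row_b in zip(A, B)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [v / piv for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[n:] for row in M]


def kelm_fit(X, T, C, sigma):
    """Return alpha = (I / C + Omega)^-1 T, with Omega the kernel matrix."""
    n = len(X)
    A = [[rbf(X[i], X[k], sigma) + (1.0 / C if i == k else 0.0)
          for k in range(n)] for i in range(n)]
    return solve(A, T)


def kelm_predict(x, X, alpha, sigma):
    """Return the output vector f(x) = [K(x, x_1), ..., K(x, x_N)] alpha."""
    k = [rbf(x, xi, sigma) for xi in X]
    return [sum(kv * a[c] for kv, a in zip(k, alpha))
            for c in range(len(alpha[0]))]


# Toy data: three classes with one-hot targets t_k = [t_k1, t_k2, t_k3].
X_train = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1, 2, 2]
T_train = [[1.0 if c == lab else 0.0 for c in range(3)] for lab in labels]

alpha = kelm_fit(X_train, T_train, C=100.0, sigma=0.5)
outputs = [kelm_predict(x, X_train, alpha, 0.5) for x in X_train]
predicted = [out.index(max(out)) for out in outputs]
new_out = kelm_predict([0.05, 0.05], X_train, alpha, 0.5)
new_label = new_out.index(max(new_out))
```

The predicted class is the index of the largest output component, which matches the three output categories of step 6.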
Step 503: optimizing the regularization coefficient C and the kernel function parameter σ of the KELM model with the sparrow search algorithm (SSA) to obtain the optimal parameters, and thereby establishing the SSA-KELM electric leakage identification model;
the sparrow search algorithm is introduced to optimize the regularization coefficient C and the kernel function parameter σ of the KELM so as to improve the generalization of the KELM identification model. The sparrow search algorithm is a recent swarm intelligence optimization algorithm inspired by the foraging and anti-predation behavior of sparrows in nature. The specific steps for optimizing the parameters of the KELM model with the sparrow search algorithm are as follows:
step 5031: initializing the parameters of the sparrow search algorithm (SSA), including the sparrow population size Pop, the search space dimension d, the maximum number of iterations iter_max, the proportion SD of sparrows that are aware of danger, the proportion PD of discoverers, and the early-warning value ST;
step 5032: calculating the fitness value of each sparrow, and finding the position X_best corresponding to the current best fitness and the position X_worst corresponding to the worst fitness;
step 5033: updating the positions of the discoverers, the joiners and the danger-aware individuals in the sparrow population according to the corresponding update rules of the sparrow search algorithm; the update rules of the sparrow search algorithm include:
the position update rule for the discoverers is as follows:
Figure BDA0003859122250000112
where
Figure BDA0003859122250000113
is the position of the i-th sparrow in the j-th dimension at iteration t+1;
Figure BDA0003859122250000114
is the position of the i-th sparrow in the j-th dimension at iteration t; t is the current iteration number; $iter_{max}$ is the maximum number of iterations of the algorithm; α is a uniform random number in (0, 1]; $ST \in [0.5, 1]$ and $R_2 \in [0, 1]$ are the safety value and the early warning value, respectively; Q is a random number following a normal distribution; and L is a 1 × d matrix whose elements are all 1;
the position update rule for the joiners is:
Figure BDA0003859122250000115
where $X_{worst}$ is the global worst position at iteration t;
Figure BDA0003859122250000116
is the optimal position occupied by the discoverers at iteration t+1; i represents the population size; A is a 1 × d matrix in which each element is randomly assigned the value 1 or −1, and $A^+ = A^T (A A^T)^{-1}$; num is the number of sparrows;
the position update rule for the danger-aware individuals is:
Figure BDA0003859122250000121
where
Figure BDA0003859122250000122
is the current global optimal position; β is the step-size control parameter, a random number following a normal distribution with mean 0 and variance 1; K is the direction in which the sparrow moves, a random number in [−1, 1]; $f_i$ is the fitness value of the i-th sparrow; $f_g$ and $f_w$ are the best and worst fitness values of the current sparrow population, respectively; and ε is a very small constant that prevents the denominator from being zero.
Step 5034: recalculating the fitness value of each sparrow after the position update and comparing it with the fitness value from the previous iteration; if the new fitness value is higher than the previous one, taking the new position as the best-fitness position $X_{best}$; otherwise, keeping the previous value unchanged;
Step 5035: judging whether the maximum number of iterations or the required solution accuracy has been reached; if not, returning to Step 5033; if so, stopping the iteration and returning the position of the sparrow with the best fitness, namely the optimal (C, σ) combination.
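Steps 5031 to 5035 can be sketched as follows. This is a hedged illustration, not the patent's code: the update formulas are those commonly given for the sparrow search algorithm in the literature and are assumed to match the figures above; the function name `ssa_minimize`, the default parameter values, and the clipping to box bounds are assumptions. The sketch also minimizes its fitness function (for example, a classification error rate), whereas the text speaks of a higher fitness being better, and it stops only on the iteration limit.

```python
import numpy as np

def ssa_minimize(fitness, lb, ub, pop=20, iter_max=50, PD=0.2, SD=0.1, ST=0.8, seed=0):
    """Minimise `fitness` over the box [lb, ub]; returns (best_x, best_f, history)."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    d, eps = lb.size, 1e-12
    n_disc, n_danger = max(1, int(PD * pop)), max(1, int(SD * pop))

    X = lb + rng.random((pop, d)) * (ub - lb)            # Step 5031: initial positions
    fit = np.array([fitness(x) for x in X])              # Step 5032: initial fitness
    best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    history = [best_f]

    for _ in range(iter_max):                            # Step 5035: iteration limit only
        order = np.argsort(fit)                          # lowest (best) fitness first
        X, fit = X[order], fit[order]
        x_best, x_worst = X[0].copy(), X[-1].copy()
        f_g, f_w = fit[0], fit[-1]

        # Step 5033, discoverers: shrink when safe (R2 < ST), otherwise random walk.
        R2 = rng.random()
        for i in range(n_disc):
            if R2 < ST:
                alpha = 1.0 - rng.random()               # uniform in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
            else:
                X[i] = X[i] + rng.standard_normal() * np.ones(d)
        X[:n_disc] = np.clip(X[:n_disc], lb, ub)
        disc_fit = [fitness(x) for x in X[:n_disc]]
        x_p = X[int(np.argmin(disc_fit))].copy()         # best position among discoverers

        # Step 5033, joiners: worse half relocates, better half follows x_p.
        for i in range(n_disc, pop):
            if i + 1 > pop / 2:
                X[i] = rng.standard_normal() * np.exp((x_worst - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=d)
                A_plus = A / (A @ A)                     # A^T (A A^T)^-1 for a 1 x d row
                X[i] = x_p + (np.abs(X[i] - x_p) @ A_plus) * np.ones(d)

        # Step 5033, danger-aware individuals, chosen at random.
        for i in rng.choice(pop, size=n_danger, replace=False):
            if fit[i] > f_g:
                X[i] = x_best + rng.standard_normal() * np.abs(X[i] - x_best)
            else:
                K = rng.uniform(-1.0, 1.0)
                X[i] = X[i] + K * np.abs(X[i] - x_worst) / ((fit[i] - f_w) + eps)

        # Step 5034: re-evaluate and keep the best position seen so far.
        X = np.clip(X, lb, ub)
        fit = np.array([fitness(x) for x in X])
        if fit.min() < best_f:
            best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
        history.append(best_f)
    return best_x, best_f, history
```

For the SSA-KELM model, `fitness` would take a two-element position (C, σ), train a KELM with those values, and return the resulting error rate; the bounds for C and σ are a design choice that the text does not specify.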
Step 504: testing the recognition performance of the constructed SSA-KELM model with the test data, taking the recognition accuracy as the evaluation index. In this embodiment, the residual current feature samples of the test set are input into the SSA-KELM model to identify the electric leakage fault type of each sample to be tested. The recognition result of one run is shown in figure 6, and the recognition accuracy is 98.29%.
Step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested; the output categories include normal operation, organism electric shock and non-organism electric shock, wherein non-organism electric shock represents electric leakage of the photovoltaic equipment; the types of electric leakage include organism electric shock and photovoltaic equipment electric leakage.
The method establishes an efficient electric leakage identification model, accurately identifies the electric leakage fault type, ensures the correct action of the residual current protection device, and has important significance for realizing the safe and reliable operation of the photovoltaic access power distribution network.
Parts not described in the present invention are the same as the prior art or can be implemented using the prior art.
The foregoing is a more detailed description of the present invention that is presented in conjunction with specific embodiments, and the practice of the invention is not to be considered limited to those descriptions. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A photovoltaic access distribution network electric leakage identification method based on NCA and SSA-KELM is characterized by comprising the following steps:
step 1: collecting residual current data of the photovoltaic access power distribution network in different states;
step 2: extracting feature quantities from the collected residual current data;
step 3: preprocessing the extracted features;
step 4: screening the preprocessed features by neighborhood component analysis (NCA);
step 5: training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model;
step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested.
2. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: in step 1, the different states of the power distribution network comprise: normal operation, occurrence of an organism electric shock, and occurrence of a non-organism electric shock;
in step 6, the output categories include normal operation, organism electric shock and non-organism electric shock; wherein non-organism electric shock represents electric leakage of the photovoltaic equipment; the types of electric leakage include organism electric shock and photovoltaic equipment electric leakage.
3. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: in step 2, the extracted features include 9 dimensions: square-root amplitude $p_1$, absolute mean value $p_2$, root mean square value $p_3$, kurtosis index $p_4$, skewness index $p_5$, peak index $p_6$, waveform factor $p_7$, margin coefficient $p_8$, and pulse index $p_9$.
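As an illustration of the nine features, the sketch below computes them for one window of residual current samples. The formulas are the common textbook definitions of these time-domain indicators and are an assumption here; the patent's own definitions appear elsewhere in the description and may differ in detail. The function name `time_domain_features` is hypothetical.

```python
import numpy as np

def time_domain_features(x):
    """Return [p1, ..., p9] for a 1-D window of residual current samples."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    mean, std = x.mean(), x.std()               # population standard deviation
    peak = abs_x.max()
    p1 = np.mean(np.sqrt(abs_x)) ** 2           # square-root amplitude
    p2 = abs_x.mean()                           # absolute mean value
    p3 = np.sqrt(np.mean(x ** 2))               # root mean square value
    p4 = np.mean((x - mean) ** 4) / std ** 4    # kurtosis index
    p5 = np.mean((x - mean) ** 3) / std ** 3    # skewness index
    p6 = peak / p3                              # peak index
    p7 = p3 / p2                                # waveform factor
    p8 = peak / p1                              # margin coefficient
    p9 = peak / p2                              # pulse index
    return np.array([p1, p2, p3, p4, p5, p6, p7, p8, p9])
```

A constant window has zero standard deviation, so `p4` and `p5` are undefined for it; a real pipeline would need to guard against that case.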
4. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: in step 3, the preprocessing method adopts a maximum and minimum normalization method.
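Max-min normalization maps each feature column to [0, 1] via x' = (x − min) / (max − min). A minimal sketch follows; the function names are hypothetical, and fitting the minimum and range on one data set and reusing them on another is an assumed convention that the text does not spell out.

```python
import numpy as np

def fit_min_max(X_ref):
    """Column-wise minimum and range of the reference data."""
    X_ref = np.asarray(X_ref, dtype=float)
    lo, hi = X_ref.min(axis=0), X_ref.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # constant columns map to 0, no division by zero
    return lo, span

def apply_min_max(X, lo, span):
    """Scale X with a previously fitted minimum and range."""
    return (np.asarray(X, dtype=float) - lo) / span
```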
5. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: the step 4 specifically comprises:
let the preprocessed original feature set be $S = \{(x_i, y_i),\ i = 1, 2, \dots, n\}$, where $x_i$ is the feature vector of the i-th sample and $y_i$ is the label of the i-th sample; taking any sample from the feature set S, and calculating the distance between this sample and a neighboring sample:
Figure FDA0003859122240000011
where $w_m$ is the feature weight of the m-th feature dimension; m is the feature dimension index, and r is the total number of feature dimensions; $x_j$ is a neighbor sample of $x_i$; $x_{im}$ is the m-th feature of the i-th sample, and $x_{jm}$ is the m-th feature of the j-th sample;
the probability that $x_j$ is selected as the nearest-neighbor sample of $x_i$ is:
Figure FDA0003859122240000021
where n is the number of samples; $x_j$ denotes the j-th sample as a neighbor sample of $x_i$, with $p_{ij} = 0$ when $i = j$; $x_t$ denotes all samples other than $x_i$; k is a kernel function defined as:
Figure FDA0003859122240000022
where σ is the kernel width; the probability that $x_i$ is correctly classified is then:
Figure FDA0003859122240000023
where $y_j$ is the label of sample $x_j$; $y_{ij}$ is determined by the label difference $y_i - y_j$ between samples $x_i$ and $x_j$: when $y_i = y_j$, $y_{ij} = 1$; otherwise $y_{ij} = 0$;
calculating the average value of $p_i$:
Figure FDA0003859122240000024
introducing a regularization term to obtain the following objective function:
Figure FDA0003859122240000025
where n is the number of samples, λ is the regularization parameter, and r is the total number of feature dimensions;
the objective of neighborhood component analysis is to compute the weight w that maximizes F(w); the equivalent problem can also be expressed as:
Figure FDA0003859122240000026
Figure FDA0003859122240000027
denotes the objective function to be minimized, and F(w) denotes the objective function to be maximized;
the weight of each feature in the sample set is thereby calculated; the larger F(w) is, or the smaller
Figure FDA0003859122240000028
is, the higher the correlation between the current feature and the electric leakage recognition model, and vice versa.
Features with high weight values are then screened out according to the magnitude of the feature weights.
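The weighting procedure of this claim can be sketched under stated assumptions. The equations themselves are only available as figures, so the forms used here are the ones commonly given for NCA feature selection and are assumed, not confirmed: $D_w(x_i, x_j) = \sum_m w_m^2 |x_{im} - x_{jm}|$, $k(z) = \exp(-z/\sigma)$, and $F(w) = \frac{1}{n}\sum_i p_i - \lambda \sum_m w_m^2$. The function names, the plain gradient-ascent optimizer, and the step size are likewise illustrative choices.

```python
import numpy as np

def nca_objective(w, X, y, sigma=1.0, lam=0.01):
    """Return F(w) and its gradient for the regularized NCA objective."""
    n = X.shape[0]
    diff = np.abs(X[:, None, :] - X[None, :, :])       # |x_im - x_jm|, shape (n, n, r)
    K = np.exp(-(diff @ w ** 2) / sigma)               # k(D_w(x_i, x_j))
    np.fill_diagonal(K, 0.0)                           # p_ii = 0
    P = K / K.sum(axis=1, keepdims=True)               # p_ij
    Y = (y[:, None] == y[None, :]).astype(float)       # y_ij = 1 when labels agree
    p = (P * Y).sum(axis=1)                            # p_i
    F = p.mean() - lam * np.sum(w ** 2)
    all_nb = np.einsum("ij,ijm->im", P, diff)          # sum_j p_ij |x_im - x_jm|
    same_nb = np.einsum("ij,ijm->im", P * Y, diff)     # same sum over same-label j only
    grad = 2.0 * w * ((p[:, None] * all_nb - same_nb).sum(axis=0) / (n * sigma) - lam)
    return F, grad

def nca_weights(X, y, sigma=1.0, lam=0.01, lr=0.5, n_iter=200):
    """Gradient ascent on F(w), starting from equal weights."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    w = np.ones(X.shape[1])
    for _ in range(n_iter):
        _, grad = nca_objective(w, X, y, sigma, lam)
        w = w + lr * grad
    return w
```

Features would then be ranked by the magnitude of their weights; where to cut the ranking is a threshold the claim leaves open.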
6. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: the step 5 specifically comprises the following steps:
step 501: dividing the feature data after NCA screening into training data and test data;
step 502: training a kernel extreme learning machine (KELM) model with the training data;
step 503: optimizing the regularization coefficient C and the kernel function parameter σ of the KELM with the sparrow search algorithm (SSA) to obtain the optimal parameters, and thereby establishing the SSA-KELM electric leakage identification model;
step 504: testing the constructed SSA-KELM model with the test data, taking the recognition accuracy as the evaluation index.
7. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: the step 502 specifically includes:
suppose the KELM model is trained with N input samples $x_k = [x_{k1}, x_{k2}, \dots, x_{km}]^T$ and N output samples $t_k = [t_{k1}, t_{k2}, t_{k3}]^T$, where $k = 1, \dots, N$; $x_k$ is the k-th input sample, $x_{km}$ is the m-th feature of the k-th input sample, $t_k$ is the k-th output sample, and $t_{k1}$ is the first component of the k-th output sample;
for the error between the predicted output and the actual output of the network to approach zero, the following holds:
Figure FDA0003859122240000031
where $\beta_i$ is the weight between the hidden-layer node and the output layer; $g(x_i, w_i, b_i)$ is the activation function; $w_i$ is the weight between the hidden-layer node and the input layer; $b_i$ is the threshold of the hidden-layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:
HB=T
Figure FDA0003859122240000032
where H is the hidden-layer output matrix; $B = [\beta_1, \dots, \beta_L]^T$ is the matrix formed by the output weight vectors between the hidden-layer nodes and the output nodes, with $\beta_L$ the output weight vector between the L-th hidden-layer node and the output nodes; and $T = [t_1, \dots, t_N]^T$ is the matrix formed by the true output values of the samples;
solving for the least-squares solution gives:
Figure FDA0003859122240000033
solving with the Lagrange multiplier method gives the network output weights:
Figure FDA0003859122240000034
where C is the regularization coefficient, i.e., the penalty coefficient, and I is the identity matrix (a unit diagonal matrix); from the above equation, the output of the ELM is:
Figure FDA0003859122240000035
based on Mercer's conditions, define the kernel matrix as:
Figure FDA0003859122240000036
where $K(x_i, x_k)$ is the kernel function of the i-th and k-th input samples, for which a radial basis kernel function is adopted:
Figure FDA0003859122240000041
where σ is the kernel function parameter;
from the above equations, the output of the KELM model is:
Figure FDA0003859122240000042
where f(x) is the actual output value of the KELM model.
8. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: the step 503 is specifically:
step 5031: initializing the parameters of the sparrow search algorithm SSA, including the sparrow population size Pop, the search space dimension d, the maximum number of iterations $iter_{max}$, the proportion SD of sparrows aware of danger, the discoverer proportion PD, and the safety value ST;
step 5032: calculating the fitness value of each sparrow, and finding the position $X_{best}$ corresponding to the current best fitness and the position $X_{worst}$ corresponding to the worst fitness;
step 5033: updating the positions of the discoverers, the joiners, and the danger-aware individuals in the sparrow population according to the update rules of the sparrow search algorithm;
step 5034: recalculating the fitness value of each sparrow after the position update and comparing it with the fitness value from the previous iteration; if the new fitness value is higher than the previous one, taking the new position as the best-fitness position $X_{best}$; otherwise, keeping the previous value unchanged;
step 5035: judging whether the maximum number of iterations or the required solution accuracy has been reached; if not, returning to step 5033; if so, stopping the iteration and returning the position of the sparrow with the best fitness, namely the optimal (C, σ) combination.
9. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 8, characterized in that: the sparrow search algorithm updating rule in step 5033 includes:
the position update rule for the discoverers is:
Figure FDA0003859122240000043
where
Figure FDA0003859122240000044
is the position of the i-th sparrow in the j-th dimension at iteration t+1;
Figure FDA0003859122240000045
is the position of the i-th sparrow in the j-th dimension at iteration t; t is the current iteration number; $iter_{max}$ is the maximum number of iterations of the algorithm; α is a uniform random number in (0, 1]; $ST \in [0.5, 1]$ and $R_2 \in [0, 1]$ are the safety value and the early warning value, respectively; Q is a random number following a normal distribution; and L is a 1 × d matrix whose elements are all 1;
the position update rule for the joiners is:
Figure FDA0003859122240000051
where $X_{worst}$ is the global worst position at iteration t;
Figure FDA0003859122240000052
is the optimal position occupied by the discoverers at iteration t+1; i represents the population size; A is a 1 × d matrix in which each element is randomly assigned the value 1 or −1, and $A^+ = A^T (A A^T)^{-1}$; num is the number of sparrows;
the position update rule for the danger-aware individuals is:
Figure FDA0003859122240000053
where
Figure FDA0003859122240000054
is the current global optimal position; β is the step-size control parameter, a random number following a normal distribution with mean 0 and variance 1; K is the direction in which the sparrow moves, a random number in [−1, 1]; $f_i$ is the fitness value of the i-th sparrow; $f_g$ and $f_w$ are the best and worst fitness values of the current sparrow population, respectively; and ε is a very small constant that prevents the denominator from being zero.
10. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: in step 501, the feature data after NCA screening are randomly divided into training data and test data at a ratio of 7:3.
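A random 7:3 division can be sketched as below. The function name `split_7_3` and the fixed seed are illustrative assumptions; the claim does not say whether the split is stratified by class, so this sketch uses a plain random permutation.

```python
import numpy as np

def split_7_3(X, y, seed=0):
    """Randomly divide samples into 70% training data and 30% test data."""
    X, y = np.asarray(X), np.asarray(y)
    idx = np.random.default_rng(seed).permutation(len(X))
    n_train = int(round(0.7 * len(X)))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]
```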
CN202211156808.XA 2022-09-22 2022-09-22 Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM Pending CN115618261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211156808.XA CN115618261A (en) 2022-09-22 2022-09-22 Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM

Publications (1)

Publication Number Publication Date
CN115618261A true CN115618261A (en) 2023-01-17

Family

ID=84858827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211156808.XA Pending CN115618261A (en) 2022-09-22 2022-09-22 Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM

Country Status (1)

Country Link
CN (1) CN115618261A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116226776A (en) * 2023-04-25 2023-06-06 广东电网有限责任公司云浮供电局 Machine learning-based photovoltaic system abnormal residual current detection method
CN116595449A (en) * 2023-06-06 2023-08-15 西安科技大学 Asynchronous motor fault diagnosis method based on improved SSA optimization support vector machine
CN116595449B (en) * 2023-06-06 2023-12-12 西安科技大学 Asynchronous motor fault diagnosis method based on improved SSA optimization support vector machine
CN116572254A (en) * 2023-07-07 2023-08-11 湖南大学 Robot humanoid multi-finger combined touch sensing method, system and equipment
CN116572254B (en) * 2023-07-07 2023-09-08 湖南大学 Robot humanoid multi-finger combined touch sensing method, system and equipment
CN117970182A (en) * 2024-03-28 2024-05-03 国网山东省电力公司曲阜市供电公司 Electric leakage early warning method and system based on DTW algorithm

Similar Documents

Publication Publication Date Title
CN115618261A (en) Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM
Appiah et al. Long short-term memory networks based automatic feature extraction for photovoltaic array fault diagnosis
CN108520272B (en) Semi-supervised intrusion detection method for improving Cantonese algorithm
CN111353153B (en) GEP-CNN-based power grid malicious data injection detection method
Madhiarasan et al. Analysis of artificial neural network: architecture, types, and forecasting applications
CN114021799A (en) Day-ahead wind power prediction method and system for wind power plant
CN113095442B (en) Hail identification method based on semi-supervised learning under multi-dimensional radar data
Wu et al. A hybrid support vector regression approach for rainfall forecasting using particle swarm optimization and projection pursuit technology
CN109165819B (en) Active power distribution network reliability rapid evaluation method based on improved AdaBoost. M1-SVM
CN108133225A (en) A kind of icing flashover fault early warning method based on support vector machines
CN107579846B (en) Cloud computing fault data detection method and system
CN103440493A (en) Hyperspectral image blur classification method and device based on related vector machine
CN114925612A (en) Transformer fault diagnosis method for optimizing hybrid kernel extreme learning machine based on sparrow search algorithm
CN110794360A (en) Method and system for predicting fault of intelligent electric energy meter based on machine learning
CN113869145A (en) Circuit fault diagnosis method and system for light-weight gradient elevator and sparrow search
CN113884807B (en) Power distribution network fault prediction method based on random forest and multi-layer architecture clustering
CN115186012A (en) Power consumption data detection method, device, equipment and storage medium
Zhang et al. Principal component analysis (PCA) based sparrow search algorithm (SSA) for optimal learning vector quantized (LVQ) neural network for mechanical fault diagnosis of high voltage circuit breakers
Liu et al. Fault diagnosis of photovoltaic strings by using machine learning‐based stacking classifier
CN113033898A (en) Electrical load prediction method and system based on K-means clustering and BI-LSTM neural network
CN116663414A (en) Fault diagnosis method and system for power transformer
CN111060755A (en) Electromagnetic interference diagnosis method and device
CN113988205B (en) Method and system for judging electric precipitation working condition
CN115423091A (en) Conditional antagonistic neural network training method, scene generation method and system
Fang et al. Power distribution transformer fault diagnosis with unbalanced samples based on neighborhood component analysis and k-nearest neighbors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination