CN115618261A

CN115618261A - Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM

Info

Publication number: CN115618261A
Application number: CN202211156808.XA
Authority: CN
Inventors: 汪自虎; 栾宁; 刘苑红; 刘政生; 许洪华; 李蕊; 瞿洪庆; 陈晓勇; 乐平富; 邓鹏�; 闫涛; 韩晓慧; 夏越; 杜松怀
Original assignee: China Electric Power Research Institute Co Ltd CEPRI; Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Current assignee: China Electric Power Research Institute Co Ltd CEPRI; Nanjing Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date: 2022-09-22
Filing date: 2022-09-22
Publication date: 2023-01-17

Abstract

The invention relates to the technical field of fault identification, and in particular to a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM, comprising the following steps: collecting residual current data of the photovoltaic access power distribution network in different states; extracting feature quantities from the collected residual current data; preprocessing the extracted features; screening the preprocessed features by neighborhood component analysis (NCA); training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model; and inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested. The method can accurately identify the type of leakage fault in a photovoltaic access power distribution network.

Description

Photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM

Technical Field

The invention relates to the technical field of fault identification, in particular to a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM.

Background

In recent years, with the continuous growth of installed photovoltaic capacity, a large number of distributed photovoltaic systems have been connected to power distribution networks. The grid-connected inverter is the key link between the photovoltaic system and the grid, and a line-frequency transformer is usually used to provide voltage matching and electrical isolation from the grid. However, the line-frequency transformer increases the weight, volume and cost of the photovoltaic system and reduces system efficiency. A transformerless, non-isolated grid-connected inverter is small, inexpensive and structurally simple, and its most prominent advantage is that it improves the efficiency of the whole system, so more and more grid-connected photovoltaic systems adopt a non-isolated access scheme.

In a power distribution network with non-isolated photovoltaic access, the absence of transformer isolation means that the photovoltaic cells are electrically connected to the grid, and a common-mode leakage current flows through the parasitic capacitance between the photovoltaic panels and ground. Because of this common-mode leakage current, conventional leakage fault detection techniques cannot distinguish it from the leakage current produced when an organism suffers an electric shock, so residual current protection devices frequently maloperate and the distribution network cannot operate safely and reliably. In practice, the operating threshold of the leakage protection device is usually raised, or the device is removed altogether, which leaves safety hazards in the distribution network.

At present, existing electric leakage identification methods mainly target traditional power distribution networks, and no research on electric leakage identification for photovoltaic access power distribution networks has been found.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention aims to provide a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM.

In order to achieve the purpose, the invention provides the following technical scheme:

a photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM, comprising the following steps:

step 1: collecting residual current data of the photovoltaic access power distribution network in different states;

step 2: extracting feature quantities from the collected residual current data;

step 3: preprocessing the extracted features;

step 4: screening the preprocessed features by neighborhood component analysis (NCA);

step 5: training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model;

step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested.

Further, in step 1, the different states of the power distribution network include: normal operation, an organism electric shock, and a non-organism electric shock; in step 6, the output categories include normal operation, organism electric shock and non-organism electric shock; the non-organism electric shock represents electric leakage of the photovoltaic equipment; the electric leakage types include organism electric shock and photovoltaic equipment leakage.

Further, in step 2, the extracted features comprise 9 dimensions: square-root amplitude p₁, absolute mean value p₂, root-mean-square value p₃, kurtosis index p₄, skewness index p₅, peak index p₆, waveform factor p₇, margin coefficient p₈ and pulse index p₉.

Further, in step 3, the preprocessing uses the max-min (min-max) normalization method.

Further, the step 4 specifically includes:

let the preprocessed original feature set be S = {(x_i, y_i), i = 1, 2, …, n}, where x_i denotes the features of the i-th sample and y_i is the label of the i-th sample; a sample is taken at random from the feature set S, and the distance between it and a neighboring sample is calculated:

where w_m denotes the feature weight of the m-th feature dimension; m is the feature dimension index and r denotes the total number of feature dimensions; x_j is a neighbor sample of x_i; x_im denotes the m-th feature of the i-th sample, and x_jm denotes the m-th feature of the j-th sample;

the probability that x_j is selected as the nearest neighbor of x_i is:

where n is the number of samples; x_j denotes the j-th sample taken as the neighbor of x_i, and p_ij = 0 when i = j; x_t denotes any sample other than x_i; k is a kernel function defined as:

where σ is the kernel width; the probability that x_i is correctly classified is then:

where y_j is the label of sample x_j; y_ij is an indicator comparing the labels of samples x_i and x_j: y_ij = 1 when y_i = y_j, and y_ij = 0 otherwise;

the average value of p_i is calculated:

a regularization term is introduced to obtain the following objective function:

where n is the number of samples; λ is the regularization parameter; r denotes the total number of feature dimensions;

the goal of neighborhood component analysis is to compute the weight w at which F(w) is maximized; the equivalent problem can also be expressed as a minimization:

in which the equivalent objective is minimized, while F(w) denotes the objective to be maximized;

the weight of each feature in the sample set is thus calculated; a larger F(w), or equivalently a smaller minimization objective, indicates a higher correlation between the current features and the electric leakage identification model, and vice versa.

The features with high weights are then screened out according to the magnitude of the feature weights.
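The formula images for this step did not survive in this text. As a hedged reconstruction, assuming the step follows the standard NCA feature-selection formulation (the squared weights w_m², the kernel form exp(−z/σ), and the labels D_w and f(w) are assumptions taken from that formulation, not from the text above), the quantities defined in this step would read:

```latex
\begin{aligned}
D_w(x_i, x_j) &= \sum_{m=1}^{r} w_m^{2}\,\lvert x_{im} - x_{jm}\rvert \\
p_{ij} &= \frac{k\big(D_w(x_i, x_j)\big)}{\sum_{t \neq i} k\big(D_w(x_i, x_t)\big)} \quad (i \neq j), \qquad p_{ii} = 0 \\
k(z) &= \exp\!\left(-\frac{z}{\sigma}\right) \\
p_i &= \sum_{j=1}^{n} y_{ij}\, p_{ij} \\
F(w) &= \frac{1}{n}\sum_{i=1}^{n} p_i \;-\; \lambda \sum_{m=1}^{r} w_m^{2} \\
\hat{w} &= \arg\max_{w} F(w) \;=\; \arg\min_{w} f(w), \qquad f(w) = -F(w)
\end{aligned}
```

Under this reading, a feature whose weight is driven toward zero contributes nothing to the distance and can be discarded, which is what the screening by weight magnitude relies on.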

Further, the step 5 specifically includes:

step 501: dividing the feature data after NCA screening into training data and test data;

step 502: training a kernel extreme learning machine (KELM) model with the training data;

step 503: optimizing the regularization coefficient C and the kernel function parameter σ of the KELM model with the sparrow search algorithm (SSA) to obtain the optimal parameters, and thereby establishing the SSA-KELM electric leakage identification model;

step 504: testing the constructed SSA-KELM model with the test data, taking the identification accuracy as the evaluation index.

Further, the step 502 specifically includes:

assume that the KELM model is trained with N input samples x_k = [x_k1, x_k2, …, x_km]^T and N output samples t_k = [t_k1, t_k2, t_k3]^T, where k = 1, …, N; x_k denotes the k-th input sample, x_km denotes the m-th feature of the k-th input sample, t_k denotes the k-th output sample, and t_k1 denotes the 1st component of the k-th output sample;

training drives the error between the predicted output and the actual output of the network toward zero, so that:

where β_i is the weight between the hidden layer node and the output layer; g(x_i, w_i, b_i) is the activation function; w_i is the weight between the hidden layer node and the input layer; b_i is the threshold of the hidden layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:

HB = T

where H is the hidden layer output matrix; B = [β_1, …, β_L]^T is the matrix formed by the output weight vectors between the hidden layer nodes and the output nodes, β_L being the output weight vector between the L-th hidden layer node and the output nodes; T = [t_1, …, t_N]^T is the matrix formed by the true output values of the samples;

solving for the least-squares solution gives:

solving with the Lagrange multiplier method gives the network output weights:

where C denotes the regularization coefficient, i.e. the penalty coefficient; I is the identity (unit diagonal) matrix; the output of the ELM follows from the above equation:

a kernel matrix is defined according to Mercer's condition:

where K(x_i, x_k) denotes the kernel function of the i-th and k-th input samples; a radial basis kernel function is adopted:

where σ denotes the kernel function parameter;

the output of the KELM model follows from the above equations:

where f(x) is the actual output value of the KELM model.
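The formulas for this derivation are missing from the text. A hedged reconstruction, assuming the usual ELM/KELM derivation, is given below. The pseudo-inverse symbol H^†, the hidden-layer mapping h(x), and the kernel matrix name Ω are labels introduced here, and the 2σ² denominator in the radial basis kernel is one common convention rather than something the text confirms.

```latex
\begin{aligned}
\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_k + b_i) &= t_k, \qquad k = 1, \dots, N \\
HB &= T \\
\hat{B} &= H^{\dagger} T \\
B &= H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} T \\
f(x) &= h(x)\,B = h(x)\,H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} T \\
\Omega &= HH^{T}, \qquad \Omega_{ik} = h(x_i)\cdot h(x_k) = K(x_i, x_k) \\
K(x_i, x_k) &= \exp\!\left(-\frac{\lVert x_i - x_k\rVert^{2}}{2\sigma^{2}}\right) \\
f(x) &= \big[K(x, x_1), \dots, K(x, x_N)\big]\left(\frac{I}{C} + \Omega\right)^{-1} T
\end{aligned}
```

Replacing HH^T by the kernel matrix removes the explicit hidden layer, so only C and σ remain to be tuned, which is why step 503 optimizes exactly these two parameters.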

Further, the step 503 specifically includes:

step 5031: initializing the parameters of the sparrow search algorithm SSA, including the sparrow population size Pop, the search space dimension d, the maximum number of iterations iter_max, the proportion SD of sparrows aware of danger, the proportion PD of discoverers, and the early warning value ST;

step 5032: calculating the fitness value of each sparrow, and finding the position X_best corresponding to the current best fitness and the position X_worst corresponding to the worst fitness;

step 5033: updating the positions of the discoverers, the joiners and the danger-aware individuals in the sparrow population according to the update rules of the sparrow search algorithm;

step 5034: recalculating the fitness value of each sparrow after the position update and comparing it with the fitness value from the previous iteration; if the new fitness value is higher than the original one, taking the new position as the optimal position X_best; otherwise, keeping the original position unchanged;

step 5035: judging whether the maximum number of iterations or the required solution accuracy has been reached; if not, returning to step 5033; if so, stopping the iteration and returning the position of the sparrow with the optimal fitness, namely the optimal (C, σ) combination.

Further, the updating rule of the sparrow search algorithm in the step 5033 includes:

the position update rule for the discoverers is as follows:

where the updated position term denotes the position of the i-th sparrow in the j-th dimension at iteration t+1, and the current position term denotes the position of the i-th sparrow in the j-th dimension at iteration t; t denotes the current iteration number; iter_max denotes the maximum number of iterations of the algorithm; α is a uniform random number in (0, 1]; ST ∈ [0.5, 1] and R₂ ∈ [0, 1] denote the safety value and the early warning value, respectively; Q is a random number following a normal distribution; L denotes a 1 × d matrix whose elements are all 1;

the position update rule for the joiners is as follows:

where X_worst denotes the global worst position at iteration t; the discoverer term denotes the optimal position occupied by the discoverers at iteration t+1; i represents the population size; A denotes a 1 × d matrix in which each element is randomly assigned the value 1 or −1, and A⁺ = A^T (A A^T)^(−1); num is the number of sparrows;

the position update rule for the danger-aware individuals is as follows:

where X_best is the current global optimal position; β is the step-size control parameter, a random number following a normal distribution with mean 0 and variance 1; K denotes the direction in which the sparrow moves and is a random number in [−1, 1]; f_i denotes the fitness value of the i-th sparrow; f_g and f_w denote the best and worst fitness values of the current sparrow population, respectively; ε is a very small constant that avoids a zero denominator.
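The three update formulas are missing from this text. Assuming they follow the commonly published form of the sparrow search algorithm (an assumption; the position notation X_{i,j}^{t} and the discoverer-optimum symbol X_P are labels chosen here), they would read:

```latex
\begin{aligned}
\text{discoverers:}\quad
X_{i,j}^{t+1} &=
\begin{cases}
X_{i,j}^{t}\cdot \exp\!\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\[2mm]
X_{i,j}^{t} + Q \cdot L, & R_2 \ge ST
\end{cases} \\[3mm]
\text{joiners:}\quad
X_{i,j}^{t+1} &=
\begin{cases}
Q \cdot \exp\!\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > num/2 \\[2mm]
X_{P}^{t+1} + \left\lvert X_{i,j}^{t} - X_{P}^{t+1}\right\rvert \cdot A^{+} \cdot L, & \text{otherwise}
\end{cases} \\[3mm]
\text{danger-aware:}\quad
X_{i,j}^{t+1} &=
\begin{cases}
X_{best}^{t} + \beta \cdot \left\lvert X_{i,j}^{t} - X_{best}^{t}\right\rvert, & f_i > f_g \\[2mm]
X_{i,j}^{t} + K \cdot \dfrac{\left\lvert X_{i,j}^{t} - X_{worst}^{t}\right\rvert}{(f_i - f_w) + \varepsilon}, & f_i = f_g
\end{cases}
\end{aligned}
```

In this form the comparisons f_i > f_g and f_i = f_g assume that a smaller fitness value is better; if fitness is taken as identification accuracy, the comparisons would be reversed.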

Further, in step 501, the feature data after NCA screening are randomly divided into training data and test data at a ratio of 7:3.
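The random 7:3 division can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function name `split_7_3`, the seed, and the toy arrays are hypothetical.

```python
import numpy as np

def split_7_3(features, labels, train_ratio=0.7, seed=0):
    """Randomly split samples into training and test parts (7:3 by default)."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)          # fixed seed only for repeatability
    order = rng.permutation(len(features))     # random sample order
    n_train = int(round(train_ratio * len(features)))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return features[train_idx], labels[train_idx], features[test_idx], labels[test_idx]

# Toy data: 10 samples with 4 screened features each (values are placeholders).
X = np.arange(40, dtype=float).reshape(10, 4)
y = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
X_train, y_train, X_test, y_test = split_7_3(X, y)
print(X_train.shape, X_test.shape)
```

A stratified split (keeping the class proportions equal in both parts) is a common refinement, but the text only specifies a random division.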

Compared with the prior art, the invention has the following beneficial effects: in the photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM, the extracted residual current feature quantities are screened by neighborhood component analysis (NCA); the screened features are input into a kernel extreme learning machine (KELM) model for training; the parameters of the KELM model are optimized by the sparrow search algorithm (SSA) to obtain the optimal parameters; an SSA-optimized KELM electric leakage identification model is then established; and the residual current data to be tested are input into the trained model to obtain the leakage fault type. The method can accurately identify the leakage fault type in a photovoltaic access power distribution network, avoid maloperation of the residual current protection device, improve the power supply reliability of the photovoltaic access power distribution network, and ensure its safe and reliable operation.

Drawings

FIG. 1 is a flow chart of the photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM according to the present invention;

FIG. 2 is a schematic diagram of the kernel extreme learning machine (KELM) model according to the present invention;

FIG. 3 is a schematic diagram of the residual current signals in different states according to an embodiment of the present invention, wherein:

FIG. 3(a) is a schematic diagram of the residual current signal in the normal state in this embodiment;

FIG. 3(b) is a schematic diagram of the residual current signal when an organism suffers an electric shock in this embodiment;

FIG. 3(c) is a schematic diagram of the residual current signal when a non-organism electric shock occurs in this embodiment;

FIG. 4 is a schematic diagram of the residual current sample feature quantities according to an embodiment of the present invention, wherein:

FIG. 4(a) is a schematic diagram of the feature quantity square-root amplitude p₁ in this embodiment;

FIG. 4(b) is a schematic diagram of the feature quantity absolute mean value p₂ in this embodiment;

FIG. 4(c) is a schematic diagram of the feature quantity root-mean-square value p₃ in this embodiment;

FIG. 4(d) is a schematic diagram of the feature quantity kurtosis index p₄ in this embodiment;

FIG. 4(e) is a schematic diagram of the feature quantity skewness index p₅ in this embodiment;

FIG. 4(f) is a schematic diagram of the feature quantity peak index p₆ in this embodiment;

FIG. 4(g) is a schematic diagram of the feature quantity waveform factor p₇ in this embodiment;

FIG. 4(h) is a schematic diagram of the feature quantity margin coefficient p₈ in this embodiment;

FIG. 4(i) is a schematic diagram of the feature quantity pulse index p₉ in this embodiment;

FIG. 5 is a schematic diagram of the feature weights calculated by the NCA algorithm in an embodiment of the present invention;

FIG. 6 is a diagram of the identification results of the SSA-KELM model according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1, the photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM of the present invention comprises the following steps:

Step 1: collecting residual current data of the photovoltaic access power distribution network in different states;

the different states of the power distribution network include: normal operation, an organism electric shock, and a non-organism electric shock, where the non-organism electric shock means electric leakage of the photovoltaic equipment.

In the embodiment of the present invention, a simulation model of a photovoltaic access power distribution network is used to obtain 390 sets of residual current data for normal operation, organism electric shock and non-organism electric shock (photovoltaic equipment leakage). FIG. 3 shows the residual current signals obtained by simulation in the three states: FIG. 3(a) shows the residual current signal in the normal state, FIG. 3(b) the residual current signal when an organism suffers an electric shock, and FIG. 3(c) the residual current signal when a non-organism electric shock occurs.

Step 2: calculating and extracting feature quantities from the collected residual current data;

the obtained residual current signal is converted into 9 feature dimensions in total: square-root amplitude p₁, absolute mean value p₂, root-mean-square value p₃, kurtosis index p₄, skewness index p₅, peak index p₆, waveform factor p₇, margin coefficient p₈ and pulse index p₉, as shown in FIG. 4. The calculation formula of each feature quantity is as follows:

where x is the collected residual current sample data and N is the total number of collected residual current samples.
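The feature formulas themselves are missing from this text. The sketch below uses the textbook time-domain definitions of the nine named quantities, which is an assumption: the patent's exact expressions (for example, whether the kurtosis and skewness indices are normalized by the standard deviation) may differ. The function name, the dictionary keys, and the 50 Hz test signal are hypothetical.

```python
import numpy as np

def time_domain_features(x):
    """Nine time-domain features of one residual current record (textbook definitions)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    peak = abs_x.max()
    mean, std = x.mean(), x.std()               # population mean and standard deviation
    p1 = np.mean(np.sqrt(abs_x)) ** 2           # square-root amplitude
    p2 = abs_x.mean()                           # absolute mean value
    p3 = np.sqrt(np.mean(x ** 2))               # root-mean-square value
    p4 = np.mean((x - mean) ** 4) / std ** 4    # kurtosis index
    p5 = np.mean((x - mean) ** 3) / std ** 3    # skewness index
    p6 = peak / p3                              # peak index (crest factor)
    p7 = p3 / p2                                # waveform factor
    p8 = peak / p1                              # margin coefficient
    p9 = peak / p2                              # pulse index
    return {"p1": p1, "p2": p2, "p3": p3, "p4": p4, "p5": p5,
            "p6": p6, "p7": p7, "p8": p8, "p9": p9}

# Hypothetical check signal: five cycles of a unit 50 Hz sine sampled at 10 kHz.
t = np.arange(1000) / 10000.0
feats = time_domain_features(np.sin(2 * np.pi * 50 * t))
print({name: round(float(value), 4) for name, value in feats.items()})
```

For a pure sine the root-mean-square value is 1/√2 and the peak index is √2, which gives a quick sanity check on the implementation before applying it to simulated residual currents.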

Step 3: preprocessing the extracted features;

Considering that the values of different features differ widely in scale, max-min (min-max) normalization is applied to all features of the samples.

The normalization formula is as follows:

where x_imin and x_imax are the minimum and maximum values of the i-th sample feature, respectively, x_i is the i-th sample feature, and the result is the normalized i-th sample feature.
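The normalization formula is not reproduced in this text. Assuming the usual min-max form, (x_i − x_imin) / (x_imax − x_imin) applied to each feature across all samples, a minimal sketch is shown below; the function name and the toy matrix are hypothetical.

```python
import numpy as np

def min_max_normalize(features):
    """Scale every feature column to [0, 1] across the samples (rows)."""
    features = np.asarray(features, dtype=float)
    col_min = features.min(axis=0)
    col_max = features.max(axis=0)
    # Guard against constant columns, where max - min would be zero.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (features - col_min) / span

# Two features on very different scales (placeholder values).
raw = np.array([[1.0, 200.0],
                [2.0, 400.0],
                [4.0, 300.0]])
scaled = min_max_normalize(raw)
print(scaled)
```

In a train/test setting, the minimum and maximum would normally be taken from the training data only and reused for the test data; the text does not say which convention was used.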

Step 4: screening the preprocessed features by neighborhood component analysis (NCA) to eliminate redundant features that degrade the identification accuracy;

Considering the correlation among the features, NCA is used to select a highly correlated optimal feature subset from the nine-dimensional feature set so as to further improve the identification model. NCA is a distance metric learning algorithm that selects the optimal feature subset for the electric leakage identification model by maximizing the leave-one-out classification accuracy. The NCA feature selection procedure is as follows:

let the preprocessed original feature set be S = {(x_i, y_i), i = 1, 2, …, n}, where x_i denotes the features of the i-th sample and y_i is the label of the i-th sample; a sample is taken at random from the feature set S, and the distance between it and a neighboring sample is calculated:

where w_m denotes the feature weight of the m-th feature dimension; m is the feature dimension index and r denotes the total number of feature dimensions; x_j is a neighbor sample of x_i; x_im denotes the m-th feature of the i-th sample, and x_jm denotes the m-th feature of the j-th sample;

the probability that x_j is selected as the nearest neighbor of x_i is:

where n is the number of samples; x_j denotes the j-th sample taken as the neighbor of x_i, and p_ij = 0 when i = j; x_t denotes any sample other than x_i; k is a kernel function defined as:

where σ is the kernel width; the probability that x_i is correctly classified is then:

where y_j is the label of sample x_j; y_ij is an indicator comparing the labels of samples x_i and x_j: y_ij = 1 when y_i = y_j, and y_ij = 0 otherwise;

the average value of p_i is calculated:

a regularization term is introduced to obtain the following objective function:

where n is the number of samples; λ is the regularization parameter; r denotes the total number of feature dimensions;

the goal of neighborhood component analysis is to compute the weight w at which F(w) is maximized; the equivalent problem can also be expressed as a minimization. The minimization objective is distinguished from F(w), which is the maximization objective: the weight sought is the w at which F(w) attains its maximum, or equivalently the w at which the minimization objective attains its minimum.

The weight of each feature in the sample set is thus calculated; a larger F(w), or equivalently a smaller minimization objective, indicates a higher correlation between the current features and the electric leakage identification model, and vice versa.

The features with high weights are then screened out according to the magnitude of the feature weights. In this embodiment, FIG. 5 shows the calculated feature weights of all feature quantities, and p₁, p₂, p₃ and p₅ are selected as the optimal feature quantities according to the magnitude of the feature weights.
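To make the role of the feature weights concrete, the sketch below evaluates an NCA-style objective F(w) for a given weight vector. It assumes the standard formulation (weighted distance Σ w_m²·|x_im − x_jm|, kernel exp(−z/σ), regularization λ·Σ w_m²), since the formulas are not reproduced in this text; the function name, the parameter values, and the toy data are hypothetical.

```python
import numpy as np

def nca_objective(w, X, y, sigma=1.0, lam=0.01):
    """Regularized mean leave-one-out probability of correct classification, F(w)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)
    diff = np.abs(X[:, None, :] - X[None, :, :])      # |x_im - x_jm|, shape (n, n, r)
    dist = diff @ (w ** 2)                            # weighted distance, shape (n, n)
    kern = np.exp(-dist / sigma)                      # kernel k(z) = exp(-z / sigma)
    np.fill_diagonal(kern, 0.0)                       # enforces p_ii = 0
    p = kern / kern.sum(axis=1, keepdims=True)        # p_ij
    same = (y[:, None] == y[None, :]).astype(float)   # y_ij = 1 when labels agree
    p_i = (same * p).sum(axis=1)                      # probability x_i is classified correctly
    return float(p_i.mean() - lam * np.sum(w ** 2))

# Toy data: feature 0 separates the two classes, feature 1 does not.
X = np.array([[0.0, 0.0], [0.1, 0.5], [0.2, 1.0],
              [0.8, 0.0], [0.9, 0.5], [1.0, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
f_informative = nca_objective([1.0, 0.0], X, y, sigma=0.1)
f_uninformative = nca_objective([0.0, 1.0], X, y, sigma=0.1)
print(f_informative, f_uninformative)
```

Putting all the weight on the informative feature gives a much larger F(w) than putting it on the uninformative one. A full NCA run would maximize F(w) over w with a gradient-based optimizer and then rank the features by the resulting weights.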

Step 5: training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain the SSA-KELM electric leakage identification model. Step 5 specifically comprises the following steps:

step 501: dividing the feature data after NCA screening into training data and test data; in this embodiment, the feature data after NCA screening are randomly divided into training data and test data at a ratio of 7:3;

step 502: training the kernel extreme learning machine (KELM) model with the training data; step 502 specifically includes:

the structure of the constructed kernel extreme learning machine model is shown in FIG. 2, and its identification principle is as follows:

assume that the KELM model shown in FIG. 2 is trained with N input samples x_k = [x_k1, x_k2, …, x_km]^T and N output samples t_k = [t_k1, t_k2, t_k3]^T, where k = 1, …, N; x_k denotes the k-th input sample, x_km denotes the m-th feature of the k-th input sample, t_k denotes the k-th output sample, and t_k1 denotes the 1st component of the k-th output sample;

training drives the error between the predicted output and the actual output of the network toward zero, so that:

where β_i is the weight between the hidden layer node and the output layer; g(x_i, w_i, b_i) is the activation function; w_i is the weight between the hidden layer node and the input layer; b_i is the threshold of the hidden layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:

HB = T

where H is the hidden layer output matrix; B = [β_1, …, β_L]^T is the matrix formed by the output weight vectors between the hidden layer nodes and the output nodes, β_L being the output weight vector between the L-th hidden layer node and the output nodes; T = [t_1, …, t_N]^T is the matrix formed by the true output values of the samples;

solving for the least-squares solution gives:

solving with the Lagrange multiplier method gives the network output weights:

where C denotes the regularization coefficient, i.e. the penalty coefficient; I is the identity (unit diagonal) matrix; the output of the ELM (extreme learning machine) follows from the above equation:

a kernel matrix is defined according to Mercer's condition:

where K(x_i, x_k) denotes the kernel function of the i-th and k-th input samples; the present invention adopts a radial basis function (RBF) kernel:

where σ denotes the kernel function parameter;

the output of the KELM model follows from the above equations:

where f(x) is the actual output value of the KELM model.

The KELM model is obtained by introducing a kernel function on the basis of the ELM: the expression of the ELM is derived first, and the kernel function is then applied to it.
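The closed-form training and prediction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the patent's implementation: the RBF kernel is taken as exp(−‖a − b‖² / (2σ²)), the class name `KELM`, the label encoding (0 normal, 1 organism electric shock, 2 non-organism electric shock), the values of C and σ, and the toy clusters are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

class KELM:
    """Kernel extreme learning machine with one-hot targets (three classes by default)."""

    def __init__(self, C=100.0, sigma=0.3):
        self.C = C
        self.sigma = sigma

    def fit(self, X, labels, n_classes=3):
        self.X_train = np.asarray(X, dtype=float)
        T = np.eye(n_classes)[np.asarray(labels)]           # one-hot target matrix
        omega = rbf_kernel(self.X_train, self.X_train, self.sigma)
        n = len(self.X_train)
        # Output weights: (I / C + Omega)^-1 T
        self.alpha = np.linalg.solve(np.eye(n) / self.C + omega, T)
        return self

    def decision_function(self, X):
        k = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.sigma)
        return k @ self.alpha                                # f(x) for each row of X

    def predict(self, X):
        return np.argmax(self.decision_function(X), axis=1)

# Toy data: three small clusters, one per class (placeholder values).
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0],
                    [1.1, 0.0], [0.0, 1.0], [0.0, 1.1]])
y_train = np.array([0, 0, 1, 1, 2, 2])
model = KELM(C=100.0, sigma=0.3).fit(X_train, y_train)
print(model.predict(np.array([[0.05, 0.05], [1.05, 0.0], [0.0, 1.05]])))
```

The predicted class is the index of the largest of the three outputs. Both C and σ strongly affect the result, which is the motivation for tuning them with an optimizer in step 503.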

step 503: introducing the sparrow search algorithm to optimize the regularization coefficient C and the kernel function parameter σ of the KELM model, so as to enhance the generalization ability of the KELM identification model; the sparrow search algorithm is a novel swarm intelligence optimization algorithm inspired by the foraging and anti-predation behavior of sparrows in nature. The specific steps for optimizing the parameters of the KELM model with the sparrow search algorithm are as follows:

step 5031: initializing the parameters of the sparrow search algorithm SSA, including the sparrow population size Pop, the search space dimension d, the maximum number of iterations iter_max, the proportion SD of sparrows aware of danger, the proportion PD of discoverers, and the early warning value ST;

step 5032: calculating the fitness value of each sparrow, and finding the position X_best corresponding to the current best fitness and the position X_worst corresponding to the worst fitness;

step 5033: updating the positions of the discoverers, the joiners and the danger-aware individuals in the sparrow population according to the corresponding update rules of the sparrow search algorithm; the update rules of the sparrow search algorithm comprise the following:

the position update rule for the discoverers is as follows:

where the updated position term denotes the position of the i-th sparrow in the j-th dimension at iteration t+1, and the current position term denotes the position of the i-th sparrow in the j-th dimension at iteration t; t denotes the current iteration number; iter_max denotes the maximum number of iterations of the algorithm; α is a uniform random number in (0, 1]; ST ∈ [0.5, 1] and R₂ ∈ [0, 1] denote the safety value and the early warning value, respectively; Q is a random number following a normal distribution; L denotes a 1 × d matrix whose elements are all 1;

the position update rule for the joiners is as follows:

where X_worst denotes the global worst position at iteration t; the discoverer term denotes the optimal position occupied by the discoverers at iteration t+1; i represents the population size; A denotes a 1 × d matrix in which each element is randomly assigned the value 1 or −1, and A⁺ = A^T (A A^T)^(−1); num is the number of sparrows;

the position update rule for the danger-aware individuals is as follows:

where X_best is the current global optimal position; β is the step-size control parameter, a random number following a normal distribution with mean 0 and variance 1; K denotes the direction in which the sparrow moves and is a random number in [−1, 1]; f_i denotes the fitness value of the i-th sparrow; f_g and f_w denote the best and worst fitness values of the current sparrow population, respectively; ε is a very small constant that avoids a zero denominator;

step 5034: recalculating the fitness value of each sparrow after the position update and comparing it with the fitness value from the previous iteration; if the new fitness value is higher than the original one, taking the new position as the optimal position X_best; otherwise, keeping the original position unchanged;

step 5035: judging whether the maximum number of iterations or the required solution accuracy has been reached; if not, returning to step 5033; if so, stopping the iteration and returning the position of the sparrow with the optimal fitness, namely the optimal (C, σ) combination.
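The search loop can be sketched as below. This is a simplified illustration that assumes the commonly published sparrow search update rules, because the formulas are not reproduced in this text. It minimizes a cost function (for the KELM case this could be 1 minus the identification accuracy for a candidate (C, σ)), whereas the text speaks of a higher fitness being better. The function name, the default values PD = 0.2, SD = 0.1 and ST = 0.8, the position clipping to the search bounds, and the sphere test function are all hypothetical.

```python
import numpy as np

def sparrow_search(cost, lower, upper, pop=20, iter_max=50, PD=0.2, SD=0.1, ST=0.8, seed=0):
    """Minimise cost(position) with a basic sparrow search; returns (best_position, best_cost)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    d = lower.size
    X = rng.uniform(lower, upper, size=(pop, d))          # initial population
    fit = np.array([cost(x) for x in X])
    best_idx = int(np.argmin(fit))
    best_x, best_f = X[best_idx].copy(), float(fit[best_idx])
    n_disc = max(1, int(round(PD * pop)))                 # number of discoverers
    n_aware = max(1, int(round(SD * pop)))                # number of danger-aware sparrows
    eps = 1e-50

    for _ in range(iter_max):
        order = np.argsort(fit)                           # lowest cost (best) first
        X, fit = X[order], fit[order]
        worst_x = X[-1].copy()
        f_g, f_w = fit[0], fit[-1]
        R2 = rng.random()                                 # early warning value for this iteration
        for i in range(n_disc):                           # discoverer update
            if R2 < ST:
                alpha = 1.0 - rng.random()                # uniform in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
            else:
                X[i] = X[i] + rng.normal() * np.ones(d)
        X[:n_disc] = np.clip(X[:n_disc], lower, upper)
        disc_fit = np.array([cost(x) for x in X[:n_disc]])
        x_p = X[int(np.argmin(disc_fit))].copy()          # best discoverer position
        for i in range(n_disc, pop):                      # joiner update
            if (i + 1) > pop / 2:
                X[i] = rng.normal() * np.exp((worst_x - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=d)
                A_plus = A / d                            # A^T (A A^T)^-1 for a row of +/-1
                X[i] = x_p + (np.abs(X[i] - x_p) @ A_plus) * np.ones(d)
        for i in rng.choice(pop, size=n_aware, replace=False):   # danger-aware update
            if fit[i] > f_g:
                X[i] = best_x + rng.normal() * np.abs(X[i] - best_x)
            else:
                K = rng.uniform(-1.0, 1.0)
                X[i] = X[i] + K * np.abs(X[i] - worst_x) / ((fit[i] - f_w) + eps)
        X = np.clip(X, lower, upper)                      # keep positions inside the bounds
        fit = np.array([cost(x) for x in X])
        i_best = int(np.argmin(fit))
        if fit[i_best] < best_f:                          # keep the best position found so far
            best_x, best_f = X[i_best].copy(), float(fit[i_best])
    return best_x, best_f

# Hypothetical stand-in cost: a 2-D sphere function instead of a KELM error rate.
sphere = lambda x: float(np.sum(np.asarray(x) ** 2))
best_x, best_f = sparrow_search(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best_x, best_f)
```

To tune the KELM, the two search dimensions would be C and σ, and `cost` would train a KELM on the training data and return its error rate on held-out data; the bounds for C and σ are a design choice that the text does not specify.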

step 504: testing the identification performance of the constructed SSA-KELM model with the test data, taking the identification accuracy as the evaluation index. In this embodiment, the residual current feature samples of the test set are input into the SSA-KELM model to identify the leakage fault type of each sample to be tested. The identification result of one run is shown in FIG. 6, with an identification accuracy of 98.29%.
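The evaluation index can be computed as the fraction of test samples whose predicted class matches the true class. The sketch below is illustrative only; the function name, the label encoding (0 normal, 1 organism electric shock, 2 non-organism electric shock), and the label arrays are hypothetical.

```python
import numpy as np

def identification_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Hypothetical labels for eight test samples; one of them is misidentified.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 0, 1, 2, 2, 2, 2, 1])
acc = identification_accuracy(y_true, y_pred)
print(f"{100 * acc:.2f}%")
```

As a consistency check under an assumption: if the 390 sets are the whole data set, a 7:3 split leaves 117 test samples, and 115 correct identifications would give 115/117 ≈ 98.29%, which matches the reported figure. The text does not state these counts.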

Step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested; the output categories include normal operation, organism electric shock and non-organism electric shock; the non-organism electric shock represents electric leakage of the photovoltaic equipment; the electric leakage types include organism electric shock and photovoltaic equipment leakage.

The method establishes an efficient electric leakage identification model, accurately identifies the electric leakage fault type, ensures the correct action of the residual current protection device, and has important significance for realizing the safe and reliable operation of the photovoltaic access power distribution network.

Parts not described in the present invention are the same as the prior art or can be implemented using the prior art.

The foregoing is a further detailed description of the present invention in conjunction with specific embodiments, and the practice of the invention is not to be considered limited to these descriptions. Those skilled in the art to which the invention pertains can make several simple deductions or substitutions without departing from the spirit of the invention, and all of these shall be considered as falling within the protection scope of the invention.

Claims

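1. A photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM, characterized by comprising the following steps:

step 1: collecting residual current data of the photovoltaic access power distribution network in different states;

step 2: extracting feature quantities from the collected residual current data;

step 3: preprocessing the extracted features;

step 4: screening the preprocessed features by neighborhood component analysis (NCA);

step 5: training a kernel extreme learning machine (KELM) model with the screened features, and optimizing the parameters of the KELM model with the sparrow search algorithm (SSA) to obtain an SSA-KELM electric leakage identification model;

step 6: inputting the residual current feature sample to be tested into the SSA-KELM model for output class identification, thereby obtaining the electric leakage type of the sample to be tested.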

2. The photovoltaic access power distribution network electric leakage identification method based on NCA and SSA-KELM according to claim 1, characterized in that: in step 1, the different states of the power distribution network include: normal operation, an organism electric shock, and a non-organism electric shock;

in step 6, the output categories include normal operation, organism electric shock and non-organism electric shock; the non-organism electric shock represents electric leakage of the photovoltaic equipment; the electric leakage types include organism electric shock and photovoltaic equipment leakage.

3. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: in step 2, the extracted features include 9 dimensions: square root amplitude p₁, absolute mean value p₂, root mean square value p₃, kurtosis index p₄, skewness index p₅, peak index p₆, waveform factor p₇, margin coefficient p₈, and pulse index p₉.
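The nine indices can be sketched as follows. This is a minimal illustration that assumes the definitions commonly used for time-domain waveform statistics; the exact formulas of the method are not given in this section, so each expression below (and the helper name `time_domain_features`) should be read as an assumption.

```python
import numpy as np


def time_domain_features(x):
    """Return the nine indices p1..p9 for one residual-current window.

    Assumed (commonly used) definitions; a constant window would divide
    by zero and is not handled here.
    """
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    peak = abs_x.max()
    mean = x.mean()
    std = x.std()  # population standard deviation

    p1 = np.mean(np.sqrt(abs_x)) ** 2         # square root amplitude
    p2 = abs_x.mean()                         # absolute mean value
    p3 = np.sqrt(np.mean(x ** 2))             # root mean square value
    p4 = np.mean((x - mean) ** 4) / std ** 4  # kurtosis index
    p5 = np.mean((x - mean) ** 3) / std ** 3  # skewness index
    p6 = peak / p3                            # peak index
    p7 = p3 / p2                              # waveform factor
    p8 = peak / p1                            # margin coefficient
    p9 = peak / p2                            # pulse index
    return np.array([p1, p2, p3, p4, p5, p6, p7, p8, p9])
```

Applied to each recorded residual-current window, the function yields one 9-dimensional feature vector per sample.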

4. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: in step 3, the preprocessing adopts min-max normalization.
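A minimal sketch of min-max normalization, assuming each feature column is scaled independently to [0, 1] by (x - min) / (max - min). The helper name `min_max_normalize` and the small `eps` guard for constant columns are illustrative additions, not part of the source.

```python
import numpy as np


def min_max_normalize(features, eps=1e-12):
    """Scale each feature column to [0, 1]; also return the column min and max."""
    features = np.asarray(features, dtype=float)
    col_min = features.min(axis=0)
    col_max = features.max(axis=0)
    span = np.maximum(col_max - col_min, eps)  # guard against constant columns
    return (features - col_min) / span, col_min, col_max
```

Returning the column minimum and maximum makes it possible to apply the same scaling to samples processed later, which is one reasonable design choice rather than a requirement stated in the text.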

5. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: the step 4 specifically comprises:

let the preprocessed original feature set be $S = \{(x_i, y_i),\ i = 1, 2, \ldots, n\}$, where $x_i$ denotes the feature vector of the i-th sample and $y_i$ is the label of the i-th sample; taking any sample from the feature set S, the distance between that sample and a neighboring sample is calculated:

in the formula, $w_m$ denotes the feature weight of the m-th feature dimension; m is the feature dimension index, and r is the total number of feature dimensions; $x_j$ is a neighbor sample of $x_i$; $x_{im}$ denotes the m-th dimension feature of the i-th sample, and $x_{jm}$ denotes the m-th dimension feature of the j-th sample;

the probability that $x_j$ is selected as the nearest-neighbor sample of $x_i$ is:

in the formula, $y_j$ is the label of sample $x_j$; $y_{ij}$ is an indicator derived from the label difference $y_i - y_j$ between samples $x_i$ and $x_j$: when $y_i = y_j$, $y_{ij} = 1$; otherwise $y_{ij} = 0$;

calculating the average value of $p_i$:

introducing a regularization term to obtain the following objective function:

Smaller indicates higher correlation of the current feature with the electrical leakage recognition model and vice versa.
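The quantities defined in this claim can be illustrated with a minimal sketch. It assumes the widely used NCA feature-weighting formulation: a weighted distance $\sum_m w_m^2 |x_{im} - x_{jm}|$, an exponential kernel with width `sigma`, and an L2 penalty with coefficient `lam`. The kernel, `sigma`, `lam`, and the helper name `nca_objective` are assumptions that do not appear in the text above, and the sketch only evaluates the objective for given weights; it does not optimize them.

```python
import numpy as np


def nca_objective(X, y, w, sigma=1.0, lam=0.0):
    """Regularized leave-one-out objective for feature weights w (assumed form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)

    # D[i, j] = sum_m w_m^2 * |x_im - x_jm|
    diff = np.abs(X[:, None, :] - X[None, :, :])
    D = diff @ (w ** 2)

    # p[i, j]: probability that x_j is picked as the neighbor of x_i
    K = np.exp(-D / sigma)
    np.fill_diagonal(K, 0.0)  # a sample never picks itself
    p = K / K.sum(axis=1, keepdims=True)

    same = (y[:, None] == y[None, :]).astype(float)  # y_ij indicator
    p_i = (same * p).sum(axis=1)  # probability that x_i is classified correctly

    # Average of p_i minus the regularization term
    return p_i.mean() - lam * np.sum(w ** 2)
```

On a toy set where only the first feature separates the two labels, putting all weight on that feature gives a larger value than putting it on the uninformative one, which is the behavior a weight optimizer would exploit.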

6. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 1, characterized in that: the step 5 specifically comprises the following steps:

step 503: optimizing the regularization coefficient C and the kernel function parameter sigma of the KELM by using a sparrow search algorithm SSA to obtain optimal parameters, and further establishing an SSA-KELM electric leakage identification model;

7. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: the step 502 specifically includes:

suppose the KELM model is trained with N input samples $x_k = [x_{k1}, x_{k2}, \ldots, x_{km}]^T$ and N output samples $t_k = [t_{k1}, t_{k2}, t_{k3}]^T$, where $k = 1, \ldots, N$; $x_k$ denotes the k-th input sample, $x_{km}$ denotes the m-th dimension feature of the k-th input sample, $t_k$ denotes the k-th output sample, and $t_{k1}$ denotes the first component of the k-th output sample;

in the formula: $\beta_i$ is the weight between the hidden layer node and the output layer; $g(x_i, w_i, b_i)$ is the activation function; $w_i$ is the weight between the hidden layer node and the input layer; $b_i$ is the threshold of the hidden layer node; L is the number of neurons in the hidden layer; the above formula is rewritten as:

$H\beta = T$

by solving the least squares solution, we obtain:

based on Mercer's conditions, define the kernel matrix as:

in the formula: $K(x_i, x_k)$ denotes the kernel function of the i-th input sample and the k-th input sample; a radial basis kernel function is adopted:

in the formula: σ represents a kernel function parameter;

the output of the KELM model from the above equation is:

in the formula: $f(x)$ is the actual output value of the KELM model.
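The training and output steps of this claim can be sketched as follows. This is a minimal illustration that assumes the standard KELM closed form, in which the output weights solve $(I/C + \Omega)\beta = T$ with kernel matrix $\Omega$, and a radial basis kernel of the form $\exp(-\lVert x_i - x_k \rVert^2 / (2\sigma^2))$. The function names `rbf_kernel`, `kelm_fit`, and `kelm_predict` are hypothetical, and the exact kernel scaling used by the method is not stated in this section.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """K[i, k] = exp(-||a_i - b_k||^2 / (2 * sigma^2)) (assumed scaling)."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def kelm_fit(X, T, C, sigma):
    """Solve (I / C + Omega) beta = T for the output weights."""
    omega = rbf_kernel(X, X, sigma)
    return np.linalg.solve(np.eye(len(X)) / C + omega, T)


def kelm_predict(X_new, X_train, beta, sigma):
    """f(x) = [K(x, x_1), ..., K(x, x_N)] beta."""
    return rbf_kernel(X_new, X_train, sigma) @ beta
```

With one-hot targets for the three output categories, the predicted category is the index of the largest component of `kelm_predict(...)`. The regularization coefficient `C` and kernel parameter `sigma` are the two values that the sparrow search algorithm is described as optimizing.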

8. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: the step 503 is specifically:

step 5033: updating the positions of the discoverers, joiners, and danger-aware individuals in the sparrow population according to the update rules of the sparrow search algorithm;

9. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 8, characterized in that: the sparrow search algorithm updating rule in step 5033 includes:

the position update rule for the discoverers is as follows:

in the formula, $X_{i,j}^{t}$ represents the position of the i-th sparrow in the j-th dimension at the t-th iteration; t is the current iteration number; $iter_{max}$ is the maximum number of iterations of the algorithm; $\alpha$ is a uniform random number in (0, 1]; $ST \in [0.5, 1]$ and $R_2 \in [0, 1]$ represent the safety value and the early-warning value, respectively; Q is a random number following a normal distribution; L is a 1 × d matrix whose elements are all 1;

the position update rule for the joiners is as follows:

the position update rule for the danger-aware individuals is as follows:

in the formula,
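The three update rules named in this claim can be sketched as a small minimizer. This is a minimal illustration that assumes the commonly published form of the sparrow search algorithm; the update formulas themselves are not reproduced in this section, so every expression below is an assumption. The function name `ssa_minimize` and the parameters `discoverer_frac`, `aware_frac`, and `seed` are hypothetical.

```python
import numpy as np


def ssa_minimize(fitness, lower, upper, n_sparrows=20, iter_max=50,
                 discoverer_frac=0.2, aware_frac=0.1, ST=0.8, seed=0):
    """Minimize `fitness` inside the box [lower, upper] (assumed standard SSA)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    d = lower.size
    X = rng.uniform(lower, upper, size=(n_sparrows, d))
    fit = np.array([fitness(x) for x in X])
    n_disc = max(1, int(discoverer_frac * n_sparrows))
    n_aware = max(1, int(aware_frac * n_sparrows))
    best_x, best_f = X[fit.argmin()].copy(), fit.min()

    for _ in range(iter_max):
        order = np.argsort(fit)  # best (lowest fitness) first
        X, fit = X[order], fit[order]
        worst_x = X[-1].copy()
        R2 = rng.random()  # early-warning value

        # Discoverers: contract while safe, otherwise take a random step Q * L
        for i in range(n_disc):
            if R2 < ST:
                alpha = 1.0 - rng.random()  # uniform in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
            else:
                X[i] = X[i] + rng.normal() * np.ones(d)
        X[:n_disc] = np.clip(X[:n_disc], lower, upper)
        disc_fit = np.array([fitness(x) for x in X[:n_disc]])
        X_p = X[np.argmin(disc_fit)].copy()  # best discoverer position

        # Joiners: the worse half flies elsewhere, the rest follow X_p
        for i in range(n_disc, n_sparrows):
            if i + 1 > n_sparrows / 2:
                X[i] = rng.normal() * np.exp((worst_x - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=d)
                A_plus = A / d  # A^T (A A^T)^-1 for a 1 x d row of +/-1
                X[i] = X_p + np.abs(X[i] - X_p) @ A_plus * np.ones(d)
        X = np.clip(X, lower, upper)
        fit = np.array([fitness(x) for x in X])

        # Danger-aware individuals: move relative to the current best or worst
        g_best, g_worst = X[fit.argmin()].copy(), X[fit.argmax()].copy()
        f_g, f_w = fit.min(), fit.max()
        for i in rng.choice(n_sparrows, size=n_aware, replace=False):
            if fit[i] > f_g:
                X[i] = g_best + rng.normal() * np.abs(X[i] - g_best)
            else:
                k = rng.uniform(-1.0, 1.0)
                X[i] = X[i] + k * np.abs(X[i] - g_worst) / (fit[i] - f_w + 1e-12)
        X = np.clip(X, lower, upper)
        fit = np.array([fitness(x) for x in X])

        if fit.min() < best_f:
            best_x, best_f = X[fit.argmin()].copy(), fit.min()
    return best_x, best_f
```

For tuning a KELM model, `fitness` would take a two-component position interpreted as the regularization coefficient C and the kernel parameter σ and return a validation error; that wiring is left out here to keep the sketch self-contained.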

10. The NCA and SSA-KELM based photovoltaic access distribution network electric leakage identification method according to claim 6, characterized in that: in step 501, the feature data selected by NCA is randomly divided into training data and test data at a ratio of 7:3.
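A minimal sketch of the random 7:3 division, assuming a simple shuffled split without stratification by category. The helper name `split_7_3` and the `seed` parameter are hypothetical.

```python
import numpy as np


def split_7_3(X, y, seed=0):
    """Randomly split samples into 70% training data and 30% test data."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))  # shuffled sample indices
    n_train = int(round(0.7 * len(X)))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]
```

Fixing the seed makes the split reproducible between runs; whether the split should also preserve the category proportions is not specified in the text.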