Photovoltaic system fault arc detection method based on Adaboost fusion of multiple classifiers
Technical Field
The invention belongs to the technical field of photovoltaic electrical fault detection, and relates to a photovoltaic system fault arc detection method by utilizing an Adaboost machine learning composite model.
Background
Solar energy is clean and safe, and has become the fastest-growing renewable energy source. Photovoltaic power stations are generally built in areas unsuitable for habitation, such as barren mountains, wasteland, deserts and tidal flats. A photovoltaic system is equivalent to a direct current power supply system, and because the output voltage at the direct current side of a photovoltaic power station is high, poor contact or oxidation corrosion of electrical components can occur at any position on the direct current side, and an arc is easily generated in the resulting gap. According to the volt-ampere characteristic of the photovoltaic panel, once an arc is generated it readily burns stably, the voltage increases further, and the arc temperature rises sharply, igniting nearby combustible materials and conductors, endangering the power supply and the circuit, and potentially causing fire, property loss and even casualties. Although most faults in photovoltaic power stations are attributed to direct-current-side fault arcs, existing protection devices only protect against faults caused by circuit overcurrent and cannot detect arcs, so the power generation efficiency of the photovoltaic power station is reduced and safety hazards such as fire remain.
The development of fault arc detection technology is crucial to the safe, reliable and economical operation of photovoltaic systems. The technology generally detects the occurrence of a fault arc in a circuit from arc characteristics and actively outputs a fault signal, so that a sectionalizer is actuated to protect the circuit. Its main purpose is to prevent harmful electric shock or fire caused by fault arcs, and it is an important means of avoiding component damage and economic loss caused by fault arcs. In addition, system operating parameters are acquired and faults are identified automatically, without manual monitoring, which reduces the amount of manual maintenance, improves the operating performance of the system, and raises its degree of intelligent operation.
Chinese patent CN112180312A discloses a composite fault diagnosis method for current sensors, in which a current sensor sample to be tested is input to a combined model, and the combined model, with optimized parameters, extracts fault features of the sample and performs diagnosis. That patent extracts fault features from several time-domain characteristic values, which avoids omitting fault information; its dependence on an accurate physical model is low; and it can more accurately diagnose gain faults, bias faults and composite gain-and-bias faults in the current sensor. However, it cannot distinguish fault arcs from arc-like working conditions. Chinese patent CN107086855A discloses a photovoltaic system fault arc detection method based on machine learning and the fusion of multiple time-frequency features, which fuses several effective time-frequency features to accurately identify multiple forms of fault arc in a grid-connected photovoltaic system, accelerating the response to fault arcs while ensuring no misoperation under multiple arc-like working conditions. However, that patent uses only a single hidden Markov model and has a high false positive rate. Chinese patent CN109560770A discloses a photovoltaic system fault arc detection method based on an adaptive kernel function and instantaneous frequency estimation, in which the state of the photovoltaic system in the current period is judged by extracting several characteristic values and inputting them into a trained naive Bayes model; by using several effective time-frequency features, fault arcs in the photovoltaic system can be identified accurately while misoperation under various arc-like working conditions is avoided. However, that patent requires manual qualitative and quantitative analysis of the feature quantities to achieve high accuracy, and that accuracy may change as the environment changes.
The Adaboost algorithm is easy to modify and can readily integrate new models. A method that fuses multiple classifiers is relatively general with respect to more accurate detection methods and detection models that may appear in the future: a new high-precision model can be integrated with very little code, so that the precision of the fused model can be improved step by step as its sub-models are updated over time, and a better model can be obtained at low cost. However, current fault arc detection methods focus mainly on feature selection in the early stage of detection, and no Adaboost fusion multi-classifier model for detecting fault arcs in photovoltaic systems has been found in published reports.
Disclosure of Invention
The invention provides a photovoltaic system fault arc detection method of an Adaboost fusion multi-classifier, which aims to solve the problems of accurate, reliable and quick identification of fault arcs and arc-like working conditions in a grid-connected photovoltaic system.
In order to achieve the above purpose, the invention adopts the following technical scheme:
the fault arc detection method of the photovoltaic system comprises the following steps:
sampling an electric quantity detection signal of the photovoltaic system, quantifying the characteristics of the sampled signal in the current time window, inputting the characteristics into an Adaboost fusion multi-classifier model, and judging the real-time state of the photovoltaic system according to the output of the Adaboost fusion multi-classifier model; if the Adaboost fusion multi-classifier model outputs values corresponding to fault arc events in K consecutive time windows, judging that a fault arc has occurred in the photovoltaic system; otherwise, judging that the photovoltaic system is operating normally.
Preferably, the Adaboost fusion multi-classifier model comprises a plurality of sub-models which are trained and used as classifiers (the number of the sub-models can be changed according to the actual detection algorithm), the characteristic quantity is calculated according to the sampling data of the current detection signals of the photovoltaic system, the calculated characteristic quantity values are respectively input into the corresponding sub-models after training, the judgment result output by each sub-model is input into the Adaboost model after training for fusion calculation, and the high-level or low-level output for indicating the state of the photovoltaic system is obtained.
Preferably, the sub-model adopts a supervised learning mode or a semi-supervised learning mode.
Preferably, the Adaboost fusion multi-classifier model comprises a support vector machine model, a random forest model and a decision tree model.
Preferably, the kernel function type of the support vector machine model is the radial basis function, the parameter C is 38-44, and the gamma parameter of the kernel function is 2-3; the number of subtrees of the random forest model is 1000-1500; the maximum depth of the decision tree model is 40-60, and the number of leaf nodes is 2.
Preferably, in the training of the Adaboost fusion multi-classifier model, the iteration model adopted by the Adaboost is a neural network (NN), wherein the number of neurons is 128-256, the forgetting (dropout) rate is 0.15-0.4, the number of output neurons is 2, and the number of iterations is 500-1000.
Preferably, in the training of the Adaboost fusion multi-classifier model, feature values of the system output current signals under different arc-like and fault arc working conditions are calculated, the calculated feature values are used as learning samples of the Adaboost fusion multi-classifier model, and a training set and a testing set are generated from the learning samples.
Preferably, the training set and the testing set are obtained by adopting a k-fold cross validation method, and the value of k is 3-7.
Preferably, the data used for training the Adaboost fusion multi-classifier model is 1/2-2/3 of the sample capacity, the rest sample data is used for model detection, and the value of the sample capacity is 10000 ~ 200000000, so that the Adaboost model is fully trained in a short time.
Preferably, the value of K is 4-12.
The beneficial effects of the invention are as follows:
According to the photovoltaic system fault arc detection method using Adaboost fusion of multiple classifiers disclosed by the invention, fault arcs in the photovoltaic system can be identified more sensitively. In particular, when the change in the signal to be detected is not obvious, fault arcs that cannot be detected by a single classifier model can be detected effectively by fusing multiple classifiers.
According to the method, fault arcs in the photovoltaic system can be identified rapidly. Most classifier models need a relatively long input; by fusing multiple classifiers, the overall system can receive intermediate signals from the sub-models earlier, so that a possible fault arc is reflected more quickly and the tripping of the corresponding fault branch under fault arc working conditions is accelerated.
According to the method, fault arcs in the photovoltaic system can be identified accurately. Where many fault arcs resemble arc-like conditions, fusing multiple classifiers minimizes the possibility of misjudgment, effectively avoids the misjudgments that a single classifier model easily makes, ensures no misoperation under multiple arc-like working conditions, and improves the safe and stable operation capability of the direct current photovoltaic system.
The method effectively addresses the class imbalance problem, improves data processing efficiency, and improves the ability to identify complex fault arcs; it also adapts better to critical data and improves the anti-interference capability of the algorithm.
The beneficial effects above also indicate that the method can act reliably and quickly on multiple fault arc conditions.
In addition, the photovoltaic system fault arc detection method allows the model structure to be edited quickly without modifying the source code, and the sub-models to be combined into one machine learning model in a reasonable way, so that the overall model performs better. By changing the learning sample data, the method can be applied to direct current photovoltaic systems under different inverter loads to accurately identify fault arcs and arc-like working conditions.
The invention further achieves the technical effects that:
1) For direct current photovoltaic fault arc detection, supervised or semi-supervised learning is adopted, for example with three sub-models (support vector machine, random forest and decision tree) chosen according to the learning accuracy and convergence speed of the classifiers. Because various fault arcs may occur in an actual photovoltaic system and produce different detection signals, the Adaboost fusion of multiple classifiers greatly increases the accuracy of fault arc detection, addresses the failure to trip caused by unknown fault arc working conditions, and effectively guards against the safety threats that fault arcs pose to the operation of the direct current photovoltaic system and to persons and property.
2) The method needs no prior knowledge of the weak classifiers (support vector machine, random forest and decision tree); the classification precision of the resulting strong classifier depends on all the weak classifiers, and the method can markedly improve learning precision whether applied to simulated data or real data.
3) The method does not need to know in advance the upper limit of the error rate of the weak classifiers (support vector machine, random forest and decision tree) in the sub-models. The classification precision of the resulting composite model depends on the classification precision of all the weak classifiers, so the capability of the classifiers can be fully exploited, the assumed error rate can be adjusted adaptively according to feedback from the weak classifiers, and execution efficiency is high.
4) To distinguish fault arcs from arc-like working conditions, the criterion for cutting off a fault arc is that the Adaboost model outputs a high level in K consecutive periods; the value of K can be chosen so that fault arcs are cut off quickly while arc-like working conditions do not cause misoperation.
Drawings
Fig. 1 is a schematic block diagram of a fault arc detection method of a photovoltaic system.
Fig. 2a is a flowchart of the training of the Adaboost fusion multi-classifier model.
Fig. 2b is a flow chart of a method for detecting a fault arc of a photovoltaic system.
Fig. 3a shows the output current signal of a direct current photovoltaic system under a fault arc.
Fig. 3b shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the support vector machine model.
Fig. 3c shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the random forest model.
Fig. 3d shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the decision tree model.
Fig. 3e shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the Adaboost fusion multi-classifier model of the invention.
Fig. 4a shows the output current signal of a direct current photovoltaic system under an arc-like working condition.
Fig. 4b shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the support vector machine model.
Fig. 4c shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the random forest model.
Fig. 4d shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the decision tree model.
Fig. 4e shows the real-time system state judgment output signal for direct current photovoltaic system fault arc detection using the Adaboost fusion multi-classifier model of the invention.
Detailed Description
The invention will be described in further detail with reference to the drawings and examples. The examples are given solely for the purpose of illustration and are not intended to limit the scope of the invention.
Referring to fig. 1, the principle of the fault arc detection method of the photovoltaic system of the present invention is as follows. First, detection signals carrying fault arc characteristics (current signals are used here) of a direct current photovoltaic system are sampled in real time under different arc-like and fault arc working conditions. The time-domain mean, variance, skewness and kurtosis of the sampled signals are calculated, together with short-time Fourier transform and wavelet transform quantities, and the corresponding feature vector groups are extracted. The feature vector groups and the working condition labels are used as training and learning samples for the three sub-models (support vector machine, random forest and decision tree) and the Adaboost model. After the three sub-models and the Adaboost model have been trained (yielding the Adaboost fusion multi-classifier model), several fault arc features can be fused to give the correct state judgment (arc-like, fault arc or normal) for the system sampling signal in an input time window. When actually detecting whether a fault arc has occurred in a grid-connected photovoltaic system, it is only necessary to sample the system current in the time window to be identified in real time, perform the feature calculations to obtain the feature values, input the feature values into the trained Adaboost fusion multi-classifier model, and read the state from its output value. The Adaboost fusion multi-classifier model outputs, in real time, a 0/1 judgment of whether a fault arc has occurred in the direct current photovoltaic system: it outputs 1 (corresponding to a high level) when a fault arc is judged to have occurred, and 0 (corresponding to a low level) when the system is judged to be operating normally.
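As an illustration of the time-domain part of the feature calculation described above, the following minimal sketch computes the mean, variance, skewness and kurtosis of one time window. The function name `time_domain_features`, the use of population (biased) moments, non-excess kurtosis, and the handling of a flat window are all assumptions made for this example; the text does not specify them, and the short-time Fourier and wavelet features are not covered here.

```python
def time_domain_features(window):
    """Return mean, variance, skewness and kurtosis of one time window.

    Population (biased) moments are used; this is an assumed convention,
    since the text does not state which estimator is applied.
    """
    n = len(window)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    mean = sum(window) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in window) / n
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    if m2 == 0.0:
        # Assumed fallback: a perfectly flat window has no defined shape statistics.
        skewness, kurtosis = 0.0, 0.0
    else:
        skewness = m3 / m2 ** 1.5
        kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3.0 for a Gaussian)
    return {"mean": mean, "variance": m2, "skewness": skewness, "kurtosis": kurtosis}


# Example with made-up current samples (in amperes) for one short window.
features = time_domain_features([18.0, 18.2, 17.9, 18.1, 17.8])
```

In a full pipeline each window of sampled current would be reduced to such a feature vector before being passed to the sub-models.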
The trigger condition for the fault arc cut-off signal is evaluated only when the Adaboost fusion multi-classifier model outputs 1; otherwise the direct current photovoltaic system is considered to be operating normally and detection proceeds directly to the next time window, which helps the photovoltaic fault arc detection algorithm detect fault arcs faster. If the Adaboost fusion multi-classifier model outputs several consecutive 1s over the detection periods of several time windows, but a single low-level 0 is output before the preset number of consecutive 1 outputs is reached, the system state is considered to be caused by interference from an arc-like working condition rather than a real fault arc, and no cut-off signal is sent. When a high level is output throughout the given number of periods, the system is confirmed to have a fault arc, the trigger condition for cutting off the fault arc is met, and the detection algorithm sends a fault arc branch cut-off signal, preventing the direct current photovoltaic system from being damaged by the fault arc.
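The consecutive-window trigger logic described above can be sketched as a small counter. The class name `ArcTrigger`, its `update` method, and the choice to reset the counter after a cut-off signal are assumptions made for this illustration; K = 5 is the value used in the embodiment.

```python
class ArcTrigger:
    """Debounce the per-window 0/1 model output over K consecutive windows."""

    def __init__(self, k=5):
        if k < 1:
            raise ValueError("k must be a positive integer")
        self.k = k
        self.run_length = 0  # number of consecutive 1 outputs seen so far

    def update(self, model_output):
        """Feed one window's output; return True when a cut-off signal is due."""
        if model_output == 1:
            self.run_length += 1
        else:
            # A single 0 before K is reached is treated as an arc-like disturbance.
            self.run_length = 0
        if self.run_length >= self.k:
            self.run_length = 0  # assumed behaviour: re-arm after signalling
            return True
        return False


# Three 1s interrupted by a 0 do not trigger; the following five 1s do.
trigger = ArcTrigger(k=5)
outputs = [1, 1, 1, 0, 1, 1, 1, 1, 1]
signals = [trigger.update(y) for y in outputs]
```

A larger K suppresses more arc-like disturbances at the cost of a longer delay before the cut-off signal, which is the trade-off the choice of K balances.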
Referring to fig. 2a, the Adaboost fusion multi-classifier model adopts a fusion learning method. After learning from samples, it captures the statistical regularities and core features, reflected at the feature level, that distinguish direct current photovoltaic system fault arcs from arc-like conditions, and it can then be used to identify fault arcs in a grid-connected photovoltaic system.
Firstly, collecting required direct current photovoltaic system output current signals, carrying out a series of feature quantity calculation, obtaining a plurality of feature quantity values, taking the feature quantity values as a learning sample of an Adaboost fusion multi-classifier model, and generating a training set and a test set by utilizing the learning sample (namely, dividing the learning sample into slice sets firstly, dividing each slice set into subsets to form a subset array, and selecting the training set and the test set row by row in the subset array). The specific training and testing process of the Adaboost fusion multi-classifier model is as follows:
1) Carrying out feature calculation on the sampled current data of the photovoltaic system, and dividing the feature calculation results evenly into k slice sets according to the number of samples;
2) Dividing each slice set into k subsets according to the number of samples, forming a k × k subset array; in each row of the subset array, the subset whose column number equals the row number is selected as the test set, and the remaining subsets of that row form the training set;
3) Selecting three algorithms (support vector machine, random forest and decision tree), and training each in turn on the training set of each row of the subset array to obtain 3 × k models; inputting the test set of each row of the subset array into the corresponding models, and concatenating the test results of the same algorithm into one column, giving three columns of test results;
4) Inputting the three columns of test results into the Adaboost model for training and learning, and obtaining the influence of each of the three algorithms on the system state judgment according to the accuracy of the classifiers.
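Steps 1) and 2) above can be sketched as index bookkeeping. In this minimal example the helper names `build_subset_array` and `row_train_test` are hypothetical, and the samples are split contiguously without shuffling, which is an assumption; the text only says the division is even by sample count.

```python
def build_subset_array(samples, k):
    """Split samples into k slice sets, then each slice set into k subsets."""

    def split_evenly(items, parts):
        # Distribute any remainder over the first few parts.
        size, extra = divmod(len(items), parts)
        out, start = [], 0
        for i in range(parts):
            end = start + size + (1 if i < extra else 0)
            out.append(items[start:end])
            start = end
        return out

    slice_sets = split_evenly(list(samples), k)
    # Row i of the result holds the k subsets of slice set i.
    return [split_evenly(slice_set, k) for slice_set in slice_sets]


def row_train_test(subset_array, row):
    """For a given row, the diagonal subset is the test set; the rest are training data."""
    test = subset_array[row][row]
    train = [
        x
        for col, subset in enumerate(subset_array[row])
        if col != row
        for x in subset
    ]
    return train, test


# 18 placeholder sample indices, k = 3: each row has 6 samples in 3 subsets of 2.
array = build_subset_array(range(18), k=3)
train_1, test_1 = row_train_test(array, row=1)
```

Each of the three algorithms would then be trained once per row on the training part, giving the 3 × k models of step 3).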
In the training process of the Adaboost fusion multi-classifier model, the slice sets are obtained by k-fold cross validation: the data set is divided into k folds by sampling without replacement, one fold is selected as the test set and the remaining k-1 folds as the training set, and this is repeated k times with a different training set each time. On this basis, model tuning analysis is used to find the hyperparameter values that give the best model generalization performance.
In order to put the Adaboost fusion multi-classifier model into use as soon as possible, its training must be fast. A training precision standard is specified, and training continues until the model distinguishes the fault state from the normal state with that precision. If the states still cannot be distinguished after repeated training, the initialization parameters are set so that training finishes at an acceptable training precision. For the multi-feature-value sample set obtained under fault arc and arc-like working conditions of the grid-connected photovoltaic system, the sample capacity is 15000; the data used for learning by the Adaboost fusion multi-classifier model is 1/2-2/3 of the sample capacity, and the remaining sample data is used for model testing, so that the detection effect of the proposed photovoltaic system fault arc detection algorithm can be clearly assessed.
The optimal hyperparameters of the Adaboost fusion multi-classifier model are as follows: the kernel function type of the support vector machine model is the radial basis function, the parameter C is set to 42.2243, and the gamma parameter of the kernel function is 2.639; the number of subtrees of the random forest model is 1000; the maximum depth of the decision tree model is 50, and the number of leaf nodes is 2; the iteration model adopted by Adaboost is a neural network (NN), wherein the number of neurons is 128, the forgetting (dropout) rate is 0.23, the number of output neurons is 2, and the number of iterations is 500. With these settings the model has higher accuracy and a faster convergence rate.
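For reference, the embodiment's hyperparameters can be collected in a plain configuration and checked against the preferred ranges stated earlier in the disclosure. The dictionary layout, the key names (for example `n_trees` and `forgetting_rate`) and the helper `out_of_range` are hypothetical choices for this sketch; mapping them onto a particular machine learning library's parameter names is left open.

```python
# Values taken from the embodiment; key names are illustrative only.
EMBODIMENT_HYPERPARAMETERS = {
    "svm": {"kernel": "rbf", "C": 42.2243, "gamma": 2.639},
    "random_forest": {"n_trees": 1000},
    "decision_tree": {"max_depth": 50, "leaf_nodes": 2},
    "adaboost_nn": {
        "neurons": 128,
        "forgetting_rate": 0.23,
        "output_neurons": 2,
        "iterations": 500,
    },
}

# Preferred ranges as stated in the disclosure: (low, high), inclusive.
PREFERRED_RANGES = {
    ("svm", "C"): (38, 44),
    ("svm", "gamma"): (2, 3),
    ("random_forest", "n_trees"): (1000, 1500),
    ("decision_tree", "max_depth"): (40, 60),
    ("adaboost_nn", "neurons"): (128, 256),
    ("adaboost_nn", "forgetting_rate"): (0.15, 0.4),
    ("adaboost_nn", "iterations"): (500, 1000),
}


def out_of_range(config):
    """Return the (model, parameter) pairs whose values fall outside the preferred ranges."""
    return [
        (model, name)
        for (model, name), (low, high) in PREFERRED_RANGES.items()
        if not low <= config[model][name] <= high
    ]


# The embodiment's values are expected to lie inside every preferred range.
violations = out_of_range(EMBODIMENT_HYPERPARAMETERS)
```

Keeping the values in data rather than in code is one way to allow the model structure to be edited without touching the source, in the spirit of the disclosure.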
Referring to fig. 2b, the steps of the fault arc detection method of the direct current photovoltaic system with the Adaboost fusion multi-classifier in the invention are specifically described:
Step one, parameter setting, which includes setting the sampling frequency f at which the detection device samples the current signal (for example, 0.5-3 MHz), the number N of points in a time window (for example, 4000-12000), the fault arc trigger condition, the feature calculations applied to the collected signal, and the like. While the direct current photovoltaic system is running, the output current signal of the grid-connected photovoltaic system is sampled point by point at the frequency f, the current signal is analyzed with the time window length set to T_s, the feature quantities of the current sampling signal within one analysis period are calculated to obtain the required feature values, and the method proceeds to step two to detect the signal.
Step two, the feature quantities are evaluated by the support vector machine, random forest and decision tree models, the sub-model classification results are output and combined into a three-dimensional array, and the method proceeds to step three to fuse the three-dimensional array.
Step three, the three-dimensional array is input into the trained Adaboost model and fused by machine learning, and whether a fault arc exists is judged from the output value of the trained Adaboost model: an output of 0 indicates that the direct current photovoltaic system is in a normal operating state in this time window, and an output of 1 indicates that a fault arc may have occurred in the direct current photovoltaic system in this time window; the method then proceeds to step four for the specific judgment.
Step four, the operating state of the direct current photovoltaic system at this moment is judged preliminarily from the output value of the trained Adaboost model. If 0 is output, the direct current photovoltaic system is judged to be in a normal operating state in this time window, and the method returns to step one to detect the state of the output current signal in the next time window. If 1 is output, a fault arc may have occurred in this time window, and whether a fault arc has actually occurred is judged by the following criterion: whether the number of periods with consecutive 1 outputs has reached the trigger standard for cutting off the fault arc. If the trigger standard is reached (namely, 1 is output in K consecutive time windows, with K being 5), a fault arc is determined to have occurred in the direct current photovoltaic system and a fault arc branch cut-off signal is sent. If the trigger standard is not reached, the insufficient number of consecutive 1 outputs is judged to have been produced by an arc-like working condition of the direct current photovoltaic system, and the method returns to step one to detect the state of the output current signal in the next time window.
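Steps two and three above combine three 0/1 sub-model outputs into a single 0/1 decision. The sketch below is a simplified stand-in for that fusion, not the trained fusion model of the invention: it uses the textbook AdaBoost vote weight, alpha = 0.5 · ln((1 − e) / e), with fixed, made-up error rates for the three sub-models, and performs no sample reweighting or iterative training. The function names and the error-rate values are assumptions for illustration only.

```python
import math


def classifier_weight(error_rate):
    """Textbook AdaBoost vote weight: alpha = 0.5 * ln((1 - e) / e)."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def fuse(sub_outputs, error_rates):
    """Fuse the 0/1 outputs of the sub-models into one 0/1 decision."""
    score = 0.0
    for output, error in zip(sub_outputs, error_rates):
        vote = 1 if output == 1 else -1  # map {0, 1} to {-1, +1}
        score += classifier_weight(error) * vote
    return 1 if score > 0 else 0


# Hypothetical error rates for (SVM, random forest, decision tree).
ERROR_RATES = (0.05, 0.25, 0.30)

# The most accurate sub-model outweighs the other two combined in this example.
decision_a = fuse((1, 0, 0), ERROR_RATES)  # weighted score is positive -> 1
decision_b = fuse((0, 1, 1), ERROR_RATES)  # weighted score is negative -> 0
```

With these weights a sub-model with a lower error rate has more influence on the fused result, which matches the stated idea that each algorithm's influence is derived from classifier accuracy.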
The detection model provided by the invention (the Adaboost fusion multi-classifier model) has a stronger fault arc identification capability, avoids misoperation of the direct current fault arc detection device caused by accidental factors, and reduces the losses caused by erroneous branch cut-off due to model misjudgment. The fault arc detection method is applied to a direct current photovoltaic system, and its identification of fault arcs and arc-like conditions is shown below.
As shown in fig. 3a, the dc photovoltaic system output current detection signal is obtained at a sampling frequency f=200 kHz. Before 5.4s, the direct current photovoltaic system is in a normal operation state, and the output current of the direct current photovoltaic system is 18A. After 5.4s, the direct current photovoltaic system fails, and the output current of the corresponding direct current photovoltaic system rapidly drops.
As shown in fig. 4a, the dc photovoltaic system output current detection signal is obtained at a sampling frequency f=200 kHz. Before 1.1s, the direct current photovoltaic system is not started, and the current is zero. After the system starts, the current rises in steps once every 4.7s with a current increment of 2.7A, and the normal working current output by the direct current photovoltaic system is 18A.
After feature calculation, the support vector machine, random forest and decision tree models were each applied to judge the system state in real time, and the results (figs. 3b-3d and figs. 4b-4d) show that each model misjudges in its real-time judgment of the fault arc and the arc-like condition. When the real-time judgment results of the three models are input into the trained Adaboost model, the resulting real-time judgment gives a correct low-level indication for normal working current and a correct high-level indication for all fault-state current signals, as shown in figs. 3e and 4e. The results in figs. 3e and 4e show that the fault arc detection method of the photovoltaic system gives a correct low-level indication for normal start-up current and a correct high-level indication for all fault-state current signals, so that fault arcs and arc-like working conditions in the direct current photovoltaic system are accurately distinguished, misoperation of the direct current fault arc detection device caused by accidental factors is avoided, and losses caused by erroneous branch cut-off due to model misjudgment are reduced.
In summary, the fault arc identification results and the arc-like working condition identification results obtained in the direct current photovoltaic system together show that the detection method can accurately distinguish fault arcs from various arc-like working conditions in the direct current photovoltaic system.