CN112818912A - Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting - Google Patents
- Publication number
- CN112818912A (application CN202110204997.2A)
- Authority
- CN
- China
- Prior art keywords
- early warning
- signal
- electric field
- component
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Abstract
The invention discloses a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost). The method decomposes the electric field signal observed by an atmospheric electric field instrument using EEMD, calculates the sample entropy of the original data and of each intrinsic mode function, reconstructs the modes into random, detail and trend components according to their sample entropy, extracts statistical and autoencoder features from each reconstructed component, establishes an early warning model with the XGBoost algorithm, and fuses the classifiers of the components. The invention can effectively improve the early warning effect and reduce the false alarm rate.
Description
Technical Field
The invention relates to the field of meteorological lightning early warning, and in particular to a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost).
Background
Lightning is a natural phenomenon accompanied by strong electrical discharge. The high voltage, large current and strong electromagnetic radiation it produces damage communication and power supply systems, seriously threaten human life, and cause heavy economic losses. The charge carried by a thundercloud generates a strong atmospheric electric field at the ground, so the formation, development and dissipation of the thundercloud can be inferred from changes in the ground atmospheric electric field. Observing and analyzing the atmospheric electric field is therefore an effective way to improve lightning early warning.
The atmospheric electric field instrument measures the average atmospheric electric field at the ground. The atmospheric electric field is the physical quantity that most directly reflects charge changes in the thundercloud, so observing and analyzing atmospheric electric field signals in disturbed weather directly helps to raise the probability of successful lightning early warning.
Existing lightning early warning technology generally issues a warning when the amplitude of the atmospheric electric field exceeds a threshold. This method is simple to implement but ignores the inherent physical characteristics of the atmospheric electric field signal, so its probability of successful warning is low and its false alarm rate is high. In recent years, researchers in China and abroad have begun to study in depth the relationship between atmospheric electric field signal characteristics and lightning, but for lack of an effective atmospheric electric field data processing algorithm the warning lead time is extremely short and the features of the atmospheric electric field are insufficiently mined.
Disclosure of Invention
Purpose of the invention: in view of the defects of the prior art, the invention aims to provide a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost) that can effectively improve the early warning effect and reduce the false alarm rate.
Technical scheme: the lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting of the invention comprises the following steps:
(1) collecting and processing atmospheric electric field signal data: the ground atmospheric electric field is collected with an atmospheric electric field instrument and the data are calibrated to obtain the atmospheric electric field signal $x(n)$;
(2) decomposing and reconstructing the atmospheric electric field signal: the signal is decomposed into modal components of different scales that contain different local characteristic information;
(3) sample entropy analysis: the sample entropy of each IMF component is calculated, the IMF components are divided into random, detail and trend classes according to their entropy, and the signals of each class are summed to obtain the corresponding reconstructed component;
(4) feature extraction: feature values of the reconstructed components are extracted using statistical and deep learning methods;
(5) constructing an early warning model: the extracted features are normalized and used as inputs for training; during training, the model parameters are tuned by grid search to improve the accuracy of the model;
(6) lightning early warning: the test set is decomposed in the same way, the model obtained in step (5) predicts each component, and the component predictions are fused to give an early warning signal.
Based on the nonlinear, non-stationary character of the atmospheric electric field signal, the invention provides a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost). The method decomposes the data collected by an atmospheric electric field instrument with the EEMD algorithm, calculates the sample entropy of the original data and of each modal function, reconstructs the modes into random, detail and trend components, extracts statistical and autoencoder features from each reconstructed component, establishes an early warning model with the XGBoost algorithm, fuses the classifiers of the components, and gives an early warning signal.
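Step (5) normalizes the features and tunes the model parameters by grid search. The text gives neither the normalization formula nor the parameters searched, so the following is only a minimal sketch under assumptions: min-max scaling, an exhaustive search over a small grid, a placeholder `evaluate` callback standing in for model training and validation, and illustrative parameter names `max_depth` and `learning_rate`.

```python
from itertools import product


def min_max_normalize(rows):
    """Scale every feature column of `rows` to [0, 1] (assumed normalization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def grid_search(param_grid, evaluate):
    """Try every parameter combination and keep the highest-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # hypothetical: train and validate a model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy data and a toy score that peaks at max_depth=4, learning_rate=0.1.
features = min_max_normalize([[1.0, 10.0], [3.0, 20.0], [2.0, 30.0]])
grid = {"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}
best_params, best_score = grid_search(
    grid,
    lambda p: -((p["max_depth"] - 4) ** 2) - (p["learning_rate"] - 0.1) ** 2,
)
```

In practice `evaluate` would train one component model with the given parameters and return a validation score.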
In step (2), the decomposition and reconstruction of the atmospheric electric field signal comprises the following steps:
(2.1) let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$;
P different normally distributed Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$; $n = 1, 2, \ldots, L$, are selected;
the original atmospheric electric field signal and the noise are superposed to obtain: $x_p(n) = x(n) + \omega_p(n)$, $p = 1, \ldots, P$;
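The noise superposition of step (2.1) can be sketched as follows. This is an illustration only: the noise standard deviation `noise_std`, the seed and the toy signal are assumptions, and the noise sequences are drawn as independent realizations of one zero-mean Gaussian distribution.

```python
import random


def add_white_noise(x, num_trials, noise_std, seed=0):
    """Return num_trials noisy copies x_p(n) = x(n) + w_p(n)."""
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, noise_std) for value in x]
        for _ in range(num_trials)
    ]


x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]  # toy electric field samples
trials = add_white_noise(x, num_trials=4, noise_std=0.1)
```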
(2.2) all local maxima of the signal $x_p(n)$ are connected by a curve, defined as the upper envelope $f_p^{\max}(n)$; all local minima are connected by a curve, defined as the lower envelope $f_p^{\min}(n)$; the mean of the upper and lower envelopes is $m_p(n)$, so that:
$m_p(n) = [f_p^{\max}(n) + f_p^{\min}(n)]/2$;
(2.3) the envelope mean $m_p(n)$ is subtracted from the signal $x_p(n)$: $c_{p1}(n) = x_p(n) - m_p(n)$;
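One pass of steps (2.2) and (2.3) might look like the sketch below. The text only says the extrema are connected "by a curve"; this sketch assumes piecewise-linear interpolation, held constant beyond the outermost extrema, which is a simplification (EMD implementations commonly use cubic splines).

```python
def local_extrema(x):
    """Indices of interior local maxima and minima of the sequence x."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx (assumption)."""
    env = []
    for n in range(len(x)):
        if n <= idx[0]:
            env.append(x[idx[0]])       # hold first extremum to the left
        elif n >= idx[-1]:
            env.append(x[idx[-1]])      # hold last extremum to the right
        else:
            j = max(k for k in range(len(idx)) if idx[k] <= n)
            left, right = idx[j], idx[j + 1]
            t = (n - left) / (right - left)
            env.append((1 - t) * x[left] + t * x[right])
    return env


def sift_once(x):
    """Return (candidate, mean): c_p1(n) = x_p(n) - m_p(n) and m_p(n)."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]
    return [xi - mi for xi, mi in zip(x, mean)], mean


# Toy signal: an oscillation riding on a constant offset of 1.
signal = [1.0, 3.0, 1.0, -1.0, 1.0, 3.0, 1.0, -1.0, 1.0]
candidate, mean = sift_once(signal)
```

For this toy signal the envelope mean recovers the offset, and the candidate is the oscillation about zero.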
(2.4) if the first $c_{p1}(n)$ obtained does not satisfy the requirements of an intrinsic mode function (IMF), it is treated as a new signal $x_p(n)$ and steps (2.2) and (2.3) are repeated until $c_{p1}(n)$ is an IMF component;
after the first IMF component $c_{p1}(n)$ is obtained, it is subtracted from the signal $x_p(n)$ to obtain the residual signal $x_{p1}(n)$; the residual signal is treated as a new signal $x_p(n)$ and the above steps are repeated to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$; the signal $x_p(n)$ is finally decomposed into K IMF components $c_{pk}(n)$ and a residual $r_p(n)$, namely:
$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$;
(2.5) each of the P sequences $\{x_p(n)\}$ obtained after white noise superposition is decomposed; the k-th IMF component obtained from the p-th signal is $c_{pk}(n)$ and the residual is $r_p(n)$;
the IMF components and residuals of all signals are ensemble-averaged, and the averages are taken as the k-th order IMF component $c_k(n)$ and the residual $r(n)$ of the original signal $x(n)$: $c_k(n) = \frac{1}{P}\sum_{p=1}^{P} c_{pk}(n)$, $r(n) = \frac{1}{P}\sum_{p=1}^{P} r_p(n)$;
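The ensemble averaging in step (2.5) reduces to an element-wise mean over the P noisy trials. A minimal sketch follows, assuming every trial yields the same number K of IMFs (real decompositions may not, and would then need alignment); the nested-list layout `imfs_per_trial[p][k][n]` is a hypothetical choice for this illustration.

```python
def ensemble_average(imfs_per_trial):
    """Average c_pk(n) over the trials p to obtain c_k(n).

    imfs_per_trial[p][k][n] holds the k-th IMF of the p-th noisy signal.
    """
    num_trials = len(imfs_per_trial)
    num_imfs = len(imfs_per_trial[0])
    length = len(imfs_per_trial[0][0])
    return [
        [sum(imfs_per_trial[p][k][n] for p in range(num_trials)) / num_trials
         for n in range(length)]
        for k in range(num_imfs)
    ]


# Two toy trials, each with two IMFs of length three.
trial_a = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
trial_b = [[3.0, 2.0, 1.0], [2.0, 2.0, 2.0]]
averaged = ensemble_average([trial_a, trial_b])
```

The residuals $r_p(n)$ can be averaged with the same function by passing each one as a single-row list.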
In step (3), the sample entropy analysis comprises the following steps:
(3.1) starting from the i-th element of the k-th order IMF $c_k(n)$, $\tau$ consecutive elements are taken to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$;
for $i = 1, 2, \ldots, L-\tau+1$, the vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$ form the rows of an $(L-\tau+1) \times \tau$ matrix $C$:
$C = [C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)]^{T}$;
the distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between corresponding elements:
$d[C_\tau(i), C_\tau(j)] = \max_{q=0,\ldots,\tau-1} |c_k(i+q) - c_k(j+q)|$, $1 \le j \le L-\tau+1$, $j \ne i$;
(3.2) given a real number $r$, for each $i$ the number of distances $d[C_\tau(i), C_\tau(j)]$ less than or equal to $r$ is counted, and its ratio to the total number of distances $L-\tau$ is recorded as $B_i^{\tau}(r)$:
$B_i^{\tau}(r) = \frac{1}{L-\tau}\,\mathrm{num}\{ d[C_\tau(i), C_\tau(j)] \le r \}$;
the average of $B_i^{\tau}(r)$ over all $i$ ($1 \le i \le L-\tau+1$) is recorded as $A^{\tau}(r)$:
$A^{\tau}(r) = \frac{1}{L-\tau+1} \sum_{i=1}^{L-\tau+1} B_i^{\tau}(r)$;
starting from the i-th element of the k-th order IMF signal $c_k(n)$, $\tau+1$ consecutive values are taken to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$;
$A^{\tau+1}(r)$ is calculated in the same way; the sample entropy of the electric field signal $c_k(n)$ can then be estimated as:
$H_k = -\ln\left[ A^{\tau+1}(r) / A^{\tau}(r) \right]$;
the parameters are $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, where $\mathrm{std}(c_k(n))$ is the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ can be calculated in the same way;
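The sample entropy estimate described above can be sketched directly from its definition, with τ = 2 and r = 0.2 times the standard deviation as stated. The helper names, the use of the population standard deviation and the two test signals are assumptions of this illustration; the brute-force pairwise comparison is quadratic in the signal length and is meant for clarity, not speed.

```python
import math
import random
import statistics


def match_rate(c, m, r):
    """Average fraction of other length-m templates within distance r."""
    vectors = [c[i:i + m] for i in range(len(c) - m + 1)]
    n_vec = len(vectors)
    total = 0.0
    for i in range(n_vec):
        count = sum(
            1
            for j in range(n_vec)
            if j != i
            and max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) <= r
        )
        total += count / (n_vec - 1)
    return total / n_vec


def sample_entropy(c, tau=2, r_factor=0.2):
    """Estimate -ln(A^{tau+1}(r) / A^{tau}(r)) with r = r_factor * std(c)."""
    r = r_factor * statistics.pstdev(c)
    return -math.log(match_rate(c, tau + 1, r) / match_rate(c, tau, r))


rng = random.Random(0)
periodic = [math.sin(2 * math.pi * n / 10) for n in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
h_periodic = sample_entropy(periodic)
h_noise = sample_entropy(noise)
```

A regular signal gives a value near zero and white noise a much larger one, which is the property the later classification into trend, detail and random components relies on.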
(3.4) all IMFs are divided into three classes according to their sample entropy $H_k$ and summed into a random component $u(n)$, a detail component $v(n)$ and a trend component $w(n)$, where $\sigma_1 > \sigma_2$ are threshold coefficients applied to the sample entropy $H$ of the original signal:
$u(n) = \sum c_k(n)$, where $H_k > \sigma_1 H$;
$v(n) = \sum c_k(n)$, where $\sigma_2 H \le H_k \le \sigma_1 H$;
$w(n) = \sum c_k(n) + r(n)$, where $H_k < \sigma_2 H$.
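The grouping of IMFs by sample entropy can be sketched as below. It assumes the high-entropy IMFs form the random component and the low-entropy IMFs plus the residual form the trend, with everything in between assigned to the detail component; the threshold values 0.9 and 0.3 and the toy entropies are hypothetical, since the text does not give numeric thresholds.

```python
def reconstruct(imfs, residual, entropies, h_signal, sigma1, sigma2):
    """Sum IMFs into random (u), detail (v) and trend (w) components."""
    length = len(residual)
    u = [0.0] * length
    v = [0.0] * length
    w = list(residual)  # the residual always belongs to the trend
    for imf, h in zip(imfs, entropies):
        if h > sigma1 * h_signal:
            target = u
        elif h < sigma2 * h_signal:
            target = w
        else:
            target = v
        for n in range(length):
            target[n] += imf[n]
    return u, v, w


imfs = [
    [1.0, -1.0, 1.0, -1.0],   # fast oscillation, high entropy
    [0.5, 0.5, -0.5, -0.5],   # slower oscillation
    [0.1, 0.2, 0.3, 0.4],     # slow drift, low entropy
]
residual = [1.0, 1.0, 1.0, 1.0]
u, v, w = reconstruct(imfs, residual, entropies=[2.0, 0.6, 0.1],
                      h_signal=1.0, sigma1=0.9, sigma2=0.3)
```

By construction the three components sum back to the original signal, so no information is discarded by the grouping.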
In step (4), the feature extraction comprises the following steps:
(4.1) statistics such as the peak-to-peak value, energy, variance and difference of the random, detail and trend components are extracted; the statistics for the i-th atmospheric electric field sample can be expressed as:
(4.2) features are extracted with a deep autoencoder; each reconstructed component is taken as input, encoded by the encoder and decoded by the decoder so that the decoded data match the input as closely as possible, and the encoder output is finally taken as a new feature.
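The statistical features named in step (4.1) can be illustrated as follows. The exact definitions are not given in the text, so this sketch assumes energy as the sum of squares, the population variance, and "difference" as the mean absolute first difference; the dictionary keys are hypothetical names.

```python
def statistical_features(component):
    """Compute simple statistics of one reconstructed component."""
    n = len(component)
    mean = sum(component) / n
    return {
        "peak_to_peak": max(component) - min(component),
        "energy": sum(value * value for value in component),
        "variance": sum((value - mean) ** 2 for value in component) / n,
        # Assumed meaning of "difference": mean absolute first difference.
        "mean_abs_difference": sum(
            abs(b - a) for a, b in zip(component, component[1:])
        ) / (n - 1),
    }


stats = statistical_features([1.0, 3.0, 2.0, 4.0])
```

The same function would be applied to each of the random, detail and trend components to build the three feature sets.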
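In step (5), the construction of the early warning model comprises the following steps:
(5.1) the P atmospheric electric field sample sequences $x_i(n)$, $n = 1, 2, \ldots, L$, $i = 1, 2, \ldots, P$, are each reconstructed into the sequences $u_i(n)$, $v_i(n)$, $w_i(n)$; the features of the reconstructed sequences are extracted and recorded as the sets $\{U_i\}$, $\{V_i\}$, $\{W_i\}$;
(a) for the random component, an initial boosting tree $f_0(U_i) = 0$ is first defined;
(b) the output $f_1(U_i)$ for the i-th sample is obtained by minimizing the objective function $obj^{(1)}$, and the current prediction is expressed as:
$\hat{y}_i^{(1)} = f_0(U_i) + f_1(U_i)$;
(c) after T rounds of iteration the final prediction is obtained:
$\hat{y}_i^{(T)} = \sum_{t=1}^{T} f_t(U_i)$;
the objective function in round t is:
$obj^{(t)} = \sum_{i=1}^{P} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(U_i)\right) + \Omega(f_t)$,
where $y_i$ is the label of the i-th sample, $l$ is the loss function, $\Omega(f_t) = \gamma R + \frac{1}{2}\lambda \sum_{j=1}^{R} \omega_j^2$ is the regularization term, R is the number of leaf nodes of the current boosting tree, $\omega_j$ is the score of each leaf node, and $\lambda$ and $\gamma$ are regularization parameters;
the goal of each iteration is to minimize the objective function, which after a second-order Taylor expansion of the loss becomes:
$obj^{(t)} \approx \sum_{i=1}^{P} \left[ g_i f_t(U_i) + \frac{1}{2} h_i f_t^2(U_i) \right] + \Omega(f_t)$,
where $g_i$ and $h_i$ are the first and second derivatives of $l(y_i, \hat{y}_i^{(t-1)})$ with respect to $\hat{y}_i^{(t-1)}$; since each sample falls in exactly one leaf node, the loss can be understood as the sum of the losses of the leaf nodes, i.e.
$obj^{(t)} \approx \sum_{j=1}^{R} \left[ \left(\sum_{i \in I_j} g_i\right) \omega_j + \frac{1}{2} \left(\sum_{i \in I_j} h_i + \lambda\right) \omega_j^2 \right] + \gamma R$;
$I_j$ denotes the set of samples falling in leaf node j; the score of leaf node j that minimizes the objective is given by:
$\omega_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$;
in this case the objective function is further simplified to
$obj^{(t)} = -\frac{1}{2} \sum_{j=1}^{R} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma R$;
all features are traversed as split candidates to obtain the minimum loss function value and determine the tree function $f_t(U_i)$;
the early warning models for the detail component and the trend component are constructed in the same way;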
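The leaf score and split evaluation used when growing one boosting tree can be illustrated with the standard gradient boosting formulas: leaf score $-G/(H+\lambda)$ and split gain $\frac{1}{2}[G_L^2/(H_L+\lambda) + G_R^2/(H_R+\lambda) - (G_L+G_R)^2/(H_L+H_R+\lambda)] - \gamma$, where G and H are sums of first and second loss derivatives. This is a sketch under assumptions, not the XGBoost library itself: it assumes the logarithmic loss on a raw score, and the values of `lam` and `gamma` and the toy labels are illustrative.

```python
import math


def logloss_grad_hess(y, raw_score):
    """First and second derivative of the log loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - y, p * (1.0 - p)


def leaf_weight(g_sum, h_sum, lam):
    """Optimal leaf score -G / (H + lambda)."""
    return -g_sum / (h_sum + lam)


def split_gain(left, right, lam, gamma):
    """Objective reduction from splitting a leaf into `left` and `right`.

    left and right are lists of (g, h) pairs for the samples on each side.
    """
    g_l = sum(g for g, _ in left)
    h_l = sum(h for _, h in left)
    g_r = sum(g for g, _ in right)
    h_r = sum(h for _, h in right)

    def score(g, h):
        return g * g / (h + lam)

    return 0.5 * (score(g_l, h_l) + score(g_r, h_r)
                  - score(g_l + g_r, h_l + h_r)) - gamma


# First round: f_0 = 0 for every sample, labels 1 = lightning, 0 = none.
labels = [1, 1, 0, 0]
stats = [logloss_grad_hess(y, 0.0) for y in labels]
gain = split_gain(stats[:2], stats[2:], lam=1.0, gamma=0.1)
w_left = leaf_weight(sum(g for g, _ in stats[:2]),
                     sum(h for _, h in stats[:2]), lam=1.0)
```

A candidate split is kept only when its gain is positive, which is how the γ term discourages extra leaves.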
(5.2) a multi-classifier fusion method is adopted; first the confusion matrices of the different classifiers are computed:
$CM^{(k)} = [n_{ij}^{(k)}]_{M \times M}$,
where $k = 1, \ldots, K$, K denotes the number of classifiers, and $n_{ij}^{(k)}$ denotes the number of samples from class i that classifier k judges as class j ($i, j = 1, \ldots, M$; M is the number of sample classes); the final decision function is:
where $e_k$ denotes the k-th classifier.
(5.3) the following cost function is defined to measure the overall performance of the classifier:
$L = m_1(1 - OA) + m_2(1 - POD) + m_3 FAR$
where OA is the overall accuracy, POD is the probability of detection, FAR is the false alarm rate, and $m_1$, $m_2$, $m_3$ are the cost coefficients of three types of error: overall recognition error, missed alarm, and false alarm;
the cost ratio of the overall recognition error to the early warning errors (missed alarms and false alarms) of the classifier is set to $1:\alpha$, and the cost ratio of missed alarms to false alarms is set to $1:\beta$, namely
$m_1/(m_2 + m_3) = 1/\alpha$, $m_2/m_3 = 1/\beta$
so that $m_2 = \alpha m_1/(1+\beta)$ and $m_3 = \alpha\beta m_1/(1+\beta)$, and the cost function is further expressed as:
$L = m_1\left[(1 - OA) + \frac{\alpha}{1+\beta}(1 - POD) + \frac{\alpha\beta}{1+\beta} FAR\right]$;
the values of $\alpha$ and $\beta$ are adjusted to meet the requirements of different applications.
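The ratios $m_1/(m_2+m_3) = 1/\alpha$ and $m_2/m_3 = 1/\beta$ fix $m_2$ and $m_3$ once $m_1$ is chosen: $m_2 = \alpha m_1/(1+\beta)$ and $m_3 = \alpha\beta m_1/(1+\beta)$. The sketch below evaluates the cost under that derivation, assuming $m_1 = 1$ as a normalization; the scores passed in are made-up numbers.

```python
def warning_cost(oa, pod, far, alpha, beta, m1=1.0):
    """Cost L = m1(1 - OA) + m2(1 - POD) + m3 * FAR with ratio-derived weights."""
    m2 = alpha * m1 / (1.0 + beta)          # missed-alarm weight
    m3 = alpha * beta * m1 / (1.0 + beta)   # false-alarm weight
    return m1 * (1.0 - oa) + m2 * (1.0 - pod) + m3 * far


# alpha = 2, beta = 1 gives m2 = m3 = 1 when m1 = 1.
cost_balanced = warning_cost(oa=0.9, pod=0.8, far=0.1, alpha=2.0, beta=1.0)
# A larger beta shifts weight from missed alarms toward false alarms.
cost_far_heavy = warning_cost(oa=0.9, pod=0.8, far=0.1, alpha=2.0, beta=3.0)
```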
The invention preprocesses the atmospheric electric field signal with the adaptive EEMD decomposition algorithm and can capture in detail the changes of the signal at different time scales; the XGBoost learning algorithm requires relatively few samples, and the multi-model prediction built on the EEMD reconstructed components compensates for the limitations of a single model in signal prediction. Testing verifies that the model has a good early warning effect.
Advantageous effects:
(1) Traditional lightning early warning methods ignore the oscillation scale characteristics of the atmospheric electric field signal, so their probability of detection is low; combining a non-stationary signal processing algorithm with machine learning is a reliable and effective approach, and the invention provides a new idea and method for lightning early warning.
(2) The early warning performance of models built with ordinary voting, multi-classifier fusion and single-component features was compared for different values of $\beta$. Compared with the single-model XGBoost method, the multi-classifier fusion method improves the probability of detection by 9.6% to 14.3% and reduces the false alarm rate by 11.1% to 16.7%; compared with the ordinary voting method, the probability of detection improves by up to 4.8% and the false alarm rate falls by 5.2% to 6.4%. Multi-classifier fusion therefore effectively improves early warning performance.
Drawings
FIG. 1 shows the atmospheric electric field instrument of the present invention, wherein (a) is a photograph of the sensor, (b) is a schematic diagram of the sensor structure, and (c) is an overall view of the instrument;
FIG. 2 is a flow chart of the algorithm of the present invention;
FIG. 3 is a flow chart of the EEMD decomposition;
FIG. 4 shows the original signal and the reconstructed components;
FIG. 5 shows the results of the training set experiment, wherein (a) is the overall accuracy, (b) is the probability of detection, and (c) is the false alarm rate.
Detailed Description
The present invention will be described in further detail with reference to examples.
As shown in fig. 1 to 3, the lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost) of the invention comprises the following steps:
Step S1: data acquisition and processing, specifically:
S11: as shown in fig. 1(a) and (b), the sensor mainly comprises a moving plate 1, a fixed plate 2, a small blade 5, a photoelectric pair tube, a photoelectric switch 4, a motor 6 and the like. The moving plate (rotor) 1 and the fixed plate (induction electrode) 2 are made of stainless steel and copper respectively, and each consists of four sectors of the same shape; below them is a cylindrical metal body 3. The moving plate 1 and the small blade 5 are both fixed on the shaft of the motor 6.
S12: when the microcontroller drives the motor to rotate at a fixed speed, the moving plate and the small blade rotate at the same speed.
S13: the moving plate periodically shields the fixed plate from the atmospheric electric field above it, so the fixed plate is periodically exposed to the space electric field and an induced current is generated on the fixed plate (induction electrode); after I/V conversion, amplification, filtering and phase-sensitive detection, the current signal is output as a DC voltage.
S14: the electric field strength is calculated from the measured magnitude of the DC voltage. The small blade rotates at the same speed as the moving plate and periodically interrupts the light path of the photoelectric switch, generating a synchronous switching signal that serves as the reference for phase-sensitive detection; the polarity of the electric field is determined from this reference signal. An overall photograph of the atmospheric electric field instrument is shown in fig. 1(c): the sensor 7 protrudes at the upper end, the power supply and main control case 8 is the box in the middle, and the support rod 9 is at the bottom.
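How phase-sensitive detection recovers both magnitude and polarity can be seen in an idealized numerical model. The sketch assumes a sinusoidal induced signal whose signed amplitude encodes field polarity and a ±1 square-wave reference synchronized with the rotor; the sample counts and amplitudes are arbitrary, and real instruments implement this in analog or mixed-signal hardware.

```python
import math

SAMPLES_PER_PERIOD = 100
NUM_PERIODS = 10
N_TOTAL = SAMPLES_PER_PERIOD * NUM_PERIODS


def induced_signal(signed_amplitude):
    """Idealized induced signal; the sign stands for the field polarity."""
    return [signed_amplitude * math.sin(2 * math.pi * n / SAMPLES_PER_PERIOD)
            for n in range(N_TOTAL)]


# Square-wave reference derived from the synchronous switching signal.
reference = [1.0 if math.sin(2 * math.pi * n / SAMPLES_PER_PERIOD) >= 0
             else -1.0 for n in range(N_TOTAL)]


def phase_sensitive_detect(signal, ref):
    """Multiply by the reference and average (a crude low-pass filter)."""
    return sum(s * r for s, r in zip(signal, ref)) / len(signal)


dc_positive = phase_sensitive_detect(induced_signal(3.0), reference)
dc_negative = phase_sensitive_detect(induced_signal(-3.0), reference)
```

The magnitude of the DC output is proportional to the signal amplitude (a factor of 2/π for a sine), and its sign follows the polarity.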
Step S2: decomposition and reconstruction of the atmospheric electric field signal, as follows:
S21: let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$. White noise of small amplitude is added to overcome mode mixing, so that the signal is distributed over the appropriate oscillation scales. P different normally distributed Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$; $n = 1, 2, \ldots, L$, are selected. The original atmospheric electric field signal and the noise are superposed to obtain:
$x_p(n) = x(n) + \omega_p(n)$, $p = 1, \ldots, P$
S22: all local maxima of the signal $x_p(n)$ are connected by a curve, defined as the upper envelope $f_p^{\max}(n)$; similarly, all local minima are connected by a curve, defined as the lower envelope $f_p^{\min}(n)$; the mean of the upper and lower envelopes is $m_p(n)$, so that:
$m_p(n) = [f_p^{\max}(n) + f_p^{\min}(n)]/2$
S23: the envelope mean $m_p(n)$ is subtracted from the signal $x_p(n)$:
$c_{p1}(n) = x_p(n) - m_p(n)$
S24: if the first $c_{p1}(n)$ obtained does not satisfy the requirements of an intrinsic mode function (IMF), it is treated as a new signal $x_p(n)$ and steps S22 and S23 are repeated until $c_{p1}(n)$ is an IMF component. After the first IMF component $c_{p1}(n)$ is obtained, it is subtracted from the signal $x_p(n)$ to obtain the residual signal $x_{p1}(n)$; the residual signal is treated as a new signal $x_p(n)$ and the above steps are repeated to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$. The signal $x_p(n)$ is finally decomposed into K IMF components $c_{pk}(n)$ and a residual $r_p(n)$, namely:
$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$
S25: each of the P sequences $\{x_p(n)\}$ obtained after white noise superposition is decomposed; the k-th IMF component obtained from the p-th signal is $c_{pk}(n)$ and the residual is $r_p(n)$. The IMF components and residuals of all signals are ensemble-averaged, and the averages are taken as the k-th order IMF component $c_k(n)$ and the residual $r(n)$ of the original signal $x(n)$. The original atmospheric electric field signal $x(n)$ can then be expressed as:
$x(n) = \sum_{k=1}^{K} c_k(n) + r(n)$
Step S3: sample entropy analysis, as follows:
S31: starting from the i-th element of the k-th order IMF $c_k(n)$, $\tau$ consecutive elements are taken to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$. For $i = 1, 2, \ldots, L-\tau+1$, the vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$ form the rows of an $(L-\tau+1) \times \tau$ matrix $C = [C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)]^{T}$.
The distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between corresponding elements, i.e.:
$d[C_\tau(i), C_\tau(j)] = \max_{q=0,\ldots,\tau-1} |c_k(i+q) - c_k(j+q)|$, $1 \le j \le L-\tau+1$, $j \ne i$
S32: given a real number $r$, for each $i$ the number of distances $d[C_\tau(i), C_\tau(j)]$ less than or equal to $r$ is counted, and its ratio to the total number of distances $L-\tau$ is recorded as $B_i^{\tau}(r)$:
$B_i^{\tau}(r) = \frac{1}{L-\tau}\,\mathrm{num}\{ d[C_\tau(i), C_\tau(j)] \le r \}$
The average of $B_i^{\tau}(r)$ over all $i$ ($1 \le i \le L-\tau+1$) is recorded as $A^{\tau}(r)$:
$A^{\tau}(r) = \frac{1}{L-\tau+1} \sum_{i=1}^{L-\tau+1} B_i^{\tau}(r)$
Starting from the i-th element of the k-th order IMF signal $c_k(n)$, $\tau+1$ consecutive values are taken to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$
$A^{\tau+1}(r)$ is calculated in the same way. The sample entropy of the electric field signal $c_k(n)$ can then be estimated as:
$H_k = -\ln\left[ A^{\tau+1}(r) / A^{\tau}(r) \right]$
The parameters are $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, where $\mathrm{std}(c_k(n))$ is the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ can likewise be calculated.
S34: all IMFs are divided into three classes according to their sample entropy $H_k$ and summed into a random component $u(n)$, a detail component $v(n)$ and a trend component $w(n)$, where $\sigma_1 > \sigma_2$ are threshold coefficients applied to $H$:
$u(n) = \sum c_k(n)$, where $H_k > \sigma_1 H$
$v(n) = \sum c_k(n)$, where $\sigma_2 H \le H_k \le \sigma_1 H$
$w(n) = \sum c_k(n) + r(n)$, where $H_k < \sigma_2 H$
Step S4: feature extraction, as follows:
S41: statistics such as the peak-to-peak value, energy, variance and difference of the random, detail and trend components are extracted; the statistics for the i-th atmospheric electric field sample can be expressed as:
S42: features are extracted with a deep autoencoder. Each reconstructed component is taken as input, encoded by the encoder and decoded by the decoder so that the decoded data match the input as closely as possible, and the encoder output is finally taken as a new feature.
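Step S5: establishing the early warning model, as follows:
S51: the P atmospheric electric field sample sequences $x_i(n)$, $n = 1, 2, \ldots, L$, $i = 1, 2, \ldots, P$, are each reconstructed into the sequences $u_i(n)$, $v_i(n)$, $w_i(n)$. The features of the reconstructed sequences are extracted and recorded as the sets $\{U_i\}$, $\{V_i\}$, $\{W_i\}$.
(1) For the random component, an initial boosting tree $f_0(U_i) = 0$ is first defined;
(2) The output $f_1(U_i)$ for the i-th sample is obtained by minimizing the objective function $obj^{(1)}$, and the current prediction is expressed as:
$\hat{y}_i^{(1)} = f_0(U_i) + f_1(U_i)$
(3) After T rounds of iteration the final prediction is obtained:
$\hat{y}_i^{(T)} = \sum_{t=1}^{T} f_t(U_i)$
The objective function in round t is:
$obj^{(t)} = \sum_{i=1}^{P} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(U_i)\right) + \Omega(f_t)$
where $\Omega(f_t) = \gamma R + \frac{1}{2}\lambda \sum_{j=1}^{R} \omega_j^2$ is the regularization term, which balances the complexity of the model and avoids overfitting; R is the number of leaf nodes of the current boosting tree, $\omega_j$ is the score of each leaf node, and $\lambda$ and $\gamma$ are regularization parameters. $l$ is the logarithmic loss function, i.e. $l(y_i, \hat{y}_i) = -[y_i \ln p_i + (1 - y_i)\ln(1 - p_i)]$ with $p_i = 1/(1 + e^{-\hat{y}_i})$, where $y_i$ is the label of the i-th sample.
The goal of each iteration is to minimize the objective function, which after a second-order Taylor expansion of the loss becomes:
$obj^{(t)} \approx \sum_{i=1}^{P} \left[ g_i f_t(U_i) + \frac{1}{2} h_i f_t^2(U_i) \right] + \Omega(f_t)$
where $g_i$ and $h_i$ are the first and second derivatives of $l(y_i, \hat{y}_i^{(t-1)})$ with respect to $\hat{y}_i^{(t-1)}$. Since each sample falls in exactly one leaf node, the loss can be understood as the sum of the losses of the leaf nodes, i.e.
$obj^{(t)} \approx \sum_{j=1}^{R} \left[ \left(\sum_{i \in I_j} g_i\right) \omega_j + \frac{1}{2} \left(\sum_{i \in I_j} h_i + \lambda\right) \omega_j^2 \right] + \gamma R$
$I_j$ denotes the set of samples falling in leaf node j; the score of leaf node j that minimizes the objective is given by:
$\omega_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$
In this case the objective function is further simplified to
$obj^{(t)} = -\frac{1}{2} \sum_{j=1}^{R} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma R$
All features are traversed as split candidates to obtain the minimum loss function value and determine the tree function $f_t(U_i)$. The early warning models for the detail component and the trend component are constructed in the same way.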
S52: a multi-classifier fusion method is adopted; first the confusion matrices of the different classifiers are computed:
$CM^{(k)} = [n_{ij}^{(k)}]_{M \times M}$
where $k = 1, \ldots, K$, K denotes the number of classifiers, and $n_{ij}^{(k)}$ denotes the number of samples from class i that classifier k judges as class j ($i, j = 1, \ldots, M$; M is the number of sample classes). The final decision function is:
where $e_k$ denotes the k-th classifier.
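The decision function itself is not reproduced in this text. One common way to fuse classifiers through their confusion matrices is a Bayesian belief combination: each classifier's output j is converted into the probability that the true class is i, estimated from column j of its confusion matrix, and the probabilities are multiplied across classifiers. The sketch below shows that scheme as an assumption; it is not confirmed to be the rule used here, and the confusion matrices are invented.

```python
def fuse_by_confusion(confusions, votes):
    """Combine classifier outputs using confusion-matrix reliabilities.

    confusions[k][i][j]: samples of true class i that classifier k labeled j.
    votes[k]: class label output by classifier k for the current sample.
    Returns (fused_label, normalized_belief_per_class).
    """
    num_classes = len(confusions[0])
    belief = [1.0] * num_classes
    for cm, j in zip(confusions, votes):
        column_total = sum(cm[i][j] for i in range(num_classes))
        for i in range(num_classes):
            belief[i] *= cm[i][j] / column_total
    total = sum(belief)
    belief = [b / total for b in belief]
    return max(range(num_classes), key=belief.__getitem__), belief


# Classes: 0 = no lightning, 1 = lightning. One matrix per component model.
cm_random = [[40, 10], [20, 30]]   # weaker classifier
cm_detail = [[45, 5], [5, 45]]     # stronger classifier
cm_trend = [[35, 15], [15, 35]]
label, belief = fuse_by_confusion([cm_random, cm_detail, cm_trend],
                                  votes=[1, 0, 1])
```

In this toy case a simple majority vote would output class 1, while the fused decision follows the more reliable classifier and outputs class 0.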
S53: in lightning early warning, the probability of detection (POD) and the false alarm rate (FAR) are important parameters for evaluating early warning performance, and are defined as follows:
$POD = EA/(EA + FTW)$, $FAR = FA/(EA + FA)$
where EA is the number of times lightning occurred and a warning was successfully issued, FTW is the number of times lightning occurred but no warning was issued, and FA is the number of times a warning was issued but no lightning occurred. The overall accuracy (OA) measures the overall recognition performance of the classifier.
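These scores can be computed from event counts as in the sketch below, using the conventional definitions POD = EA / (EA + FTW) and FAR = FA / (EA + FA). The count of correct "no lightning, no warning" cases (`correct_negatives`) is a hypothetical extra input needed to compute OA, and the OA formula used is the usual fraction of correct decisions; the numbers are made up.

```python
def warning_scores(ea, ftw, fa, correct_negatives):
    """Return (POD, FAR, OA) from warning outcome counts.

    ea: lightning occurred and a warning was issued
    ftw: lightning occurred but no warning was issued
    fa: a warning was issued but no lightning occurred
    correct_negatives: no lightning and no warning (assumed input)
    """
    pod = ea / (ea + ftw)
    far = fa / (ea + fa)
    oa = (ea + correct_negatives) / (ea + ftw + fa + correct_negatives)
    return pod, far, oa


pod, far, oa = warning_scores(ea=40, ftw=10, fa=10, correct_negatives=140)
```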
The following cost function L is defined to measure the overall performance of the classifier:
$L = m_1(1 - OA) + m_2(1 - POD) + m_3 FAR$
where $m_1$, $m_2$, $m_3$ are the cost coefficients of three types of error: overall recognition error, missed alarm, and false alarm. The cost ratio of the overall recognition error to the early warning errors (missed alarms and false alarms) of the classifier is set to $1:\alpha$, and the cost ratio of missed alarms to false alarms is set to $1:\beta$, that is,
$m_1/(m_2 + m_3) = 1/\alpha$, $m_2/m_3 = 1/\beta$
so that $m_2 = \alpha m_1/(1+\beta)$ and $m_3 = \alpha\beta m_1/(1+\beta)$, and the cost function can be written as:
$L = m_1\left[(1 - OA) + \frac{\alpha}{1+\beta}(1 - POD) + \frac{\alpha\beta}{1+\beta} FAR\right]$
The values of $\alpha$ and $\beta$ are adjusted to meet the requirements of different applications.
Table 1 below gives the test set experimental results. The multi-classifier fusion algorithm has clear advantages over the single model and the ordinary voting algorithm, improving both the accuracy and the false alarm rate. Fig. 4 shows the original signal and the reconstructed components, and fig. 5 shows the training set experimental results, in which the probability of detection and the false alarm rate both decrease as $\beta$ increases, as expected.
Table 1. Test set experimental results
Claims (7)
1. A lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting, characterized by comprising the following steps:
(1) collecting and processing atmospheric electric field signal data;
(2) decomposing the atmospheric electric field signal into modal components (intrinsic mode functions, IMFs) of different scales, each containing different local characteristic information;
(3) calculating the sample entropy of each IMF component, dividing the IMF components into random, detail and trend components according to their entropy values, and summing the signals in each of the three classes to obtain the corresponding reconstructed components;
(4) extracting feature values from the reconstructed components;
(5) constructing an early warning model and training it with the extracted features, after normalization, as input; during training, the model parameters are tuned by grid search;
(6) lightning early warning: using the model obtained in step (5) to predict on the test set, decomposing the test set to obtain the prediction results for each component, fusing the prediction results, and issuing an early warning signal.
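The normalization and grid search mentioned in step (5) can be sketched as follows. This is a minimal illustration, not the patent's implementation: min-max scaling is assumed as the normalization, and `score_fn` is a hypothetical callable that would train the model with the given parameters and return a validation score. The parameter names in the usage note are examples only.

```python
from itertools import product


def min_max_normalize(rows):
    """Scale each feature column of a list of rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def grid_search(param_grid, score_fn):
    """Try every parameter combination and keep the best-scoring one.

    param_grid: dict mapping a parameter name to a list of candidate values
    score_fn:   hypothetical callable taking a dict of parameters and
                returning a score where larger is better
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

For example, `grid_search({"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}, score_fn)` evaluates all nine combinations. In practice the column minima and maxima found on the training set would also be reused to scale the test set.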
2. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, characterized in that the method decomposes the data collected by the atmospheric electric field instrument using the EEMD algorithm, calculates the sample entropy of the original data and of each intrinsic mode function, performs classification and reconstruction into a random component, a detail component and a trend component, extracts the statistical features and autoencoder features of the reconstructed components, establishes the early warning model using the XGBoost algorithm, and fuses the classifiers of the individual components to issue an early warning signal.
3. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (2), decomposing and reconstructing the atmospheric electric field signal, comprises the following steps:
(2.1) let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$; select $P$ different normally distributed Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$; $n = 1, 2, \ldots, L$;
superpose the noise on the original atmospheric electric field signal to obtain: $x_p(n) = x(n) + \omega_p(n)$, $p = 1, \ldots, P$;
(2.2) connect all local maxima of the signal $x_p(n)$ with a curve, defined as the upper envelope $f_{p\max}(n)$;
connect all local minima with a curve, defined as the lower envelope $f_{p\min}(n)$;
let the mean of the upper and lower envelopes be $m_p(n)$, then:
$m_p(n) = [f_{p\max}(n) + f_{p\min}(n)]/2$;
(2.3) subtract the envelope mean $m_p(n)$ from the signal $x_p(n)$: $c_{p1}(n) = x_p(n) - m_p(n)$;
(2.4) if the $c_{p1}(n)$ obtained first does not satisfy the IMF requirements, treat it as a new signal $x_p(n)$ and repeat steps (2.2) and (2.3) until $c_{p1}(n)$ is an IMF component;
after the first IMF component $c_{p1}(n)$ is obtained, subtract it from the signal $x_p(n)$ to obtain a residual signal $x_{p1}(n)$; treat the residual signal as a new signal $x_p(n)$ and repeat the above steps to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$; the signal $x_p(n)$ is finally decomposed into $K$ IMF components $c_{pk}(n)$ and a residual $r_p(n)$, namely:
$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$;
(2.5) decompose each of the $P$ noise-added sequences $\{x_p(n)\}$ in this way; the $k$-th IMF component obtained from decomposing the $p$-th signal is $c_{pk}(n)$ and the residual is $r_p(n)$;
take the ensemble average of the IMF components and of the residuals over all signals, and use the averages as the $k$-th order IMF component $c_k(n)$ and the residual $r(n)$ of the original signal $x(n)$:
$c_k(n) = \dfrac{1}{P}\sum_{p=1}^{P} c_{pk}(n), \qquad r(n) = \dfrac{1}{P}\sum_{p=1}^{P} r_p(n)$.
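Steps (2.1) to (2.5) can be sketched in plain Python as below. Two simplifications are assumed and are not from the patent: the envelopes are piecewise linear rather than smooth curves, and sifting runs for a fixed number of iterations instead of testing the IMF requirements. The noise level, the trial count (the number P of noise sequences) and all function names are hypothetical.

```python
import math
import random


def _extrema(x):
    """Indices of interior local maxima and minima of the list x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def _envelope(x, idx):
    """Piecewise-linear envelope through the extrema at positions idx.

    The first and last extremum values are held flat out to the signal ends.
    """
    knots = [0] + idx + [len(x) - 1]
    vals = [x[idx[0]]] + [x[i] for i in idx] + [x[idx[-1]]]
    env = [0.0] * len(x)
    for a, b, va, vb in zip(knots, knots[1:], vals, vals[1:]):
        for i in range(a, b + 1):
            env[i] = va + (vb - va) * (i - a) / (b - a)
    return env


def emd(x, max_imfs=6, sift_iters=10):
    """Simplified EMD: returns (list of IMFs, residue); they sum to x."""
    residue = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = _extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left: stop extracting IMFs
        h = residue[:]
        for _ in range(sift_iters):
            maxima, minima = _extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            upper = _envelope(h, maxima)
            lower = _envelope(h, minima)
            # Step (2.3): subtract the mean of the two envelopes.
            h = [v - (u + l) / 2.0 for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residue = [r - v for r, v in zip(residue, h)]
    return imfs, residue


def eemd(x, trials=20, noise_std=0.2, max_imfs=6, seed=0):
    """Ensemble EMD: average the IMFs of `trials` noise-added copies of x."""
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    scale = noise_std * math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    imf_sum = [[0.0] * n for _ in range(max_imfs)]
    res_sum = [0.0] * n
    for _ in range(trials):
        noisy = [v + rng.gauss(0.0, scale) for v in x]  # step (2.1)
        imfs, res = emd(noisy, max_imfs)
        for k, imf in enumerate(imfs):  # missing IMFs count as zeros
            for i in range(n):
                imf_sum[k][i] += imf[i]
        for i in range(n):
            res_sum[i] += res[i]
    # Step (2.5): ensemble average over all trials.
    return ([[v / trials for v in row] for row in imf_sum],
            [v / trials for v in res_sum])
```

Because every trial decomposes its noisy copy exactly, the averaged components sum to the original signal plus the mean of the added noise, which shrinks as the number of trials grows.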
4. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (3) comprises the following steps:
(3.1) starting from the $i$-th element of the $k$-th order IMF $c_k(n)$, $n = 1, 2, \ldots, L$, take $\tau$ consecutive elements to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$;
letting $i = 1, 2, \ldots, L-\tau+1$ gives the vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$, which form a matrix $C$ of size $(L-\tau+1)\times\tau$:
$C = \begin{bmatrix} c_k(1) & c_k(2) & \cdots & c_k(\tau) \\ c_k(2) & c_k(3) & \cdots & c_k(\tau+1) \\ \vdots & \vdots & & \vdots \\ c_k(L-\tau+1) & c_k(L-\tau+2) & \cdots & c_k(L) \end{bmatrix}$
the distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between their corresponding elements:
$d[C_\tau(i), C_\tau(j)] = \max_{q=0,\ldots,\tau-1} |c_k(i+q) - c_k(j+q)|, \quad 1 \le j \le L-\tau+1,\ j \ne i$
(3.2) given a real number $r$, for each $i$ count the number of distances $d[C_\tau(i), C_\tau(j)]$ that are less than or equal to $r$, and compute the ratio of this number to the total number of distances $L-\tau$, denoted $A_i^\tau(r)$:
$A_i^\tau(r) = \dfrac{1}{L-\tau}\,\mathrm{num}\{d[C_\tau(i), C_\tau(j)] \le r\}$
compute the average of $A_i^\tau(r)$ over all $i$ ($1 \le i \le L-\tau+1$), denoted $A^\tau(r)$:
$A^\tau(r) = \dfrac{1}{L-\tau+1}\sum_{i=1}^{L-\tau+1} A_i^\tau(r)$
starting from the $i$-th element of the $k$-th order IMF signal $c_k(n)$, take $\tau+1$ consecutive values of $c_k$ to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$;
and calculate $A^{\tau+1}(r)$ in the same way; the sample entropy of the electric field signal component $c_k(n)$ is:
$H_k = -\ln\left[A^{\tau+1}(r)/A^{\tau}(r)\right]$
with the parameters $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, where $\mathrm{std}(c_k(n))$ is the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ is then calculated;
(3.4) according to the sample entropy calculated for each IMF, all IMFs are divided into three classes, and the IMFs in each class are summed to form a random component $u(n)$, a detail component $v(n)$ and a trend component $w(n)$, respectively.
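The sample entropy calculation and the regrouping of IMFs can be sketched as follows. The entropy function follows the definition in this claim with $\tau = 2$ and $r = 0.2$ times the standard deviation. The grouping rule is an assumption: the patent's classification thresholds did not survive extraction, so `hi_thr` and `lo_thr` are hypothetical, with high-entropy IMFs treated as random and low-entropy IMFs as trend.

```python
import math
import random


def _match_ratio(c, m, r):
    """Average fraction of other m-length templates within distance r."""
    n = len(c) - m + 1  # number of template vectors
    templates = [c[i:i + m] for i in range(n)]
    total = 0.0
    for i in range(n):
        count = sum(
            1 for j in range(n)
            if j != i and max(abs(a - b)
                              for a, b in zip(templates[i], templates[j])) <= r
        )
        total += count / (n - 1)
    return total / n


def sample_entropy(c, tau=2, r_factor=0.2):
    """Sample entropy -ln(A^(tau+1)(r) / A^tau(r)) with r = r_factor * std(c)."""
    mean = sum(c) / len(c)
    std = math.sqrt(sum((v - mean) ** 2 for v in c) / len(c))
    r = r_factor * std
    a_tau = _match_ratio(c, tau, r)
    a_next = _match_ratio(c, tau + 1, r)
    if a_tau == 0.0 or a_next == 0.0:
        return float("inf")  # no matches: entropy is undefined, treat as maximal
    return -math.log(a_next / a_tau)


def regroup(imfs, hi_thr, lo_thr):
    """Sum IMFs into random (u), detail (v) and trend (w) components.

    hi_thr and lo_thr are hypothetical entropy thresholds (assumption).
    """
    n = len(imfs[0])
    u, v, w = [0.0] * n, [0.0] * n, [0.0] * n
    for imf in imfs:
        h = sample_entropy(imf)
        target = u if h >= hi_thr else (w if h < lo_thr else v)
        for i, val in enumerate(imf):
            target[i] += val
    return u, v, w


if __name__ == "__main__":
    rng = random.Random(0)
    sine = [math.sin(2 * math.pi * i / 20) for i in range(200)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(sample_entropy(sine), sample_entropy(noise))
```

A regular signal such as a sine wave yields a lower sample entropy than white noise, which is the property the grouping relies on.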
5. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (4), feature extraction, comprises the following steps:
(4.1) extracting the peak-to-peak value, energy, variance and difference statistics of the random, detail and trend components respectively; for the $i$-th atmospheric electric field sample these statistics are expressed as:
(4.2) extracting features from the original signal using a deep autoencoder.
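The statistical features of step (4.1) can be illustrated with a small Python sketch. The patent's own feature formula did not survive extraction, so the definitions here are assumptions: energy is taken as the sum of squares, variance as the population variance, and the "difference statistic" as the mean absolute first difference. The function names are hypothetical, and the autoencoder features of step (4.2) are not covered.

```python
def component_features(x):
    """Return [peak-to-peak, energy, variance, mean absolute difference]."""
    n = len(x)
    mean = sum(x) / n
    peak_to_peak = max(x) - min(x)
    energy = sum(v * v for v in x)
    variance = sum((v - mean) ** 2 for v in x) / n
    diffs = [b - a for a, b in zip(x, x[1:])]
    mean_abs_diff = sum(abs(d) for d in diffs) / len(diffs)
    return [peak_to_peak, energy, variance, mean_abs_diff]


def sample_features(u, v, w):
    """Features of the random (u), detail (v) and trend (w) components,
    returned as three separate lists so each can feed its own classifier."""
    return component_features(u), component_features(v), component_features(w)
```

For example, `component_features([1, 3, 2, 4])` gives a peak-to-peak value of 3, an energy of 30, a variance of 1.25 and a mean absolute difference of 5/3.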
6. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein in step (5), constructing the early warning model comprises the following steps:
(5.1) for the $P$ atmospheric electric field sample sequences $x_i(n)$, $i = 1, 2, \ldots, P$, the $i$-th sample sequence is reconstructed into the sequences $u_i(n)$, $v_i(n)$ and $w_i(n)$; the features of the reconstructed sequences are extracted and denoted as the sets $\{U_i\}$, $\{V_i\}$ and $\{W_i\}$, respectively;
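(a) for the random component, first define the initial boosting tree $f_0(U_i) = 0$;
(b) obtain the output $f_1(U_i)$ for the $i$-th sample by minimizing the objective function $obj^{(1)}$; the current prediction is then expressed as:
$\hat{y}_i^{(1)} = f_0(U_i) + f_1(U_i)$
(c) after $T$ rounds of iteration, the final prediction is obtained:
$\hat{y}_i^{(T)} = \sum_{t=0}^{T} f_t(U_i)$
the objective function in round $t$ is:
$obj^{(t)} = \sum_{i} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(U_i)\big) + \Omega(f_t)$
where $l$ is the loss function, $y_i$ is the label of the $i$-th sample, and $\Omega(f_t) = \gamma R + \frac{1}{2}\lambda\sum_{j=1}^{R}\omega_j^2$ is the regularization term, in which $R$ is the number of leaf nodes of the current boosting tree, $\omega_j$ is the score of each leaf node, and $\lambda$ and $\gamma$ are regularization parameters;
the goal of each iteration is to minimize the objective function, which, after a second-order Taylor expansion of the loss with the constant terms dropped, becomes:
$obj^{(t)} \approx \sum_{i}\Big[g_i f_t(U_i) + \tfrac{1}{2} h_i f_t^2(U_i)\Big] + \Omega(f_t)$
where $g_i$ and $h_i$ are the first and second derivatives of $l\big(y_i, \hat{y}_i^{(t-1)}\big)$ with respect to $\hat{y}_i^{(t-1)}$; since each sample falls into exactly one leaf node, the loss function can be understood as the sum of the losses over the leaf nodes, i.e.:
$obj^{(t)} = \sum_{j=1}^{R}\Big[\Big(\sum_{i\in I_j} g_i\Big)\omega_j + \tfrac{1}{2}\Big(\sum_{i\in I_j} h_i + \lambda\Big)\omega_j^2\Big] + \gamma R$
where $I_j$ denotes the set of samples falling in leaf node $j$; the score of each leaf node that minimizes this expression is given by:
$\omega_j^{*} = -\dfrac{\sum_{i\in I_j} g_i}{\sum_{i\in I_j} h_i + \lambda}$
the objective function then simplifies to:
$obj^{(t)} = -\dfrac{1}{2}\sum_{j=1}^{R}\dfrac{\big(\sum_{i\in I_j} g_i\big)^2}{\sum_{i\in I_j} h_i + \lambda} + \gamma R$
all features are traversed for splitting to obtain the minimum loss function value and determine the tree function $f_t(U_i)$;
the early warning models for the detail component and the trend component are constructed by the same method;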
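One boosting round on a single feature can be sketched as below. The sketch assumes the standard XGBoost results that the optimal leaf score is $-G/(H+\lambda)$ and that the structure score of a leaf is $G^2/(H+\lambda)$, where $G$ and $H$ are the sums of first and second loss derivatives in the leaf. A logistic loss is also assumed. The function names and toy data are hypothetical; a real model would use the XGBoost library rather than this code.

```python
import math


def grad_hess(labels, margins):
    """First and second derivatives of the logistic loss (assumed loss)."""
    probs = [1.0 / (1.0 + math.exp(-m)) for m in margins]
    g = [p - y for p, y in zip(probs, labels)]
    h = [p * (1.0 - p) for p in probs]
    return g, h


def leaf_weight(g, h, lam):
    """Optimal leaf score -G / (H + lambda)."""
    return -sum(g) / (sum(h) + lam)


def leaf_score(g, h, lam):
    """Structure score G^2 / (H + lambda) of one leaf."""
    return sum(g) ** 2 / (sum(h) + lam)


def best_split(feature, g, h, lam=1.0, gamma=0.0):
    """Scan all thresholds on one feature; return (threshold, gain)."""
    order = sorted(range(len(feature)), key=feature.__getitem__)
    parent = leaf_score(g, h, lam)
    best_thr, best_gain = None, 0.0
    for pos in range(1, len(order)):
        lo, hi = order[pos - 1], order[pos]
        if feature[lo] == feature[hi]:
            continue  # no threshold separates equal feature values
        left, right = order[:pos], order[pos:]
        gain = 0.5 * (
            leaf_score([g[i] for i in left], [h[i] for i in left], lam)
            + leaf_score([g[i] for i in right], [h[i] for i in right], lam)
            - parent
        ) - gamma
        if gain > best_gain:
            best_thr, best_gain = (feature[lo] + feature[hi]) / 2.0, gain
    return best_thr, best_gain


if __name__ == "__main__":
    feature = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]  # toy feature values
    labels = [0, 0, 0, 1, 1, 1]               # toy lightning labels
    g, h = grad_hess(labels, [0.0] * 6)       # initial tree f_0 = 0
    print(best_split(feature, g, h))
```

With the initial prediction fixed at zero, the scan picks the threshold that separates the two toy classes, and the right-hand leaf receives a positive score.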
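(5.2) a multi-classifier fusion method is adopted; first, the confusion matrix of each classifier is computed:
$\begin{bmatrix} n_{11}^{(k)} & \cdots & n_{1M}^{(k)} \\ \vdots & \ddots & \vdots \\ n_{M1}^{(k)} & \cdots & n_{MM}^{(k)} \end{bmatrix}, \quad k = 1, 2, \ldots, K$
where $K$ denotes the number of classifiers and $n_{ij}^{(k)}$ is the number of samples from class $i$ that classifier $k$ judges as class $j$ ($i, j = 1, \ldots, M$, where $M$ is the number of sample classes); the final decision function is:
where $e_k$ denotes the $k$-th classifier.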
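One common way to fuse classifiers from their confusion matrices is a Bayesian belief combination, sketched below. This is an assumption: the patent's decision function did not survive extraction, so the rule shown here may differ from it. Each classifier's confusion matrix provides an estimate of the probability that the true class is $i$ given that the classifier output $j$, and the per-classifier estimates are multiplied. All names and numbers are hypothetical.

```python
def fuse(conf_mats, outputs):
    """Fuse classifier outputs using their confusion matrices.

    conf_mats: list of K matrices, conf_mats[k][i][j] = number of samples
               of true class i that classifier k labeled as class j
    outputs:   list of K predicted class indices, one per classifier
    Returns (fused class index, normalized belief per class).
    """
    m = len(conf_mats[0])
    belief = [1.0] * m
    for cm, j in zip(conf_mats, outputs):
        col_total = sum(cm[i][j] for i in range(m))
        for i in range(m):
            # Estimated P(true class = i | classifier said j).
            belief[i] *= cm[i][j] / col_total if col_total else 1.0 / m
    total = sum(belief)
    belief = [b / total for b in belief] if total else [1.0 / m] * m
    return max(range(m), key=belief.__getitem__), belief


if __name__ == "__main__":
    # Hypothetical confusion matrices for the random, detail and trend
    # classifiers (class 0 = no lightning, class 1 = lightning).
    mats = [
        [[40, 10], [5, 45]],
        [[30, 20], [20, 30]],
        [[45, 5], [10, 40]],
    ]
    print(fuse(mats, outputs=[1, 0, 1]))
```

Unlike plain voting, this rule lets a reliable classifier outweigh a weak one: in the example the second classifier votes for class 0, but its confusion matrix shows it is close to chance, so the fused decision is class 1.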
7. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein the following cost function is defined to measure the overall performance of the classifier:
$L = m_1(1-\mathrm{OA}) + m_2(1-\mathrm{POD}) + m_3\,\mathrm{FAR}$
where OA is the overall accuracy, POD is the detection probability, FAR is the false alarm rate, and $m_1$, $m_2$ and $m_3$ are the cost coefficients of overall recognition error, missed alarm and false alarm, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110204997.2A CN112818912A (en) | 2021-02-24 | 2021-02-24 | Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112818912A true CN112818912A (en) | 2021-05-18 |
Family
ID=75865292
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110204997.2A Pending CN112818912A (en) | 2021-02-24 | 2021-02-24 | Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112818912A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109142896A (en) * | 2018-07-25 | 2019-01-04 | 南京信息工程大学 | Lightning Warning method based on three-dimensional atmospheric electric field and MEMD |
Non-Patent Citations (2)
Title |
---|
周盛山等: "EEMD和CNN-XGBoost在风电功率短期预测的应用研究", 《电子测量技术》, pages 55 - 61 * |
徐伟等: "基于集成经验模态分解和极端梯度提升的雷电预警方法", 《仪器仪表学报》, pages 235 - 243 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113255510A (en) * | 2021-05-21 | 2021-08-13 | 南京信息工程大学 | Rainbow cloud point charge moving path imaging method based on multi-time scale DBSCAN and sample entropy |
CN113255510B (en) * | 2021-05-21 | 2023-07-25 | 南京信息工程大学 | Thunderstorm cloud point charge moving path imaging method based on multi-time scale DBSCAN and sample entropy |
CN114252706A (en) * | 2021-12-15 | 2022-03-29 | 华中科技大学 | Lightning early warning method and system |
CN114252706B (en) * | 2021-12-15 | 2023-03-14 | 华中科技大学 | Lightning early warning method and system |
CN117617921A (en) * | 2024-01-23 | 2024-03-01 | 吉林大学 | Intelligent blood pressure monitoring system and method based on Internet of things |
CN117617921B (en) * | 2024-01-23 | 2024-03-26 | 吉林大学 | Intelligent blood pressure monitoring system and method based on Internet of things |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210518 |