CN112818912A - Lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting - Google Patents

Lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting

Info

Publication number
CN112818912A
Authority
CN
China
Prior art keywords
early warning
signal
electric field
component
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110204997.2A
Other languages
Chinese (zh)
Inventor
夏志祥
徐伟
郑玉兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202110204997.2A
Publication of CN112818912A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention discloses a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost). An electric field signal observed by an atmospheric electric field instrument is decomposed by EEMD; the sample entropy of the original data and of each intrinsic mode function is calculated; the components are classified and reconstructed into random, detail and trend components; statistical and autoencoder features are extracted from each reconstructed component; an early warning model is established with the XGBoost algorithm; and the classifiers of the components are fused. The invention can effectively improve the early warning performance and reduce the false alarm rate.

Description

Lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting
Technical Field
The invention relates to the field of meteorological lightning early warning, and in particular to a lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting.
Background
Lightning is a natural phenomenon accompanied by strong electrical discharge. The high voltage, large current and strong electromagnetic radiation it generates not only damage communication and power supply systems, but also seriously threaten human life and cause heavy economic losses. The charge carried by a thundercloud produces a strong atmospheric electric field at the ground, so the formation, development and dissipation of the thundercloud can be inferred from changes in the ground-level atmospheric electric field. Observing and analysing the atmospheric electric field is therefore an effective way to improve the efficiency of lightning early warning.
The atmospheric electric field instrument measures the average atmospheric electric field at the ground. The atmospheric electric field is the physical quantity that most directly reflects charge changes within a thundercloud, so observing and analysing the atmospheric electric field signal in disturbed weather directly helps to raise the probability of a successful lightning warning.
Existing lightning early warning techniques generally issue a warning by applying an amplitude threshold to the atmospheric electric field. The method is simple to implement, but it ignores the inherent physical characteristics of the atmospheric electric field signal, so the probability of a successful warning is low and the false alarm rate is high. In recent years, researchers in China and abroad have begun to investigate in depth the relationship between the characteristics of the atmospheric electric field signal and lightning, but for lack of an effective algorithm for processing atmospheric electric field data, the warning lead time remains very short and the features of the atmospheric electric field are insufficiently exploited.
Disclosure of Invention
Purpose of the invention: in view of the deficiencies of the prior art, the invention aims to provide a lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting, which can effectively improve the early warning performance and reduce the false alarm rate.
Technical solution: the lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting of the invention comprises the following steps:
(1) collecting and processing atmospheric electric field signal data: the ground-level atmospheric electric field is measured with an atmospheric electric field instrument and the data are calibrated to obtain the atmospheric electric field signal $x(n)$;
(2) decomposing and reconstructing the atmospheric electric field signal: the signal is decomposed into modal components of different scales, each containing different local characteristic information;
(3) sample entropy analysis: the sample entropy of each IMF component is calculated, the IMF components are divided by entropy into three classes (random, detail and trend), and the signals in each class are summed to obtain the corresponding reconstructed component;
(4) feature extraction: feature values of the reconstructed components are extracted using statistical and deep learning methods;
(5) constructing the early warning model: the extracted features are normalized and used as inputs for training; during training, the model parameters are tuned by grid search to improve the accuracy of the model;
(6) lightning early warning: the test set is predicted with the model obtained in step (5); the test set is decomposed to obtain the prediction result of each component, and the prediction results are fused to give an early warning signal.
Based on the nonlinear, non-stationary character of the atmospheric electric field signal, the invention provides a lightning early warning method based on ensemble empirical mode decomposition (EEMD) and extreme gradient boosting (XGBoost). The method decomposes the data collected by the atmospheric electric field instrument with the EEMD algorithm, calculates the sample entropy of the original data and of each intrinsic mode function, classifies and reconstructs the components into random, detail and trend components, extracts statistical and autoencoder features from each reconstructed component, establishes an early warning model with the XGBoost algorithm, fuses the classifiers of the components, and issues an early warning signal.
In step (2), the decomposition and reconstruction of the atmospheric electric field signal comprises the following steps:
(2.1) let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$;
$P$ different normally distributed Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$; $n = 1, 2, \ldots, L$, are selected;
superposing each noise sequence on the original atmospheric electric field signal gives: $x_p(n) = x(n) + \omega_p(n)$, $p = 1, \ldots, P$;
(2.2) all local maxima of the signal $x_p(n)$ are connected by a curve, defined as the upper envelope $f_p^{\max}(n)$; all local minima are connected by a curve, defined as the lower envelope $f_p^{\min}(n)$; the mean of the upper and lower envelopes is $m_p(n)$:
$m_p(n) = [f_p^{\max}(n) + f_p^{\min}(n)]/2$;
(2.3) the envelope mean $m_p(n)$ is subtracted from the signal $x_p(n)$: $c_{p1}(n) = x_p(n) - m_p(n)$;
(2.4) if the $c_{p1}(n)$ obtained first does not satisfy the requirements of an intrinsic mode function (IMF), it is treated as a new signal $x_p(n)$ and steps (2.2) and (2.3) are repeated until $c_{p1}(n)$ is an IMF component;
after the first IMF component $c_{p1}(n)$ is obtained, it is subtracted from the signal $x_p(n)$ to give the residual signal $x_{p1}(n)$; the residual signal is in turn treated as a new signal $x_p(n)$ and the above steps are repeated to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$; the signal $x_p(n)$ is thus finally decomposed into $K$ IMF components $c_{pk}(n)$ and a residue $r_p(n)$, namely:
$$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$$
(2.5) each of the $P$ noise-added sequences $\{x_p(n)\}$ is decomposed in this way; the $k$-th IMF component obtained from the $p$-th signal is $c_{pk}(n)$ and the residue is $r_p(n)$;
the IMF components and residues of all $P$ signals are ensemble-averaged, and the averages are taken as the $k$-th order IMF component $c_k(n)$ and the residue $r(n)$ of the original signal $x(n)$;
the original atmospheric electric field signal $x(n)$ can then be expressed as:
$$x(n) = \sum_{k=1}^{K} c_k(n) + r(n)$$
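The decomposition in steps (2.1) to (2.5) can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, under several simplifying assumptions that are not in the patent: the envelopes are joined by straight lines rather than a smooth curve, sifting runs for a fixed number of passes instead of testing an IMF criterion, and every trial is forced to return the same number of components so that they can be averaged. The function names (`emd`, `eemd`) and default parameters are hypothetical.

```python
import random


def _envelope(x, idx):
    """Piecewise-linear curve through the points (i, x[i]) for i in idx, flat at both ends."""
    env, j = [], 0
    for n in range(len(x)):
        while j + 1 < len(idx) and idx[j + 1] <= n:
            j += 1
        if n <= idx[0]:
            env.append(x[idx[0]])
        elif n >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            a, b = idx[j], idx[j + 1]
            w = (n - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    return env


def _sift_once(x):
    """Steps (2.2)-(2.3): subtract the mean of the upper and lower envelopes."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema: nothing oscillatory left to extract
    upper, lower = _envelope(x, maxima), _envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def emd(x, num_imfs, num_sifts=10):
    """Step (2.4) with a fixed sifting count; returns (imfs, residue)."""
    residue, imfs = list(x), []
    for _ in range(num_imfs):
        c, extracted = list(residue), False
        for _ in range(num_sifts):
            nxt = _sift_once(c)
            if nxt is None:
                break
            c, extracted = nxt, True
        if not extracted:
            c = [0.0] * len(x)  # pad with zeros so every trial yields num_imfs components
        imfs.append(c)
        residue = [r - ci for r, ci in zip(residue, c)]
    return imfs, residue


def eemd(x, num_imfs=4, trials=20, noise_std=0.1, seed=0):
    """Steps (2.1) and (2.5): add white noise, decompose each trial, ensemble-average."""
    rng = random.Random(seed)
    length = len(x)
    mean_imfs = [[0.0] * length for _ in range(num_imfs)]
    mean_res = [0.0] * length
    for _ in range(trials):
        noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
        imfs, res = emd(noisy, num_imfs)
        for k in range(num_imfs):
            for n in range(length):
                mean_imfs[k][n] += imfs[k][n] / trials
        for n in range(length):
            mean_res[n] += res[n] / trials
    return mean_imfs, mean_res
```

Because each trial satisfies $x_p(n) = \sum_k c_{pk}(n) + r_p(n)$ by construction, the averaged components sum to the original signal plus the mean of the added noise, which shrinks as the number of trials grows.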
In step (3), the sample entropy analysis comprises the following steps:
(3.1) from the $k$-th order IMF $c_k(n)$, $n = 1, 2, \ldots, L$, starting at the $i$-th element, $\tau$ consecutive elements are taken to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$;
letting $i = 1, 2, \ldots, L-\tau+1$ gives the vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$, which form the $(L-\tau+1) \times \tau$ matrix $C$:
$$C = \begin{bmatrix} c_k(1) & c_k(2) & \cdots & c_k(\tau) \\ c_k(2) & c_k(3) & \cdots & c_k(\tau+1) \\ \vdots & \vdots & & \vdots \\ c_k(L-\tau+1) & c_k(L-\tau+2) & \cdots & c_k(L) \end{bmatrix}$$
the distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between their corresponding elements:
$$d[C_\tau(i), C_\tau(j)] = \max_{m=0,\ldots,\tau-1} \left| c_k(i+m) - c_k(j+m) \right|, \quad 1 \le j \le L-\tau+1,\ j \ne i$$
(3.2) given a real number $r$, for each $i$ the number of distances $d[C_\tau(i), C_\tau(j)] \le r$ is counted, and the ratio of this number to the total number of distances $L-\tau$ is denoted $A_i^{\tau}(r)$:
$$A_i^{\tau}(r) = \frac{1}{L-\tau}\,\mathrm{num}\left\{ d[C_\tau(i), C_\tau(j)] \le r \right\}$$
the average of $A_i^{\tau}(r)$ over all $i$ ($1 \le i \le L-\tau+1$) is denoted $A^{\tau}(r)$:
$$A^{\tau}(r) = \frac{1}{L-\tau+1} \sum_{i=1}^{L-\tau+1} A_i^{\tau}(r)$$
from the $k$-th order IMF signal $c_k(n)$, starting at the $i$-th element, $\tau+1$ consecutive values are taken to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$;
$A^{\tau+1}(r)$ is calculated in the same way; the sample entropy of the electric field signal $c_k(n)$ can then be estimated as:
$$H_k = -\ln \frac{A^{\tau+1}(r)}{A^{\tau}(r)}$$
the parameters are taken as $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, where $\mathrm{std}(c_k(n))$ is the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ is calculated in the same way;
(3.4) according to the sample entropy $H_k$ of each IMF, all IMFs are divided into three classes, which are summed to give the random component $u(n)$, the detail component $v(n)$ and the trend component $w(n)$, where $\sigma_1 > \sigma_2$ are threshold coefficients applied to the sample entropy $H$ of the original signal:
$u(n) = \sum c_k(n)$, where $H_k > \sigma_1 H$
$v(n) = \sum c_k(n)$, where $\sigma_2 H < H_k < \sigma_1 H$
$w(n) = \sum c_k(n) + r(n)$, where $H_k < \sigma_2 H$
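As an illustration of the sample entropy estimate described in step (3), the following is a minimal pure-Python sketch. It assumes the usual form $-\ln[A^{\tau+1}(r)/A^{\tau}(r)]$ with the maximum-absolute-difference distance and the stated parameters $\tau = 2$, $r = 0.2\,\mathrm{std}$; the function name `sample_entropy` and the handling of the no-match case are assumptions made for the example, not part of the patent.

```python
import math


def sample_entropy(c, tau=2, r_factor=0.2):
    """Estimate -ln(A^{tau+1}(r) / A^{tau}(r)) with r = r_factor * std(c)."""
    length = len(c)
    mean = sum(c) / length
    std = math.sqrt(sum((v - mean) ** 2 for v in c) / length)
    r = r_factor * std

    def match_ratio(m):
        # Template vectors C_m(i) = [c(i), ..., c(i+m-1)].
        templates = [c[i:i + m] for i in range(length - m + 1)]
        count = len(templates)
        total = 0.0
        for i in range(count):
            hits = sum(
                1
                for j in range(count)
                if j != i
                and max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r
            )
            total += hits / (count - 1)  # ratio to the number of distances for this i
        return total / count  # average over all i

    a_tau, a_next = match_ratio(tau), match_ratio(tau + 1)
    if a_tau == 0 or a_next == 0:
        return float("inf")  # no template matches: treated here as maximal irregularity
    return -math.log(a_next / a_tau)
```

A regular oscillation yields a small value and white noise a large one, which is what allows the IMFs to be ranked from trend-like to random. The double loop is quadratic in the signal length, so a vectorized implementation would be preferable for long records.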
In step (4), the feature extraction comprises the following steps:
(4.1) statistics such as the peak-to-peak value, energy, variance and difference are extracted from the random, detail and trend components respectively; for the $i$-th atmospheric electric field sample the statistics can be expressed as:
Figure BDA0002950003810000041
(4.2) features are extracted from the signal with a deep autoencoder: each reconstructed component is taken as input, encoded by the encoder and decoded by the decoder so that the decoded data matches the input as closely as possible, and the output of the encoder is finally taken as a new feature.
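The statistical features named in step (4.1) can be illustrated as follows. The exact definitions used in the patent are not recoverable from the text, so the formulas below are assumptions: energy is taken as the sum of squares, variance as the population variance, and "difference" is interpreted as the mean absolute first difference. The function name and dictionary keys are hypothetical.

```python
def statistical_features(s):
    """Hypothetical statistics for one reconstructed component s (a list of floats)."""
    n = len(s)
    mean = sum(s) / n
    return {
        "peak_to_peak": max(s) - min(s),
        "energy": sum(v * v for v in s),
        "variance": sum((v - mean) ** 2 for v in s) / n,
        # Assumed reading of "difference": mean absolute first difference.
        "mean_abs_diff": sum(abs(b - a) for a, b in zip(s, s[1:])) / (n - 1),
    }


def sample_feature_vector(u, v, w):
    """Concatenate the statistics of the random, detail and trend components."""
    features = []
    for component in (u, v, w):
        stats = statistical_features(component)
        features.extend(stats[key] for key in sorted(stats))
    return features
```

In the described method these statistics would be combined with the autoencoder outputs and normalized before training.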
In step (5), the construction of the early warning model comprises the following steps:
(5.1) given $P$ atmospheric electric field sample sequences $x_i(n)$, $i = 1, 2, \ldots, P$, the $i$-th sample sequence can be reconstructed from the sequences $u_i(n)$, $v_i(n)$ and $w_i(n)$; the features of the reconstructed sequences are extracted and denoted as the sets $\{U_i\}$, $\{V_i\}$ and $\{W_i\}$ respectively;
(a) for the random component, an initial boosting tree $f_0(U_i) = 0$ is first defined;
(b) the output $f_1(U_i)$ for the $i$-th sample is obtained by minimizing the objective function $obj^{(1)}$, and the current prediction is expressed as:
$$\hat{y}_i^{(1)} = f_0(U_i) + f_1(U_i)$$
(c) after $T$ rounds of iteration, the final prediction is obtained:
$$\hat{y}_i^{(T)} = \sum_{t=0}^{T} f_t(U_i)$$
the objective function in round $t$ is:
$$obj^{(t)} = \sum_{i=1}^{P} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{s=1}^{t} \Omega(f_s)$$
where
$$\Omega(f_t) = \gamma R + \frac{1}{2}\lambda \sum_{j=1}^{R} \omega_j^2$$
is the regularization term, $R$ is the number of leaf nodes of the current boosting tree, $\omega_j$ is the score of each leaf node, and $\lambda$ and $\gamma$ are regularization parameters;
$l$ is the logarithmic loss function:
$$l(y_i, \hat{y}_i) = y_i \ln\left(1 + e^{-\hat{y}_i}\right) + (1 - y_i) \ln\left(1 + e^{\hat{y}_i}\right)$$
the aim of each iteration is to minimize the objective function, which, after a second-order Taylor expansion about $\hat{y}_i^{(t-1)}$ with the constant terms dropped, becomes:
$$obj^{(t)} \approx \sum_{i=1}^{P} \left[ g_i f_t(U_i) + \frac{1}{2} h_i f_t^2(U_i) \right] + \Omega(f_t)$$
where
$$g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}$$
since each sample falls into exactly one leaf node, the loss function can be written as the sum of the losses over the leaf nodes, i.e.
$$obj^{(t)} \approx \sum_{j=1}^{R} \left[ \Big(\sum_{i \in I_j} g_i\Big) \omega_j + \frac{1}{2} \Big(\sum_{i \in I_j} h_i + \lambda\Big) \omega_j^2 \right] + \gamma R$$
where $I_j$ denotes the set of samples falling in leaf node $j$, and $f_t(U_i) = \omega_j$ for $i \in I_j$ is the score of the leaf node in which the $i$-th sample is located; the optimal leaf score is given by:
$$\omega_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$
in which case the objective function simplifies further to
$$obj^{(t)} = -\frac{1}{2} \sum_{j=1}^{R} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma R$$
all features are traversed as candidate splits, the split giving the minimum loss function value is selected, and the tree function $f_t(U_i)$ is thereby determined;
the early warning models for the detail component and the trend component are constructed in the same way;
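To make the leaf-score step concrete, the sketch below computes the first- and second-order derivatives of the logarithmic loss and the resulting optimal score of a single leaf, assuming the standard gradient-boosting result that the optimal score is minus the summed gradients divided by the summed second derivatives plus $\lambda$. It is an illustration in plain Python, not the XGBoost library; the function names are hypothetical.

```python
import math


def logloss_grad_hess(y, raw_score):
    """Derivatives of the log loss with respect to the raw (pre-sigmoid) score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - y, p * (1.0 - p)


def leaf_weight_and_score(labels, raw_scores, lam=1.0):
    """Optimal leaf score and its contribution to the objective (without the gamma term)."""
    g_sum = h_sum = 0.0
    for y, s in zip(labels, raw_scores):
        g, h = logloss_grad_hess(y, s)
        g_sum += g
        h_sum += h
    weight = -g_sum / (h_sum + lam)
    score = -0.5 * g_sum ** 2 / (h_sum + lam)
    return weight, score
```

For example, with the initial tree $f_0 = 0$ every raw score is 0, so a leaf holding two positive samples and one negative sample receives a positive score, pushing the prediction toward the majority class, while $\lambda$ damps the size of the step.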
(5.2) a multi-classifier fusion method is adopted; the confusion matrices of the different classifiers are first computed:
Figure BDA00029500038100000510
where $k = 1, \ldots, K$, $K$ denotes the number of classifiers, and $n_{ij}$ denotes the number of samples from class $i$ that the classifier assigns to class $j$ ($i, j = 1, \ldots, M$, where $M$ is the number of sample classes); the final decision function is:
Figure BDA0002950003810000061
Figure BDA0002950003810000062
where $e_k$ denotes the $k$-th classifier.
(5.3) the following cost function is defined to measure the overall performance of the classifier:
$L = m_1(1 - OA) + m_2(1 - POD) + m_3\,FAR$
where OA is the overall accuracy, POD is the probability of detection, FAR is the false alarm rate, and $m_1$, $m_2$, $m_3$ are the cost coefficients of the three types of error: overall recognition error, missed alarm and false alarm;
the cost ratio of the classifier's overall recognition error to its early warning errors (missed alarms and false alarms together) is set to $1:\alpha$, and the cost ratio of missed alarms to false alarms is set to $1:\beta$, namely
$m_1/(m_2 + m_3) = 1/\alpha, \quad m_2/m_3 = 1/\beta$
Solving these two relations gives $m_2 = \alpha m_1/(1+\beta)$ and $m_3 = \alpha\beta m_1/(1+\beta)$, so the cost function can be further expressed as:
$$L = m_1\left[(1 - OA) + \frac{\alpha}{1+\beta}(1 - POD) + \frac{\alpha\beta}{1+\beta}\,FAR\right]$$
the values of $\alpha$ and $\beta$ are adjusted to meet the requirements of different applications.
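The effect of $\alpha$ and $\beta$ can be checked numerically. The sketch below solves the two ratio constraints $m_1/(m_2+m_3) = 1/\alpha$ and $m_2/m_3 = 1/\beta$ for $m_2$ and $m_3$ and evaluates the cost; the function name and the normalization $m_1 = 1$ are assumptions made for the example.

```python
def warning_cost(oa, pod, far, alpha=1.0, beta=1.0, m1=1.0):
    """Cost L = m1*(1-OA) + m2*(1-POD) + m3*FAR with m2, m3 derived from alpha and beta."""
    m2 = alpha * m1 / (1.0 + beta)         # weight of missed alarms
    m3 = alpha * beta * m1 / (1.0 + beta)  # weight of false alarms
    return m1 * (1.0 - oa) + m2 * (1.0 - pod) + m3 * far
```

Raising `beta` shifts weight from missed alarms to false alarms, so a model selected under a larger `beta` would be expected to trade some probability of detection for a lower false alarm rate.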
The invention preprocesses the atmospheric electric field signal with the adaptive EEMD decomposition algorithm, which captures in detail the variation of the signal at different time scales; the XGBoost learning algorithm places low demands on the number of samples and, combined with the reconstructed components from the EEMD decomposition, allows multi-model prediction that compensates for the limitations of a single model in signal prediction. Finally, tests verify that the model achieves a good early warning performance.
Advantageous effects:
(1) Traditional lightning early warning methods ignore the oscillation-scale characteristics of the atmospheric electric field signal, so their probability of detection is low; combining a non-stationary signal processing algorithm with machine learning is a reliable and effective approach. The invention preprocesses the atmospheric electric field signal with the adaptive EEMD decomposition algorithm, which captures in detail the variation of the signal at different time scales; the XGBoost learning algorithm places low demands on the number of samples and, combined with the reconstructed components from the EEMD decomposition, allows multi-model prediction that compensates for the limitations of a single model in signal prediction. Tests verify that the model achieves a good early warning performance, providing a new idea and method for lightning early warning.
(2) The early warning performance of models built with ordinary voting, multi-classifier fusion and single-component features was compared for different values of $\beta$. Compared with the single-model XGBoost method, multi-classifier fusion raises the probability of detection by 9.6% to 14.3% and lowers the false alarm rate by 11.1% to 16.7%; compared with ordinary voting, it raises the probability of detection by up to 4.8% and lowers the false alarm rate by 5.2% to 6.4%. Multi-classifier fusion therefore effectively improves the early warning performance.
Drawings
FIG. 1 shows the atmospheric electric field instrument of the invention, wherein (a) is a photograph of the sensor, (b) is a schematic diagram of the sensor structure, and (c) is a view of the overall appearance;
FIG. 2 is a flow chart of the algorithm of the invention;
FIG. 3 is a flow chart of the EEMD decomposition;
FIG. 4 shows the original signal and the reconstructed components;
FIG. 5 shows the results of the training set experiment, wherein (a) is the overall accuracy, (b) is the probability of detection, and (c) is the false alarm rate.
Detailed Description
The present invention will be described in further detail with reference to examples.
As shown in FIGS. 1 to 3, the lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting of the invention comprises the following steps:
Step S1: data acquisition and processing, specifically:
S11: as shown in FIG. 1(a) and (b), the sensor mainly comprises a moving plate 1, a fixed plate 2, a small blade 5, a photoelectric pair tube, a photoelectric switch 4, a motor 6 and the like. The moving plate (rotor) 1 and the fixed plate (induction electrode) 2 are made of stainless steel and copper respectively, and each consists of four identically shaped sectors; below them is a cylindrical metal body 3. The moving plate 1 and the small blade 5 are both fixed on the shaft of the motor 6.
S12: when the microcontroller drives the motor at a fixed speed, the moving plate and the small blade rotate at the same speed.
S13: the moving plate periodically shields the fixed plate from the atmospheric electric field above it, so that the fixed plate is periodically exposed to the ambient electric field and an induced current is generated on the fixed plate (induction electrode); after I/V conversion, amplification and filtering, and phase-sensitive detection, this current signal is output as a DC voltage.
S14: the electric field strength is calculated from the magnitude of the DC voltage. The small blade rotates at the same speed as the moving plate and periodically interrupts the light path of the photoelectric switch, generating a synchronous switching signal that serves as the reference for phase-sensitive detection; the polarity of the electric field is determined from this reference signal. A photograph of the overall appearance of the atmospheric electric field instrument is shown in FIG. 1(c): the sensor 7 protrudes at the upper end, the box in the middle is the power supply and main control case 8, and the support rod 9 is at the bottom.
Step S2: decomposition and reconstruction of the atmospheric electric field signal, comprising the following steps:
S21: let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$. White noise of small amplitude is added to overcome mode mixing, so that the signal is distributed over the appropriate oscillation scales. $P$ different normally distributed Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$; $n = 1, 2, \ldots, L$, are selected. Superposing each noise sequence on the original atmospheric electric field signal gives:
$x_p(n) = x(n) + \omega_p(n), \quad p = 1, \ldots, P$
S22: all local maxima of the signal $x_p(n)$ are connected by a curve, defined as the upper envelope $f_p^{\max}(n)$; similarly, all local minima are connected by a curve, defined as the lower envelope $f_p^{\min}(n)$; the mean of the upper and lower envelopes is $m_p(n)$:
$m_p(n) = [f_p^{\max}(n) + f_p^{\min}(n)]/2$
S23: the envelope mean $m_p(n)$ is subtracted from the signal $x_p(n)$:
$c_{p1}(n) = x_p(n) - m_p(n)$
S24: if the $c_{p1}(n)$ obtained first does not satisfy the requirements of an intrinsic mode function (IMF), it is treated as a new signal $x_p(n)$ and steps S22 and S23 are repeated until $c_{p1}(n)$ is an IMF component. After the first IMF component $c_{p1}(n)$ is obtained, it is subtracted from the signal $x_p(n)$ to give the residual signal $x_{p1}(n)$; the residual signal is in turn treated as a new signal $x_p(n)$ and the above steps are repeated to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$. The signal $x_p(n)$ is thus finally decomposed into $K$ IMF components $c_{pk}(n)$ and a residue $r_p(n)$, namely:
$$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$$
S25: each of the $P$ noise-added sequences $\{x_p(n)\}$ is decomposed in this way; the $k$-th IMF component obtained from the $p$-th signal is $c_{pk}(n)$ and the residue is $r_p(n)$. The IMF components and residues of all $P$ signals are ensemble-averaged, and the averages are taken as the $k$-th order IMF component $c_k(n)$ and the residue $r(n)$ of the original signal $x(n)$. The original atmospheric electric field signal $x(n)$ can then be expressed as:
$$x(n) = \sum_{k=1}^{K} c_k(n) + r(n)$$
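The text refers to "the requirements of an intrinsic mode function" without restating them. In the standard EMD definition an IMF must (i) have numbers of extrema and zero crossings that are equal or differ by at most one, and (ii) have a zero local mean of its upper and lower envelopes. A minimal sketch of a check for condition (i) follows; the function names are hypothetical, and condition (ii) would in practice be tested against a small tolerance on the envelope mean.

```python
def count_extrema(x):
    """Number of strict interior local maxima and minima."""
    return sum(
        1 for i in range(1, len(x) - 1) if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    )


def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_imf_count_condition(x):
    """Condition (i): extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1
```

A pure oscillation about zero passes this check, while the same oscillation riding on a constant offset fails it, which is the situation further sifting is meant to remove.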
Step S3: sample entropy analysis, as follows:
S31: from the $k$-th order IMF $c_k(n)$, $n = 1, 2, \ldots, L$, starting at the $i$-th element, $\tau$ consecutive elements are taken to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$. Letting $i = 1, 2, \ldots, L-\tau+1$ gives the vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$, which form the $(L-\tau+1) \times \tau$ matrix $C$:
$$C = \begin{bmatrix} c_k(1) & c_k(2) & \cdots & c_k(\tau) \\ c_k(2) & c_k(3) & \cdots & c_k(\tau+1) \\ \vdots & \vdots & & \vdots \\ c_k(L-\tau+1) & c_k(L-\tau+2) & \cdots & c_k(L) \end{bmatrix}$$
The distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between their corresponding elements, i.e.:
$$d[C_\tau(i), C_\tau(j)] = \max_{m=0,\ldots,\tau-1} \left| c_k(i+m) - c_k(j+m) \right|, \quad 1 \le j \le L-\tau+1,\ j \ne i$$
S32: given a real number $r$, for each $i$ the number of distances $d[C_\tau(i), C_\tau(j)] \le r$ is counted, and the ratio of this number to the total number of distances $L-\tau$ is denoted $A_i^{\tau}(r)$:
$$A_i^{\tau}(r) = \frac{1}{L-\tau}\,\mathrm{num}\left\{ d[C_\tau(i), C_\tau(j)] \le r \right\}$$
The average of $A_i^{\tau}(r)$ over all $i$ ($1 \le i \le L-\tau+1$) is denoted $A^{\tau}(r)$:
$$A^{\tau}(r) = \frac{1}{L-\tau+1} \sum_{i=1}^{L-\tau+1} A_i^{\tau}(r)$$
From the $k$-th order IMF signal $c_k(n)$, starting at the $i$-th element, $\tau+1$ consecutive values are taken to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$
$A^{\tau+1}(r)$ is calculated in the same way. The sample entropy of the electric field signal $c_k(n)$ can then be estimated as:
$$H_k = -\ln \frac{A^{\tau+1}(r)}{A^{\tau}(r)}$$
The parameters are taken as $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, where $\mathrm{std}(c_k(n))$ is the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ is calculated in the same way.
S34: according to the sample entropy $H_k$ of each IMF, all IMFs are divided into three classes, which are summed to give the random component $u(n)$, the detail component $v(n)$ and the trend component $w(n)$, where $\sigma_1 > \sigma_2$ are threshold coefficients applied to the sample entropy $H$ of the original signal:
$u(n) = \sum c_k(n)$, where $H_k > \sigma_1 H$
$v(n) = \sum c_k(n)$, where $\sigma_2 H < H_k < \sigma_1 H$
$w(n) = \sum c_k(n) + r(n)$, where $H_k < \sigma_2 H$
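The grouping of IMFs by sample entropy can be sketched as follows. The sketch assumes the thresholds are ordered so that the random band lies above the detail band, which lies above the trend band; it therefore takes the two threshold coefficients as explicit `upper` and `lower` arguments, assigns boundary cases to the detail component, and adds the residue to the trend. The function name and the threshold values used in any call are hypothetical, since the source does not state them.

```python
def group_components(imfs, residue, entropies, h_ref, upper, lower):
    """Sum IMFs into random (u), detail (v) and trend (w) components.

    imfs      : list of IMFs, each a list of floats of equal length
    residue   : residue r(n) of the decomposition
    entropies : sample entropy H_k of each IMF
    h_ref     : sample entropy H of the original signal
    upper, lower : threshold coefficients, with upper > lower (assumed ordering)
    """
    length = len(residue)
    u = [0.0] * length
    v = [0.0] * length
    w = list(residue)  # the residue always belongs to the trend
    for c, h in zip(imfs, entropies):
        if h > upper * h_ref:
            target = u
        elif h < lower * h_ref:
            target = w
        else:
            target = v
        for n in range(length):
            target[n] += c[n]
    return u, v, w
```

Because every IMF and the residue are assigned to exactly one group, the three reconstructed components always sum back to the decomposed signal.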
Step S4: feature extraction, comprising the following steps:
S41: statistics such as the peak-to-peak value, energy, variance and difference are extracted from the random, detail and trend components respectively; for the $i$-th atmospheric electric field sample the statistics can be expressed as:
Figure BDA0002950003810000101
S42: features are extracted from the signal with a deep autoencoder. Each reconstructed component is taken as input, encoded by the encoder and decoded by the decoder so that the decoded data matches the input as closely as possible, and the output of the encoder is finally taken as a new feature.
Step S5: establishing the early warning model, specifically as follows:
S51: given $P$ atmospheric electric field sample sequences $x_i(n)$, $i = 1, 2, \ldots, P$, the $i$-th sample sequence can be reconstructed from the sequences $u_i(n)$, $v_i(n)$ and $w_i(n)$. The features of the reconstructed sequences are extracted and denoted as the sets $\{U_i\}$, $\{V_i\}$ and $\{W_i\}$ respectively.
(1) For the random component, an initial boosting tree $f_0(U_i) = 0$ is first defined;
(2) The output $f_1(U_i)$ for the $i$-th sample is obtained by minimizing the objective function $obj^{(1)}$, and the current prediction is expressed as:
$$\hat{y}_i^{(1)} = f_0(U_i) + f_1(U_i)$$
(3) After $T$ rounds of iteration, the final prediction is obtained:
$$\hat{y}_i^{(T)} = \sum_{t=0}^{T} f_t(U_i)$$
The objective function in round $t$ is:
$$obj^{(t)} = \sum_{i=1}^{P} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{s=1}^{t} \Omega(f_s)$$
where
$$\Omega(f_t) = \gamma R + \frac{1}{2}\lambda \sum_{j=1}^{R} \omega_j^2$$
is the regularization term, which balances the complexity of the model and avoids overfitting; $R$ is the number of leaf nodes of the current boosting tree, $\omega_j$ is the score of each leaf node, and $\lambda$ and $\gamma$ are regularization parameters. $l$ is the logarithmic loss function, i.e.
$$l(y_i, \hat{y}_i) = y_i \ln\left(1 + e^{-\hat{y}_i}\right) + (1 - y_i) \ln\left(1 + e^{\hat{y}_i}\right)$$
The aim of each iteration is to minimize the objective function, which, after a second-order Taylor expansion about $\hat{y}_i^{(t-1)}$ with the constant terms dropped, becomes:
$$obj^{(t)} \approx \sum_{i=1}^{P} \left[ g_i f_t(U_i) + \frac{1}{2} h_i f_t^2(U_i) \right] + \Omega(f_t)$$
where
$$g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad h_i = \frac{\partial^2 l\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}$$
Since each sample falls into exactly one leaf node, the loss function can be written as the sum of the losses over the leaf nodes, i.e.
$$obj^{(t)} \approx \sum_{j=1}^{R} \left[ \Big(\sum_{i \in I_j} g_i\Big) \omega_j + \frac{1}{2} \Big(\sum_{i \in I_j} h_i + \lambda\Big) \omega_j^2 \right] + \gamma R$$
where $I_j$ denotes the set of samples falling in leaf node $j$, and $f_t(U_i) = \omega_j$ for $i \in I_j$ is the score of the leaf node in which the $i$-th sample is located; the optimal leaf score is given by:
$$\omega_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$
In this case, the objective function simplifies further to
$$obj^{(t)} = -\frac{1}{2} \sum_{j=1}^{R} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma R$$
All features are traversed as candidate splits, the split giving the minimum loss function value is selected, and the tree function $f_t(U_i)$ is thereby determined. The early warning models for the detail component and the trend component are constructed in the same way.
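The split search mentioned at the end of S51 can be illustrated for a single feature. The sketch below scans the sorted feature values and scores each candidate threshold by the reduction in the simplified objective, using the standard gradient-boosting gain $\frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$, where $G$ and $H$ are the sums of $g_i$ and $h_i$ on each side. It is a plain-Python illustration rather than the XGBoost library's implementation, and the function name is hypothetical.

```python
def best_split(feature, grads, hess, lam=1.0, gamma=0.0):
    """Return (threshold, gain) of the best split on one feature, or (None, 0.0)."""
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    g_total, h_total = sum(grads), sum(hess)
    parent = g_total ** 2 / (h_total + lam)
    best_gain, best_threshold = 0.0, None
    g_left = h_left = 0.0
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hess[i]
        if feature[i] == feature[nxt]:
            continue  # cannot split between equal feature values
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = 0.5 * (
            g_left ** 2 / (h_left + lam) + g_right ** 2 / (h_right + lam) - parent
        ) - gamma
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (feature[i] + feature[nxt])
    return best_threshold, best_gain
```

Repeating this scan over every feature and keeping the split with the largest gain is equivalent to choosing the split that minimizes the objective; a positive `gamma` suppresses splits whose improvement does not justify an extra leaf.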
S52: a multi-classifier fusion method is adopted; the confusion matrices of the different classifiers are first computed:
Figure BDA0002950003810000121
where $k = 1, \ldots, K$, $K$ denotes the number of classifiers, and $n_{ij}$ denotes the number of samples from class $i$ that the classifier assigns to class $j$ ($i, j = 1, \ldots, M$, where $M$ is the number of sample classes); the final decision function is:
Figure BDA0002950003810000122
Figure BDA0002950003810000123
where $e_k$ denotes the $k$-th classifier.
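The patent's decision function is not recoverable from the text, so the following sketch should be read only as one commonly used confusion-matrix-based fusion rule, not as the method claimed: each classifier's confusion matrix is used to estimate the probability that a sample truly belongs to class $i$ given that the classifier output class $j$, and these probabilities are multiplied across classifiers. All function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted, num_classes):
    """cm[i][j] = number of samples of true class i that were predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted):
        cm[t][p] += 1
    return cm


def fuse_predictions(confusions, votes):
    """Combine one sample's votes (votes[k] = class output by classifier k).

    Assumed rule: belief(i) is the product over classifiers of the estimated
    P(true class = i | classifier k said votes[k]); the largest belief wins.
    """
    num_classes = len(confusions[0])
    beliefs = [1.0] * num_classes
    for cm, j in zip(confusions, votes):
        column_total = sum(cm[i][j] for i in range(num_classes))
        for i in range(num_classes):
            beliefs[i] *= cm[i][j] / column_total if column_total else 1.0 / num_classes
    return max(range(num_classes), key=lambda i: beliefs[i])
```

Unlike simple majority voting, this rule lets a single reliable classifier outweigh two less reliable ones, which is the kind of behaviour that motivates using confusion matrices in the fusion step.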
S53: in lightning early warning, the probability of detection (POD) and the false alarm rate (FAR) are important parameters for evaluating early warning performance, and are defined as follows:
$$POD = \frac{EA}{EA + FTW}, \qquad FAR = \frac{FA}{EA + FA}$$
where EA is the number of events in which lightning occurred and a warning was successfully issued, FTW is the number of events in which lightning occurred but no warning was issued, and FA is the number of events in which a warning was issued but no lightning occurred. In addition, the overall accuracy (OA) measures the overall recognition performance of the classifier.
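These scores are simple ratios of event counts. The sketch below assumes the usual definitions POD = EA / (EA + FTW) and FAR = FA / (EA + FA), and additionally assumes that the overall accuracy is the fraction of all cases classified correctly, which requires a count of correctly identified non-lightning cases; that argument (`correct_negatives`) and the function name are hypothetical.

```python
def warning_scores(ea, ftw, fa, correct_negatives):
    """Return (POD, FAR, OA) from event counts.

    ea  : lightning occurred and a warning was issued
    ftw : lightning occurred but no warning was issued
    fa  : a warning was issued but no lightning occurred
    correct_negatives : no warning and no lightning (assumed to be needed for OA)
    """
    pod = ea / (ea + ftw)
    far = fa / (ea + fa)
    oa = (ea + correct_negatives) / (ea + ftw + fa + correct_negatives)
    return pod, far, oa
```

For instance, 8 successful warnings, 2 missed events, 2 false alarms and 8 correct non-warnings correspond to a POD of 0.8, a FAR of 0.2 and an OA of 0.8.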
The following cost function $L$ is defined to measure the overall performance of the classifier:
$L = m_1(1 - OA) + m_2(1 - POD) + m_3\,FAR$
where $m_1$, $m_2$, $m_3$ are the cost coefficients of the three types of error: overall recognition error, missed alarm and false alarm. Assuming that the cost ratio of the classifier's overall recognition error to its early warning errors (missed alarms and false alarms together) is $1:\alpha$, and that the cost ratio of missed alarms to false alarms is $1:\beta$, that is
$m_1/(m_2 + m_3) = 1/\alpha, \quad m_2/m_3 = 1/\beta$
solving these two relations gives $m_2 = \alpha m_1/(1+\beta)$ and $m_3 = \alpha\beta m_1/(1+\beta)$, so the cost function can be written as:
$$L = m_1\left[(1 - OA) + \frac{\alpha}{1+\beta}(1 - POD) + \frac{\alpha\beta}{1+\beta}\,FAR\right]$$
The values of $\alpha$ and $\beta$ are adjusted to meet the requirements of different applications.
Table 1 below gives the experimental results on the test set. The multi-classifier fusion algorithm shows clear advantages over a single model and the ordinary voting algorithm, improving both the accuracy and the false alarm rate. FIG. 4 shows the original signal and the reconstructed components, and FIG. 5 shows the experimental results on the training set, in which the probability of detection and the false alarm rate both decrease as $\beta$ increases, as expected.
Table 1 Test set experimental results
Figure BDA0002950003810000131

Claims (7)

1. A lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting, characterized by comprising the following steps:
(1) collecting and processing atmospheric electric field signal data;
(2) decomposing the atmospheric electric field signal into intrinsic mode function (IMF) components at different scales, each containing different local characteristic information;
(3) calculating the sample entropy of each IMF component, classifying the IMF components as random, detail or trend components according to their entropy, and summing the IMFs in each class to obtain the corresponding reconstructed component;
(4) extracting feature values of the reconstructed components;
(5) constructing an early warning model: the extracted features are normalized and used as inputs for training, and the model parameters are tuned by grid search during training;
(6) lightning early warning: decomposing the test set, predicting with the model obtained in step (5) to obtain the prediction result for each component, fusing the prediction results, and issuing an early warning signal.
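The normalization and grid search named in step (5) can be sketched as follows. This is a minimal illustration that assumes min-max normalization per feature column and an exhaustive search that maximizes a user-supplied validation score; the claim does not specify either choice, and `score_fn` and the parameter names in the usage note are hypothetical.

```python
from itertools import product


def min_max_normalize(rows):
    """Scale every feature column of `rows` to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def grid_search(param_grid, score_fn):
    """Try every parameter combination and keep the one with the highest score.

    param_grid maps a parameter name to its list of candidate values;
    score_fn takes a dict of parameters and returns a validation score.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice `score_fn` would train the early warning model with the given parameters (for example a tree depth and a learning rate) and return its score on held-out data.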
2. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, characterized in that the method decomposes the data collected by the atmospheric electric field meter using the EEMD algorithm, calculates the sample entropy of the original data and of each intrinsic mode function, classifies and reconstructs the functions into a random component, a detail component and a trend component, extracts statistical features and autoencoder features from each reconstructed component, builds the early warning model using the XGBoost algorithm, and fuses the classifiers of the components to issue an early warning signal.
3. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (2), decomposing and reconstructing the atmospheric electric field signal, comprises the following steps:
(2.1) let the atmospheric electric field signal be $x(n)$, $n = 1, 2, \ldots, L$; select $P$ Gaussian white noise sequences $\{\omega_p(n)\}$, $p = 1, 2, \ldots, P$, $n = 1, 2, \ldots, L$, drawn from different normal distributions;
superpose each noise sequence on the original atmospheric electric field signal to obtain $x_p(n) = x(n) + \omega_p(n)$, $p = 1, \ldots, P$;
(2.2) connect all local maxima of the signal $x_p(n)$ with a curve, defined as the upper envelope $f_{p\max}(n)$;
connect all local minima with a curve, defined as the lower envelope $f_{p\min}(n)$;
denote the mean of the upper and lower envelopes by $m_p(n)$; then:
$m_p(n) = [f_{p\max}(n) + f_{p\min}(n)]/2$;
(2.3) subtract the envelope mean $m_p(n)$ from the signal $x_p(n)$: $c_{p1}(n) = x_p(n) - m_p(n)$;
(2.4) if the $c_{p1}(n)$ first obtained does not satisfy the IMF conditions, treat it as a new signal $x_p(n)$ and repeat steps (2.2) and (2.3) until $c_{p1}(n)$ is an IMF component;
after the first IMF component $c_{p1}(n)$ is obtained, subtract it from the signal $x_p(n)$ to obtain the residual signal $x_{p1}(n)$; treat the residual signal as a new signal $x_p(n)$ and repeat the above steps to obtain $c_{p2}(n), c_{p3}(n), \ldots, c_{pK}(n)$; the signal $x_p(n)$ is thus finally decomposed into $K$ IMF components $c_{pk}(n)$ and a residual $r_p(n)$, namely:
$x_p(n) = \sum_{k=1}^{K} c_{pk}(n) + r_p(n)$
(2.5) each of the $P$ noise-added sequences $\{x_p(n)\}$ is decomposed in this way; the $k$-th IMF component obtained from the $p$-th signal is $c_{pk}(n)$ and the residual is $r_p(n)$;
the IMF components and the residuals of all $P$ signals are ensemble-averaged, and the averages are taken as the $k$-th order IMF component $c_k(n)$ and the residual $r(n)$ of the original signal $x(n)$;
the original atmospheric electric field signal $x(n)$ is then expressed as:
$x(n) = \sum_{k=1}^{K} c_k(n) + r(n)$
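Steps (2.1) to (2.5) can be sketched in a few lines of standard-library Python. The sketch makes several simplifying assumptions that the claim does not state: envelopes are piecewise linear rather than spline curves, the signal end points are used as envelope knots, sifting stops after a fixed number of passes instead of testing the IMF conditions, and all noise sequences share one standard deviation. All function names are hypothetical.

```python
import random


def _extrema(x):
    """Indices of interior local maxima and minima of x."""
    inner = range(1, len(x) - 1)
    maxima = [n for n in inner if x[n - 1] < x[n] >= x[n + 1]]
    minima = [n for n in inner if x[n - 1] > x[n] <= x[n + 1]]
    return maxima, minima


def _envelope(x, idx):
    """Piecewise-linear curve through (n, x[n]) for n in idx plus both end points."""
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for n in range(a, b + 1):
            t = (n - a) / (b - a)
            env[n] = (1 - t) * x[a] + t * x[b]
    return env


def sift_once(x):
    """Steps (2.2)-(2.3): subtract the mean of the upper and lower envelopes."""
    maxima, minima = _extrema(x)
    upper, lower = _envelope(x, maxima), _envelope(x, minima)
    return [xn - 0.5 * (u + l) for xn, u, l in zip(x, upper, lower)]


def emd(x, num_imfs, num_sifts=10):
    """Step (2.4): extract num_imfs components; returns (imfs, residual)."""
    residual, imfs = list(x), []
    for _ in range(num_imfs):
        c = residual
        for _ in range(num_sifts):      # fixed sifting count (assumption)
            c = sift_once(c)
        imfs.append(c)
        residual = [rn - cn for rn, cn in zip(residual, c)]
    return imfs, residual


def eemd(x, num_trials, num_imfs, noise_std, seed=0):
    """Steps (2.1) and (2.5): decompose noisy copies and average order by order."""
    rng = random.Random(seed)
    length = len(x)
    mean_imfs = [[0.0] * length for _ in range(num_imfs)]
    mean_res = [0.0] * length
    for _ in range(num_trials):
        noisy = [xn + rng.gauss(0.0, noise_std) for xn in x]
        imfs, res = emd(noisy, num_imfs)
        for k in range(num_imfs):
            for n in range(length):
                mean_imfs[k][n] += imfs[k][n] / num_trials
        for n in range(length):
            mean_res[n] += res[n] / num_trials
    return mean_imfs, mean_res
```

Because each residual is defined by subtraction, the components and residual of one decomposition always sum back to the decomposed signal. After ensemble averaging the sum equals the original signal plus the mean of the added noise, which shrinks as the number of trials grows.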
4. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (3) comprises the following steps:
(3.1) from the $k$-th order IMF $c_k(n)$, $n = 1, 2, \ldots, L$, take $\tau$ consecutive elements starting at the $i$-th element to form the $\tau$-dimensional vector $C_\tau(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau-1)]$;
for $i = 1, 2, \ldots, L-\tau+1$, the resulting vectors $C_\tau(1), C_\tau(2), \ldots, C_\tau(L-\tau+1)$ form an $(L-\tau+1)\times\tau$ matrix $C$:
$$C = \begin{bmatrix} c_k(1) & c_k(2) & \cdots & c_k(\tau) \\ c_k(2) & c_k(3) & \cdots & c_k(\tau+1) \\ \vdots & \vdots & & \vdots \\ c_k(L-\tau+1) & c_k(L-\tau+2) & \cdots & c_k(L) \end{bmatrix}$$
the distance $d[C_\tau(i), C_\tau(j)]$ between the vectors $C_\tau(i)$ and $C_\tau(j)$ is defined as the maximum absolute difference between their corresponding elements:
$d[C_\tau(i), C_\tau(j)] = \max_{q=0,\ldots,\tau-1} |c_k(i+q) - c_k(j+q)|$, $1 \le j \le L-\tau+1$, $j \ne i$
(3.2) given a real number $r$, for each $i$ count the number of distances $d[C_\tau(i), C_\tau(j)]$ that are less than or equal to $r$, and compute the ratio of this count to the total number of distances $L-\tau$, denoted
Figure FDA0002950003800000024
Figure FDA0002950003800000025
Compute the mean of
Figure FDA0002950003800000031
over all $i$ ($1 \le i \le L-\tau+1$), denoted $A_\tau(r)$:
Figure FDA0002950003800000032
Next, from the $k$-th order IMF signal $c_k(n)$, take $\tau+1$ consecutive values of $c_k$ starting at the $i$-th element to form the $(\tau+1)$-dimensional vector:
$C_{\tau+1}(i) = [c_k(i), c_k(i+1), \ldots, c_k(i+\tau)]$;
and calculate $A_{\tau+1}(r)$; the sample entropy of the electric field signal component $c_k(n)$ is:
Figure FDA0002950003800000033
where the parameters are $\tau = 2$ and $r = 0.2\,\mathrm{std}(c_k(n))$, with $\mathrm{std}(c_k(n))$ the standard deviation of the signal; the sample entropy $H$ of the original signal $x(n)$ is then calculated;
(3.4) according to the sample entropy of each IMF, all IMFs are divided into three classes, which are summed to form the random component $u(n)$, the detail component $v(n)$ and the trend component $w(n)$ respectively, according to the following sample entropy criterion:
Figure FDA0002950003800000034
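The sample entropy calculation of steps (3.1) and (3.2) can be sketched as below. The sketch follows the counting described in the text (Chebyshev distance, self-matches excluded, each ratio normalized by the number of other vectors) and assumes the usual closing formula, the negative logarithm of $A_{\tau+1}(r)/A_\tau(r)$. Using the population standard deviation for the default $r$ is also an assumption, and the function names are hypothetical.

```python
import math
import statistics


def match_ratio(signal, dim, r):
    """Mean over i of the fraction of vectors j != i within distance r of vector i."""
    count_vec = len(signal) - dim + 1
    vectors = [signal[i:i + dim] for i in range(count_vec)]
    total = 0.0
    for i, vi in enumerate(vectors):
        matches = sum(
            1 for j, vj in enumerate(vectors)
            if j != i and max(abs(a - b) for a, b in zip(vi, vj)) <= r
        )
        total += matches / (count_vec - 1)
    return total / count_vec


def sample_entropy(signal, tau=2, r=None):
    """Sample entropy with embedding dimension tau and tolerance r."""
    if r is None:
        r = 0.2 * statistics.pstdev(signal)   # r = 0.2 * std, as in the text
    a_tau = match_ratio(signal, tau, r)
    a_next = match_ratio(signal, tau + 1, r)
    if a_tau == 0.0 or a_next == 0.0:
        return float("inf")                    # no matches: entropy undefined
    return -math.log(a_next / a_tau)
```

A strictly periodic sequence gives a value close to zero, while an irregular sequence gives a larger value, which is what allows the IMFs to be sorted into random, detail and trend classes.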
5. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein step (4), feature extraction, comprises the following steps:
(4.1) extracting the peak-to-peak value, energy, variance and difference statistics of the random, detail and trend components respectively; for the $i$-th atmospheric electric field sample, the statistics are expressed as:
Figure FDA0002950003800000035
(4.2) extracting features from the original signal using a deep autoencoder.
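The statistical features of step (4.1) can be computed per component as in the sketch below. The exact definitions are given only in the figure, so the choices here are assumptions: energy is the sum of squares, variance is the population variance, and the difference statistic is taken to be the mean absolute first difference. The function name is hypothetical.

```python
def component_statistics(component):
    """Peak-to-peak value, energy, variance and mean absolute first difference."""
    n = len(component)
    mean = sum(component) / n
    diffs = [b - a for a, b in zip(component, component[1:])]
    return {
        "peak_to_peak": max(component) - min(component),
        "energy": sum(v * v for v in component),
        "variance": sum((v - mean) ** 2 for v in component) / n,
        "mean_abs_diff": sum(abs(d) for d in diffs) / len(diffs),
    }
```

Applying the function to the random, detail and trend components of one sample yields the three feature groups that are later fed to the component classifiers.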
6. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein in step (5), constructing the early warning model comprises the following steps:
(5.1) for $P$ atmospheric electric field sample sequences $x_i(n)$, $i = 1, 2, \ldots, P$, the $i$-th sample sequence is reconstructed into the sequences $u_i(n)$, $v_i(n)$ and $w_i(n)$; the features of the reconstructed sequences are extracted and denoted as the sets $\{U_i\}$, $\{V_i\}$ and $\{W_i\}$ respectively;
(a) for the random component, an initial boosted tree $f_0(U_i) = 0$ is first defined;
(b) the output $f_1(U_i)$ for the $i$-th sample is obtained by minimizing the objective function $obj^{(1)}$, and the current prediction is expressed as:
Figure FDA0002950003800000041
(c) after $T$ rounds of iteration, the final prediction is obtained:
Figure FDA0002950003800000042
The objective function in round $t$ is:
Figure FDA0002950003800000043
wherein,
Figure FDA0002950003800000044
is the regularization term, $R$ is the number of leaf nodes of the current boosted tree, $\omega_j$ is the score of leaf node $j$, and $\lambda$ and $\gamma$ are regularization parameters;
l is a logarithmic loss function, then:
Figure FDA0002950003800000045
the objective of each iteration is to minimize the objective function, which becomes:
Figure FDA0002950003800000046
in the formula,
Figure FDA0002950003800000047
since each sample falls into exactly one leaf node, the loss function can be rewritten as the sum of the losses over the leaf nodes, i.e.:
Figure FDA0002950003800000048
where $I_j$ denotes the set of samples falling in leaf node $j$, and
Figure FDA0002950003800000049
denotes the score of the leaf node in which the $i$-th sample falls, which gives:
Figure FDA00029500038000000410
the objective function is simplified to:
Figure FDA0002950003800000051
all features are traversed as candidate splits to find the minimum loss function value and determine the tree function $f_t(U_i)$;
the early warning models for the detail component and the trend component are constructed in the same way;
(5.2) a multi-classifier fusion method is adopted, in which the confusion matrices of the different classifiers are computed first:
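The split search described in (5.1) can be illustrated with the standard XGBoost closed forms, which are assumed here because the patent's own expressions survive only as figures. With $G_j$ and $H_j$ the sums of the first and second derivatives of the logarithmic loss over the samples in leaf $j$, the optimal leaf score is $\omega_j = -G_j/(H_j+\lambda)$ and the minimized objective is $-\tfrac{1}{2}\sum_j G_j^2/(H_j+\lambda) + \gamma R$. The function names are hypothetical, and only a single feature is scanned to keep the sketch short.

```python
import math


def logloss_grad_hess(label, raw_score):
    """First and second derivatives of the log loss with respect to the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - label, p * (1.0 - p)


def leaf_weight(g_sum, h_sum, lam):
    """Optimal leaf score for gradient sum g_sum and Hessian sum h_sum."""
    return -g_sum / (h_sum + lam)


def structure_score(leaves, lam, gamma):
    """Minimized objective for a tree whose leaves have statistics (G_j, H_j)."""
    return -0.5 * sum(g * g / (h + lam) for g, h in leaves) + gamma * len(leaves)


def best_split(feature, labels, raw_scores, lam=1.0, gamma=0.0):
    """Scan thresholds on one feature; return (threshold, objective).

    The threshold is None when leaving the node unsplit gives the lowest objective.
    """
    stats = [logloss_grad_hess(y, s) for y, s in zip(labels, raw_scores)]
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    g_tot = sum(g for g, _ in stats)
    h_tot = sum(h for _, h in stats)
    best = (None, structure_score([(g_tot, h_tot)], lam, gamma))
    g_left = h_left = 0.0
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += stats[i][0]
        h_left += stats[i][1]
        if feature[i] == feature[nxt]:
            continue                      # no threshold between equal values
        leaves = [(g_left, h_left), (g_tot - g_left, h_tot - h_left)]
        objective = structure_score(leaves, lam, gamma)
        if objective < best[1]:
            best = (0.5 * (feature[i] + feature[nxt]), objective)
    return best
```

Starting from $f_0 = 0$, every sample has predicted probability 0.5, so the first tree is driven entirely by the labels. Repeating the scan over all features and over the resulting child nodes grows the tree $f_t$.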
Figure FDA0002950003800000052
where $k = 1, \ldots, K$ indexes the classifiers, $K$ denotes the number of classifiers, and $n_{ij}$ denotes the number of samples from class $i$ that the classifier judges to be class $j$ ($i, j = 1, \ldots, M$, where $M$ is the number of sample classes); the final decision function is:
Figure FDA0002950003800000053
Figure FDA0002950003800000054
where $e_k$ denotes the $k$-th classifier.
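One common way to fuse classifiers through their confusion matrices is sketched below. It is an assumption of this sketch, not the decision function of the claim, which survives only as a figure: each classifier's vote $j$ is converted into the empirical probability that the true class is $i$, namely $n_{ij}$ divided by the column total of column $j$, and the per-classifier probabilities are multiplied into a belief for each class. The function name is hypothetical.

```python
def fuse_predictions(confusions, votes):
    """Fuse class votes using per-classifier confusion matrices.

    confusions[k][i][j] is n_ij for classifier k (true class i judged as j);
    votes[k] is the class predicted by classifier k for the current sample.
    Returns (fused class, normalized belief per class).
    """
    num_classes = len(confusions[0])
    beliefs = []
    for i in range(num_classes):
        belief = 1.0
        for cm, j in zip(confusions, votes):
            column_total = sum(cm[row][j] for row in range(num_classes))
            # Empirical P(true class = i | classifier voted j); uniform if unseen
            belief *= cm[i][j] / column_total if column_total else 1.0 / num_classes
        beliefs.append(belief)
    total = sum(beliefs)
    if total:
        beliefs = [b / total for b in beliefs]
    else:
        beliefs = [1.0 / num_classes] * num_classes
    decision = max(range(num_classes), key=beliefs.__getitem__)
    return decision, beliefs
```

Unlike plain majority voting, this rule lets a classifier that is rarely wrong about a given output outweigh two less reliable classifiers, although in the two-class lightning case the outcome often coincides with the majority.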
7. The lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting according to claim 1, wherein the following cost function is defined to measure the overall performance of the classifier:
$L = m_1(1-\mathrm{OA}) + m_2(1-\mathrm{POD}) + m_3\,\mathrm{FAR}$
where OA is the overall accuracy, POD is the detection probability, FAR is the false alarm rate, and $m_1$, $m_2$ and $m_3$ are the cost coefficients of the overall recognition error, missed alarms and false alarms, respectively.
CN202110204997.2A 2021-02-24 2021-02-24 Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting Pending CN112818912A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110204997.2A CN112818912A (en) 2021-02-24 2021-02-24 Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting

Publications (1)

Publication Number Publication Date
CN112818912A true CN112818912A (en) 2021-05-18

Family

ID=75865292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110204997.2A Pending CN112818912A (en) 2021-02-24 2021-02-24 Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting

Country Status (1)

Country Link
CN (1) CN112818912A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109142896A (en) * 2018-07-25 2019-01-04 南京信息工程大学 Lightning Warning method based on three-dimensional atmospheric electric field and MEMD

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
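Zhou Shengshan (周盛山) et al.: "Application of EEMD and CNN-XGBoost to short-term wind power prediction", Electronic Measurement Technology (《电子测量技术》), pages 55 - 61 *
Xu Wei (徐伟) et al.: "Lightning early warning method based on ensemble empirical mode decomposition and extreme gradient boosting", Chinese Journal of Scientific Instrument (《仪器仪表学报》), pages 235 - 243 *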

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255510A (en) * 2021-05-21 2021-08-13 南京信息工程大学 Thunderstorm cloud point charge moving path imaging method based on multi-time scale DBSCAN and sample entropy
CN113255510B (en) * 2021-05-21 2023-07-25 南京信息工程大学 Thunderstorm cloud point charge moving path imaging method based on multi-time scale DBSCAN and sample entropy
CN114252706A (en) * 2021-12-15 2022-03-29 华中科技大学 Lightning early warning method and system
CN114252706B (en) * 2021-12-15 2023-03-14 华中科技大学 Lightning early warning method and system
CN117617921A (en) * 2024-01-23 2024-03-01 吉林大学 Intelligent blood pressure monitoring system and method based on Internet of things
CN117617921B (en) * 2024-01-23 2024-03-26 吉林大学 Intelligent blood pressure monitoring system and method based on Internet of things

Similar Documents

Publication Publication Date Title
CN112818912A (en) Lightning early warning method based on integrated empirical mode decomposition and extreme gradient lifting
CN110609477B (en) Electric power system transient stability discrimination system and method based on deep learning
CN116229380B (en) Method for identifying bird species related to bird-related faults of transformer substation
CN109884419A (en) A kind of wisdom grid power quality on-line fault diagnosis method
CN111537853A (en) Intelligent detection method for partial discharge of switch cabinet based on multi-source heterogeneous data analysis
CN112668611B (en) Kmeans and CEEMD-PE-LSTM-based short-term photovoltaic power generation power prediction method
CN114169374B (en) Cable-stayed bridge stay cable damage identification method and electronic equipment
CN112686093A (en) Fusion partial discharge type identification method based on DS evidence theory
CN115879048A (en) Series arc fault identification method and system based on WRFMDA model
CN113987910A (en) Method and device for identifying load of residents by coupling neural network and dynamic time planning
CN115239971A (en) GIS partial discharge type recognition model training method, recognition method and system
CN116756594A (en) Method, system, equipment and medium for detecting abnormal points of power grid data
CN114280490A (en) Lithium ion battery state of charge estimation method and system
CN113283474A (en) SVM-based partial discharge pattern recognition method
CN117093894A (en) Partial discharge modeling analysis method, system and device based on neural network
CN117390368A (en) Lightning probability calculation method, device and equipment for wind turbine and storage medium
CN114818827A (en) Non-invasive load decomposition method based on seq2point network
CN112163494A (en) Video false face detection method and electronic device
CN117171702A (en) Multi-mode power grid fault detection method and system based on deep learning
CN113780346B (en) Priori constraint classifier adjustment method, system and readable storage medium
CN111025100A (en) Transformer ultrahigh frequency partial discharge signal mode identification method and device
CN113902581A (en) Power utilization abnormity detection method based on depth self-encoder Gaussian mixture model
CN117496223A (en) Light insulator defect detection method and device based on deep learning
CN115983507B (en) Method and system for predicting broadband oscillation risk of section of power grid of transmitting end source
CN110261773B (en) Aviation generator fault symptom extraction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210518