CN112651463A - Construction method of double-forecast model of hail weather in plateau area - Google Patents

Info

Publication number
CN112651463A
Authority
CN
China
Prior art keywords
hail
model
short
samples
constructing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110008969.3A
Other languages
Chinese (zh)
Inventor
张军
张晏
王萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202110008969.3A priority Critical patent/CN112651463A/en
Publication of CN112651463A publication Critical patent/CN112651463A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for constructing a double-forecast model of hail weather in a plateau area, comprising the following steps: 1) collecting sufficient hail samples and short-time strong precipitation samples, and dividing all samples into a training set, a verification set and a test set in the ratio 6:2:2; 2) transplanting 6 mechanism features from plain areas, and constructing 2 elevation features based on 90 m resolution elevation data; 3) constructing a random-forest-based hail/short-time strong precipitation classification and identification model that jointly uses the features reflecting the hail formation mechanism and the elevation features; 4) providing a Bayes minimum-error decision for hail/short-time strong precipitation based on principal component analysis; 5) proposing an AND fusion strategy for the two models. The hail weather forecast model built from the 6 features reflecting the hail monomer formation mechanism and the 2 melting distance features can distinguish about 90% of hail processes from short-time strong precipitation processes in plateau areas.

Description

Construction method of double-forecast model of hail weather in plateau area
Technical Field
The invention relates to hail disaster prediction in the field of meteorology, in particular to a construction method of a double-forecast model of hail weather in a plateau area.
Background
Hail is a severe meteorological disaster produced by strong convective weather systems; it is characterized by a small spatial extent, a short life cycle and a rapid onset. Hail formation requires an unstable layer of considerable thickness in the atmosphere, a tilted, non-uniform updraft within the cloud that can support hailstones for a long time, and an accumulation zone of supercooled liquid water above the level of maximum updraft speed [1]. In addition, underlying-surface factors such as vegetation and terrain influence the development of strong convection. For example, because the heat capacity of soil is smaller than that of seawater, inland areas are heated by radiation more readily than the sea and convection there is more unstable, so hail is more frequent inland than on the coast; the size and spatiotemporal distribution of hail are also markedly affected by terrain [2]. In 2018, Zheng et al. analyzed the influence of geographical factors on hail formation [3], and forecasting and early-warning methods for hail in Yunnan have also been studied [4]. China's terrain shows a pronounced three-step distribution, and undulating mountains often force convective lifting, which raises the probability of hail. High-incidence areas of hail are mainly concentrated in the Qinghai-Tibet Plateau, Qinghai, eastern Inner Mongolia and Northeast China, whereas the frequency in China's plain areas is lower [5]. However, the characteristics of hail at different altitudes, and the differences among them, have rarely been studied quantitatively.
Nowcasting of strong convective weather usually relies on remote sensing data such as radar and satellite observations, and radar parameters designed by specific methods can be used to forecast the occurrence of certain types of strong convection. In 2013, Wang et al. proposed features such as sag, kurtosis and high echo ratio; in 2014, Li introduced the weighted core height (monomer core liquid ratio) and the monomer core mean value; and in 2016, a study of hail and short-time strong precipitation produced by thunderstorm systems within 50 km proposed features such as a density feature and the accumulated liquid water content. Together these features describe strong convective monomers, and machine learning models trained on them achieved good classification and identification of hail and short-time strong precipitation in the plain areas of China [6]-[8]. In 2017, 10 features usable for identifying early-stage hail were proposed, filling the gap in early hail identification [9]. In 2019, Shi et al. proposed a method for detecting weak echo regions in real time from radar echo-bottom-height images, together with a parameter that quantifies the scale of the weak echo region [10]. Also in 2019, Czernecki et al. forecast hail by combining remote sensing data and sounding data with machine learning techniques [11].
In the process of implementing the invention, the inventor finds that at least the following disadvantages and shortcomings exist in the prior art:
According to the relevant data, China's plateau areas and the surrounding mountainous areas have been high-incidence areas of strong convective weather for nearly sixty years. Although researchers have achieved much in the study of strong convective weather, quantitative research on forecasting strong convective weather in plateau areas is still lacking, and existing hail indices and parameters perform inadequately when applied directly to hail forecasting in China's plateau areas. Forecasters have accumulated a certain amount of subjective forecasting experience; how to combine that experience with the objective environment in which hail occurs is an urgent problem in hail forecasting.
[References]
[1] Dennis E J, Kumjian M R. The impact of vertical wind shear on hail growth in simulated supercells[J]. Journal of the Atmospheric Sciences, 2017, 74(3): 641–663.
[2] Environmental physical quantity statistical characteristics of hail weather in two-stage step terrain areas of China [ J ]. plateau weather, 2018, 37 (01): 185-196.
[3] Zheng valia, Yang you flood, Liuzhi, Liu Xiao Lu, Sichuan province hail distribution and topographic factor relationship analysis [ J ] meteorological science, 2018, 46 (06): 1280-1286.
[4] The forecast and early-warning method of hail in Yunnan province researches [ J ] weather, 2014, 40 (2): 174-185.
[5] Xiao glu, yu yuan, sun xibao, etc. small and medium scale strong convective weather climatological features in china [ J ] climate and environmental studies, 2019, 24 (2): 199-213.
[6] Wangping, Pangzhou. hail identification model [ J ] physics report based on significant features, 2013(06): 515-.
[7] Li Smart, Strong hail automatic identification technology and hail suppression operation decision method research [ D ]. Tianjin university, 2014.
[8] Wangli, Kaoyeti, Li smart, etc. Classification and identification method of thunderstorm system within 50km research [ J ] meteorological data, 2016,42(2): 230-.
[9]Wang P,Shi J,Hou J,et al.The Identification of Hail Storms in the Early Stage Using Time Series Analysis[J].Journal of Geophysical Research,2018,123(3):929-947.
[10]Shi Junzhi,Wang Ping,Wang Di,et al.Radar-based automatic identification and quantification of weak echo regions for hail nowcasting[J].Atmosphere,2019,10(6):325.
[11]Czernecki B,Taszarek M,Marosz M,et al.Application of machine learning to large hail prediction-The importance of radar reflectivity,lightning occurrence and convective parameters derived from ERA5[J].Atmospheric Research,2019,227:249–262.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a dual-model forecasting scheme for hail weather in plateau areas, addressing the low accuracy and high false-alarm rate of hail nowcasting in such areas.
The technical scheme of the invention is as follows: a dual-model forecasting scheme for hail weather in plateau areas, comprising the following steps:
1) Collecting and preparing samples. Sufficient hail samples from the plateau area, together with short-time strong precipitation samples that are easily confused with hail, are collected as the basis of the model and divided into a training set, a verification set and a test set in a fixed ratio.
2) Establishing features. First, 6 mechanism features of hail weather are transplanted from plain areas; then 2 elevation features are constructed.
3) Constructing a classification and identification model based on random forest. A random-forest-based hail/short-time strong precipitation classification and identification model is constructed that jointly uses the features reflecting the hail formation mechanism and the elevation features.
4) Constructing a probability model based on the Bayes minimum-error decision. First, the features are spatially transformed by principal component analysis; second, a probability identification model of hail weather is constructed based on the Bayes minimum-error decision.
5) Proposing a fusion strategy.
Step one: collecting and preparing samples;
1-1) The invention studies the classification and identification of hail clouds in the plateau areas of China, so hail is selected as the positive sample and short-time strong precipitation, which is easily confused with hail, as the negative sample. Sufficient hail samples and short-time strong precipitation samples are collected. The hail identification model is constructed and its results are analyzed using 6 years of historical data (2010-2015) from Guizhou Province, comprising 95 hail processes with 1402 hail monomers, and 110 short-time strong precipitation processes, occurring in the same or similar periods as the 95 hail processes, with 1210 strong precipitation monomers.
1-2) dividing all samples into a training set, a verification set and a test set according to the ratio of 6:2:2, wherein the training set is used for training the model, the verification set is used for adjusting parameters, and the test set is used for testing the classification and identification performance of the model.
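The 6:2:2 division described in 1-2) can be sketched as follows. This is a minimal illustration only: the text does not say whether samples are divided per monomer or per process, or whether the split is stratified, so the per-monomer random shuffle, the function name `split_samples` and the fixed seed are all assumptions.

```python
import random


def split_samples(samples, ratios=(6, 2, 2), seed=0):
    """Shuffle samples and split them into training/verification/test parts.

    Assumption: a plain random split per monomer; the source does not state
    how the division was actually carried out.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 1402 hail monomers (label 1) and 1210 strong precipitation monomers (label 0)
labels = [1] * 1402 + [0] * 1210
train, val, test = split_samples(labels)
print(len(train), len(val), len(test))
```

Integer division keeps the three parts disjoint and exhaustive; any rounding remainder is assigned to the test set.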
Step two: establishing characteristics;
2-1) selecting the sag, the effective thickness, the monomer core liquid state ratio, the core average value, the high echo ratio and the kurtosis as the mechanism characteristics of the strong convection monomer;
2-2) In view of the influence of altitude on the classification and identification of hail and short-time strong precipitation, the maximum elevation and the mean elevation of the ground area corresponding to the monomer core area are extracted. The maximum elevation is the highest point of that ground area; from it the shortest melting distance, i.e. the distance over which hail melts before first reaching the ground, is calculated by subtracting the elevation from the height of the 0 °C layer, as in formula (1).
$H_{\mathrm{min\_melt}} = H_0 - H_{\max}$ (1)
In the formula, $H_0$ is the height of the 0 °C layer and $H_{\max}$ is the maximum elevation in the ground area corresponding to the monomer core area. The mean melting distance is calculated in the same way, as in formula (2).
$H_{\mathrm{mean\_melt}} = H_0 - \bar{H}$ (2)
In the formula, $\bar{H}$ is the mean elevation in the ground area corresponding to the monomer core area.
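The two melting-distance features of formulas (1) and (2) can be computed as in the sketch below. The function name `melting_distances`, the use of metres, and the numeric values are illustrative assumptions; in practice the elevations would come from the 90 m resolution elevation data mentioned in the abstract.

```python
def melting_distances(h0_m, elevations_m):
    """Return (shortest, mean) melting distance in metres.

    h0_m         -- height of the 0 degC layer (assumed to be in metres)
    elevations_m -- terrain elevations inside the ground area under the
                    monomer core area
    """
    if not elevations_m:
        raise ValueError("at least one elevation value is required")
    h_max = max(elevations_m)
    h_mean = sum(elevations_m) / len(elevations_m)
    # Formula (1): H0 - Hmax; formula (2): H0 - mean elevation
    return h0_m - h_max, h0_m - h_mean


# Illustrative numbers only, not taken from the source
shortest, mean = melting_distances(4500.0, [1100.0, 1250.0, 1400.0])
print(shortest, mean)
```

Because the maximum elevation is never below the mean elevation, the shortest melting distance is never larger than the mean melting distance.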
Step three: constructing a classification recognition model based on random forests;
3-1) selecting the 8 features to form a feature vector, and constructing a hail recognition model based on a random forest by taking a short-time strong precipitation monomer as a counter example;
3-2) The adjustable parameters of the random forest model are the number of base classifiers (decision trees) $C_1$, the number of features assigned to each decision tree $C_2$, the maximum tree depth $C_3$, the minimum number of samples required to split a node $C_4$, and the minimum number of samples in a leaf node $C_5$. Because the problem has thousands of positive and negative samples and fewer than 10 features, $C_2$ is fixed to the full number of features and $C_3$ and $C_4$ are left unrestricted, so that only $C_1$ and $C_5$ are optimized. The optimization index is the critical success index (CSI):
$\mathrm{CSI} = \dfrac{A}{A + B + C}$ (3)
In the formula, A is the number of correctly predicted hail monomers, B is the number of short-time strong precipitation monomers falsely reported as hail, and C is the number of missed hail monomers.
A grid search on the verification set gives the optimal solution $C_1 = 17$, $C_5 = 27$, at which the CSI is 72.14%.
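The critical success index of formula (3) can be computed from true and predicted labels as in the following sketch. The function name and the label coding (1 for hail, 0 for short-time strong precipitation) are assumptions made for illustration.

```python
def critical_success_index(y_true, y_pred, positive=1):
    """CSI = A / (A + B + C): hits over hits, false alarms and misses."""
    hits = false_alarms = misses = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            hits += 1            # A: hail correctly predicted
        elif pred == positive:
            false_alarms += 1    # B: precipitation reported as hail
        elif truth == positive:
            misses += 1          # C: hail that was missed
    denom = hits + false_alarms + misses
    return hits / denom if denom else 0.0


# Toy labels, not data from the source: 3 hits, 1 false alarm, 1 miss
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1]
csi = critical_success_index(y_true, y_pred)
print(csi)
```

Correctly rejected precipitation monomers do not enter the index, which is why CSI suits problems where false alarms and misses matter more than overall accuracy.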
Step four: constructing a probability model based on Bayes minimum error decision;
4-1) Normalizing the 8 features. First, the 8 features are denoted in turn as
$Y = (y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8)$ (4)
The mean $\mu_i$ and the variance $\sigma_i^2$ ($i = 1, 2, \dots, 8$) of each of the 8 features are obtained from the training samples. The 8 raw features are then normalized to
$\tilde{Y} = (\tilde{y}_1, \tilde{y}_2, \dots, \tilde{y}_8)$ (5)
where
$\tilde{y}_i = \dfrac{y_i - \mu_i}{\sigma_i}$ (6)
A new feature vector is obtained by principal component analysis:
$Pca = (pca_1, pca_2, pca_3, pca_4, pca_5, pca_6, pca_7, pca_8)$ (7)
where the $i$-th principal component is
$pca_i = \sum_{j=1}^{8} a_{ij}\,\tilde{y}_j$ (8)
and $pca_i \perp pca_j$ for $i \ne j$. Principal component analysis thus transforms the 8-dimensional vector $\tilde{Y}$ into a new 8-dimensional feature vector $Pca$ whose components are pairwise uncorrelated. Each component is a linear combination of all components of $\tilde{Y}$, and its 8 weighting coefficients form the eigenvector corresponding to the eigenvalue $\lambda_i$ of the sample covariance matrix ($i = 1, 2, \dots, 8$; $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_8$);
4-2) The first principal component (one dimension) is selected to describe hail monomers and short-time strong precipitation monomers;
4-3) A Bayesian classification model is selected to solve the classification of hail and short-time strong precipitation. To simplify the analysis, the frequency distribution histogram of the first principal component is converted into a percentage stacked chart, to which a continuous curve is fitted; this curve serves as the probability identification model of hail weather;
4-4) Let the total numbers of hail and strong precipitation samples be $N_1$ and $N_2$ respectively, let the number of missed hail samples be $S_2$ and the number of false alarms be $S_1$, and let $\omega_1$ and $\omega_2$ denote the hail and short-time strong precipitation classes respectively. With $\alpha$ as the classification threshold, the total error rate $E$ is:
$E = \dfrac{S_1 + S_2}{N_1 + N_2}$ (9)
4-5) As formula (9) shows, determining the optimal threshold $\alpha$ is equivalent to minimizing $S_1 + S_2$. $pca_1$ is computed for all verification-set samples by formula (8) with $i = 1$; different probability thresholds are then set to obtain different values of $S_1 + S_2$, and the optimal probability threshold is found to be 0.4.
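The threshold search in 4-5) can be sketched as a scan over candidate thresholds that keeps the one with the fewest errors. The sketch assumes that each sample already has a hail probability from the fitted curve and that a sample is called hail when this probability is at least the threshold; the function name and the toy numbers are illustrative, not data from the source.

```python
def best_threshold(hail_probs, labels, thresholds):
    """Return (alpha, errors) minimising S1 + S2 over the candidates.

    S1 -- strong precipitation samples (label 0) reported as hail
    S2 -- hail samples (label 1) that were missed
    Assumption: a sample is classified as hail when prob >= alpha.
    """
    best_alpha, best_errors = None, None
    for alpha in thresholds:
        s1 = sum(1 for p, y in zip(hail_probs, labels) if p >= alpha and y == 0)
        s2 = sum(1 for p, y in zip(hail_probs, labels) if p < alpha and y == 1)
        if best_errors is None or s1 + s2 < best_errors:
            best_alpha, best_errors = alpha, s1 + s2
    return best_alpha, best_errors


# Toy verification-set probabilities and labels
probs = [0.9, 0.8, 0.65, 0.55, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
candidates = [i / 10 for i in range(1, 10)]
alpha, errors = best_threshold(probs, labels, candidates)
print(alpha, errors)
```

Since $N_1 + N_2$ is fixed for a given verification set, minimising the error count is the same as minimising the total error rate.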
Step five: proposing a fusion strategy;
5-1) For convenience, the random-forest-based single-volume-scan identification model and the probability identification model based on the Bayes minimum-error decision are denoted model 1 and model 2 respectively.
5-2) The fusion strategy is as follows: if model 1 and model 2 both identify hail, the result is hail; otherwise the result is short-time strong precipitation.
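The AND fusion rule of 5-2) reduces to a single logical conjunction, sketched below. The function name, the string labels and the convention that model 2 reports hail when its probability is at least the threshold are assumptions; the default of 0.4 is the optimal probability threshold stated in 4-5).

```python
def fuse_and(model1_is_hail, model2_hail_prob, alpha=0.4):
    """Report hail only when both models identify hail.

    model1_is_hail   -- bool output of the random forest model (model 1)
    model2_hail_prob -- hail probability from the Bayes model (model 2)
    alpha            -- probability threshold for model 2
    """
    model2_is_hail = model2_hail_prob >= alpha
    if model1_is_hail and model2_is_hail:
        return "hail"
    return "short-time strong precipitation"


print(fuse_and(True, 0.7))
print(fuse_and(True, 0.2))
print(fuse_and(False, 0.9))
```

Requiring agreement can only remove hail reports relative to either model alone, which is how the strategy trades some detections for a lower false-alarm rate.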
Compared with the prior art, the invention has the following beneficial effects:
First, the selected random forest model has strong generalization ability and resistance to overfitting, and its input features need neither standardization nor orthogonalization; second, a probability identification model is constructed from the Bayes minimum-error decision on the basis of principal component analysis; finally, a fusion scheme is provided that maintains the identification rate of hail weather in plateau areas while greatly reducing the false-alarm rate.
Drawings
FIG. 1 is a flow chart of a dual forecast model scheme of hail weather in a plateau area provided by the invention;
FIG. 2a is a histogram of the frequency distribution of sag (the three colors light, medium and dark in the various sub-graphs of FIG. 2 represent short-time heavy precipitation samples, hail samples and their overlap regions, respectively);
FIG. 2b is a histogram of the frequency distribution of the effective thickness;
FIG. 2c is a histogram of the frequency distribution of the liquid ratio of the monomer core;
FIG. 2d is a frequency distribution histogram of the kernel mean;
FIG. 2e is a histogram of the frequency distribution of the high echo ratio;
FIG. 2f is a histogram of the frequency distribution of kurtosis;
FIG. 2g is a histogram of the frequency distribution of the shortest melting distance;
FIG. 2h is a histogram of the frequency distribution of the mean melting distance;
FIG. 3a is a frequency distribution histogram of $pca_1$ (the three colors light, medium and dark in each sub-graph of FIG. 3 represent short-time strong precipitation samples, hail samples and their overlapping regions, respectively);
FIG. 3b is a frequency distribution histogram of $pca_2$;
FIG. 3c is a frequency distribution histogram of $pca_3$;
FIG. 4 is a percentage stacked chart of the first principal component;
FIG. 5 is a schematic diagram of determining an optimal threshold;
FIG. 6a is a hail process test result;
FIG. 6b is a short-time strong precipitation process test result.
Detailed Description
The technical solutions of the present invention are further described in detail with reference to the accompanying drawings and specific embodiments, which are only illustrative of the present invention and are not intended to limit the present invention.
The invention provides a double-model forecast scheme for hail weather in a plateau area, which is designed according to the following steps: based on the hail mechanism class characteristics and the elevation characteristics, a random forest-based hail/short-time strong precipitation classification and identification model is constructed, a Bayes minimum error decision hail/short-time strong precipitation classifier established on the basis of principal component analysis is provided, and finally, an 'AND' fusion strategy of the two models is provided.
As shown in FIG. 1, the method mainly comprises the steps of sample collection and production, feature establishment and feasibility analysis, classification recognition model construction based on random forests, probability model construction based on Bayesian minimum error decision and fusion strategy development. The specific contents are as follows:
Step one: collecting and preparing samples;
1-1) The invention studies the classification and identification of hail clouds in the plateau areas of China, so hail is selected as the positive sample and short-time strong precipitation, which is easily confused with hail, as the negative sample. Sufficient hail samples and short-time strong precipitation samples are collected. The hail identification model is constructed and its results are analyzed using 6 years of historical data (2010-2015) from Guizhou Province, comprising 95 hail processes with 1402 hail monomers, and 110 short-time strong precipitation processes, occurring in the same or similar periods as the 95 hail processes, with 1210 strong precipitation monomers.
1-2) dividing all samples into a training set, a verification set and a test set according to the ratio of 6:2:2, wherein the training set is used for training the model, the verification set is used for adjusting parameters, and the test set is used for testing the classification and identification performance of the model.
Step two: establishing characteristics;
2-1) The sag, effective thickness, monomer core liquid ratio, core mean value, high echo ratio and kurtosis are selected as the mechanism features of strong convective monomers; the frequency distribution histograms of the 8 features are shown in FIG. 2;
2-2) In view of the influence of altitude on the classification and identification of hail and short-time strong precipitation, the maximum elevation and the mean elevation of the ground area corresponding to the monomer core area are extracted. The maximum elevation is the highest point of that ground area; from it the shortest melting distance, i.e. the distance over which hail melts before first reaching the ground, is calculated by subtracting the elevation from the height of the 0 °C layer, as in formula (1).
$H_{\mathrm{min\_melt}} = H_0 - H_{\max}$ (1)
In the formula, $H_0$ is the height of the 0 °C layer and $H_{\max}$ is the maximum elevation in the ground area corresponding to the monomer core area. The mean melting distance is calculated in the same way, as in formula (2).
$H_{\mathrm{mean\_melt}} = H_0 - \bar{H}$ (2)
In the formula, $\bar{H}$ is the mean elevation in the ground area corresponding to the monomer core area.
Step three: constructing a classification recognition model based on random forests;
3-1) selecting 8 features provided in the tables 1 and 2 to form a feature vector, and constructing a hail identification model by taking a short-time strong precipitation monomer as a counter example;
3-2) The adjustable parameters of the random forest model are the number of base classifiers (decision trees) $C_1$, the number of features assigned to each decision tree $C_2$, the maximum tree depth $C_3$, the minimum number of samples required to split a node $C_4$, and the minimum number of samples in a leaf node $C_5$. Because the problem has thousands of positive and negative samples and fewer than 10 features, $C_2$ is fixed to the full number of features and $C_3$ and $C_4$ are left unrestricted, so that only $C_1$ and $C_5$ are optimized. The optimization index is the critical success index (CSI):
$\mathrm{CSI} = \dfrac{A}{A + B + C}$ (3)
In the formula, A is the number of correctly predicted hail monomers, B is the number of short-time strong precipitation monomers falsely reported as hail, and C is the number of missed hail monomers.
A grid search on the verification set gives the optimal solution $C_1 = 17$, $C_5 = 27$, at which the CSI is 72.14%.
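The grid search over $C_1$ and $C_5$ can be sketched generically as below. The source names no software library, so the scoring step is passed in as a callable; `toy_score` is a hypothetical stand-in for "train a forest on the training set and return its CSI on the verification set", and its shape and peak are arbitrary, chosen only so that the script runs.

```python
from itertools import product


def grid_search(score_fn, n_trees_grid, min_leaf_grid):
    """Return (best_score, C1, C5) over every grid combination.

    score_fn(n_trees, min_leaf) must return the verification-set score
    (CSI in the text) of a forest trained with those two settings.
    """
    best = None
    for n_trees, min_leaf in product(n_trees_grid, min_leaf_grid):
        score = score_fn(n_trees, min_leaf)
        if best is None or score > best[0]:
            best = (score, n_trees, min_leaf)
    return best


def toy_score(n_trees, min_leaf):
    # Hypothetical score surface, NOT the real CSI of any trained model.
    return -((n_trees - 20) ** 2) - (min_leaf - 10) ** 2


# With scikit-learn (an assumption, not stated in the source), C1..C5 would
# map to RandomForestClassifier's n_estimators, max_features, max_depth,
# min_samples_split and min_samples_leaf.
best = grid_search(toy_score, range(5, 41, 5), range(5, 31, 5))
print(best)
```

Only two parameters are searched because the text fixes $C_2$ and leaves $C_3$ and $C_4$ unrestricted, which keeps the grid small enough for exhaustive evaluation.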
Step four: constructing a probability model based on Bayes minimum error decision;
4-1) Normalizing the 8 features. First, the 8 features are denoted in turn as
$Y = (y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8)$ (4)
The mean $\mu_i$ and the variance $\sigma_i^2$ ($i = 1, 2, \dots, 8$) of each of the 8 features are obtained from the training samples. The 8 raw features are then normalized to
$\tilde{Y} = (\tilde{y}_1, \tilde{y}_2, \dots, \tilde{y}_8)$ (5)
where
$\tilde{y}_i = \dfrac{y_i - \mu_i}{\sigma_i}$ (6)
A new feature vector is obtained by principal component analysis:
$Pca = (pca_1, pca_2, pca_3, pca_4, pca_5, pca_6, pca_7, pca_8)$ (7)
where the $i$-th principal component is
$pca_i = \sum_{j=1}^{8} a_{ij}\,\tilde{y}_j$ (8)
and $pca_i \perp pca_j$ for $i \ne j$. Principal component analysis thus transforms the 8-dimensional vector $\tilde{Y}$ into a new 8-dimensional feature vector $Pca$ whose components are pairwise uncorrelated. Each component is a linear combination of all components of $\tilde{Y}$, and its 8 weighting coefficients form the eigenvector corresponding to the eigenvalue $\lambda_i$ of the sample covariance matrix ($i = 1, 2, \dots, 8$; $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_8$).
The invention performs principal component analysis on the training data to obtain the contribution rate and the cumulative contribution rate of each principal component, as shown in Table 1. The first three principal components jointly account for close to 80% of the total contribution, and their weight coefficients $a_{ij}$ ($i, j = 1, 2, 3$) are shown in Table 2. The values of the training samples on the first three principal components are computed, and their distribution histograms are shown in FIG. 3.
TABLE 1 contribution rate of each principal component
Figure BDA0002884553850000084
TABLE 2 weight coefficients of the first three principal components
Figure BDA0002884553850000085
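The normalization and the first principal component described in 4-1) can be sketched with the standard library alone. The text does not say how the eigenvectors are computed; power iteration is substituted here as one simple way to obtain the eigenvector of the largest eigenvalue, and the 3-feature toy data, the function names and the use of the unbiased (n - 1) variance are assumptions.

```python
import math


def standardize(rows):
    """Subtract each feature's mean and divide by its standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(z):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=500):
    """Power iteration: eigenvector and eigenvalue of the largest eigenvalue."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    return v, sum(v[i] * cv[i] for i in range(d))


# Toy training data with 3 features (the source uses 8)
rows = [[1.0, 2.0, 10.0], [2.0, 4.0, 9.0], [3.0, 5.0, 7.0],
        [4.0, 8.0, 6.0], [5.0, 9.0, 3.0]]
z = standardize(rows)
cov = covariance(z)
weights, eigenvalue = first_component(cov)
# First principal component score of each sample: sum_j a_1j * y~_j
scores = [sum(a * zj for a, zj in zip(weights, row)) for row in z]
print(weights, eigenvalue)
```

The sign of an eigenvector is arbitrary, so the scores may come out mirrored relative to another implementation; this does not affect the separation between the two classes.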
4-2) The first principal component (one dimension) is selected to describe hail monomers and short-time strong precipitation monomers, as in formula (9):
$pca_1 = \sum_{j=1}^{8} a_{1j}\,\tilde{y}_j$ (9)
where $\tilde{y}_j$ is the $j$-th normalized feature and $a_{1j}$ is its weight coefficient in the first principal component.
4-3) A Bayesian classification model is selected to solve the classification of hail and short-time strong precipitation. FIG. 3(a) is converted into the percentage stacked chart shown in FIG. 4. In FIG. 4, the two shares at each value of $pca_1$ are referred to in the invention as the strong precipitation probability and the hail probability at that $pca_1$. The continuous curve fitted between the upper strong precipitation probability region and the lower hail probability region is given by formula (10):
Figure BDA0002884553850000087
4-4) Let the total numbers of hail and strong precipitation samples be $N_1$ and $N_2$ respectively, let the number of missed hail samples be $S_2$ and the number of false alarms be $S_1$, and let $\omega_1$ and $\omega_2$ denote the hail and short-time strong precipitation classes respectively. With $\alpha$ as the classification threshold, the total error rate $E$ is:
$E = \dfrac{S_1 + S_2}{N_1 + N_2}$ (11)
4-5) As formula (11) shows, determining the optimal threshold $\alpha$ is equivalent to minimizing $S_1 + S_2$. $pca_1$ is computed for all verification-set samples by formula (9); different probability thresholds are then set to obtain different values of $S_1 + S_2$, and the optimal probability threshold is found to be 0.4.
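The conversion of a frequency histogram into a percentage stacked chart, described in 4-3), amounts to computing the share of each class within every histogram bin. The sketch below shows this under assumed bin edges and toy first-principal-component values; none of the numbers come from the source, and the fitted curve of formula (10) is not reproduced.

```python
def hail_share_per_bin(hail_values, rain_values, edges):
    """Hail share within each bin [edges[k], edges[k+1]); None if empty.

    The strong precipitation share of a non-empty bin is 1 minus this value.
    """
    shares = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        n_hail = sum(1 for v in hail_values if lo <= v < hi)
        n_rain = sum(1 for v in rain_values if lo <= v < hi)
        total = n_hail + n_rain
        shares.append(n_hail / total if total else None)
    return shares


# Toy first-principal-component values for the two classes
hail = [1.2, 1.8, 0.4, 2.5, 0.9]
rain = [-1.5, -0.7, 0.2, -0.3, 0.6]
edges = [-2, -1, 0, 1, 2, 3]
shares = hail_share_per_bin(hail, rain, edges)
print(shares)
```

A continuous curve fitted through these per-bin shares then gives a hail probability for any value of the first principal component, which is what the threshold in 4-5) is applied to.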
Step five: proposing a fusion strategy;
5-1) For convenience, the random-forest-based single-volume-scan identification model and the probability identification model based on the Bayes minimum-error decision are denoted model 1 and model 2 respectively.
5-2) The fusion strategy is as follows: if model 1 and model 2 both identify hail, the result is hail; otherwise the result is short-time strong precipitation.
The feasibility of the construction method of the double-forecast model of hail weather in the plateau area provided by the embodiment of the invention is verified by specific tests, and the following descriptions are provided:
Under the AND fusion mechanism, the test results for the 248 hail monomers (from 19 processes) and 271 short-time strong precipitation monomers (from 22 processes) used for testing were analyzed process by process. For each hail process, the volume scan at which hail began was taken as the coordinate origin; two volume scans after the origin were retained, and the volume scans before the origin were retained back to the first one identified as a hail monomer, forming FIG. 6(a). All volume scans of the 22 short-time strong precipitation processes used for testing were arranged to form FIG. 6(b). In the figures, light-colored boxes indicate that the identification result is hail, and dark-colored boxes indicate that it is short-time strong precipitation.
As can be seen from FIG. 6(a), for the hail processes taking part in the test:
(1) The dual identification model proposed by the invention identified every hail process correctly; in only one case (hail process #19) were two volume scans after the first identification missed.
(2) The identification model of the invention gave early warning 36 minutes in advance for more than half (52.6%) of the hail processes; more detailed early-warning capability information is summarized in Table 3.
TABLE 3. Early-warning capability of the model of the present invention for hail (the table body is an image in the original publication and is not reproduced here)
As can be seen from fig. 6(b), for the short-time strong precipitation processes (counterexamples) taking part in the test:
(1) The hit rate of the recognition model for strong precipitation monomers is 75.3%, and the hit rate for strong precipitation processes is 81.8% (a process counts as a hit when the number of correctly identified body sweeps is at least 50% of the body sweeps of the whole process).
(2) Of the 67 strong precipitation monomers mistakenly identified as hail, 46 (more than 2/3) are concentrated in 4 processes (fewer than 1/5 of all processes): #2, #7, #13 and #20.
As can be seen from the detailed analysis given for fig. 6(a) and fig. 6(b), the hail recognition model trained with 6 features reflecting the hail monomer formation mechanism and 2 melting distance features can distinguish about 90% of hail and short-time strong precipitation processes. It has two limitations: it cannot give early warning for hail whose "strong, high and overhanging" characteristics rise and fall abruptly, and it may falsely report hail for strong precipitation processes that exhibit high intensity, great height and overhang. These two cases together account for about 10%.
Although the present invention has been described with reference to the accompanying drawings, it is not limited to the above embodiments, which are illustrative rather than restrictive. Those skilled in the art may make various modifications without departing from the spirit of the present invention, and such modifications fall within the scope of protection of the claims.

Claims (6)

1. A construction method of a double-forecast model of hail weather in a plateau area is characterized by comprising the following steps:
(1) collecting and preparing samples: sufficient hail samples from the plateau area and short-time strong precipitation samples, which are easily confused with hail, are collected as the basis of the model, and the samples are divided into a training set, a verification set and a test set in a certain proportion;
(2) establishing features: first, transplanting 6 mechanistic features of hail weather from plain areas; second, constructing 2 elevation features;
(3) constructing a classification recognition model based on random forests: a random-forest-based hail/short-time strong precipitation classification recognition model is constructed by jointly using the features reflecting the hail formation mechanism and the elevation features;
(4) constructing a probability model based on Bayes minimum error decision: first, performing a spatial transformation of the features through principal component analysis; second, constructing a probability recognition model of hail weather based on Bayes minimum error decision;
(5) proposing a fusion strategy.
2. The method for constructing a dual prediction model of hailstone weather in plateau areas according to claim 1, wherein the step (1) of collecting and preparing samples comprises the following steps:
1-1) the invention studies the classification and identification of hail clouds in plateau areas of China, so hail is selected as the positive sample and short-time strong precipitation, which is easily confused with hail, is selected as the negative sample; sufficient hail samples and short-time strong precipitation samples are collected; the hail identification model is established and its results are analyzed using 6 years of historical data (2010-2015) from Guizhou Province, comprising 95 hail processes with 1402 hail monomers, and 110 short-time strong precipitation processes, occurring in the same or similar time periods as the 95 hail processes, with 1210 strong precipitation monomers;
1-2) dividing all samples into a training set, a verification set and a test set according to the ratio of 6:2:2, wherein the training set is used for training the model, the verification set is used for adjusting parameters, and the test set is used for testing the classification and identification performance of the model.
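The 6:2:2 division in step 1-2 can be illustrated with a short sketch. It assumes that the split is made at the level of weather processes rather than individual monomers; the source does not state this explicitly, although a 20% share of 95 hail processes and of 110 strong precipitation processes gives 19 and 22 processes respectively. The function name `split_622`, the random seed, and the use of integer process IDs are hypothetical.

```python
import random


def split_622(items, seed=0):
    """Shuffle items and split them into train/validation/test at 6:2:2."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# Process IDs stand in for the 95 hail and 110 strong precipitation processes
hail_train, hail_val, hail_test = split_622(range(95))
rain_train, rain_val, rain_test = split_622(range(110))

print(len(hail_train), len(hail_val), len(hail_test))  # 57 19 19
print(len(rain_train), len(rain_val), len(rain_test))  # 66 22 22
```

Splitting by process keeps all monomers of one storm in the same subset, which avoids near-duplicate samples appearing in both the training and the test data.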
3. The method for constructing dual prediction model of hailstone weather in plateau area according to claim 1, wherein said step (2) of characteristic building comprises the steps of:
2-1) selecting the sag, the effective thickness, the monomer core liquid state ratio, the core average value, the high echo ratio and the kurtosis as the mechanism characteristics of the strong convection monomer;
2-2) extracting the maximum elevation and the average elevation of the ground area corresponding to the monomer nuclear area in view of the influence of the altitude height on the classification and identification of hail and short-time strong rainfall; the maximum elevation represents the highest height of the ground area, and meanwhile, the shortest melting distance of the hail which firstly falls to the ground can be calculated, namely the altitude of the area is subtracted from the height of the 0 ℃ layer, as shown in the formula (1);
$$H_{\text{min\_melt}} = H_0 - H_{\max} \tag{1}$$
where $H_0$ is the height of the 0 ℃ layer and $H_{\max}$ is the maximum elevation value in the ground area corresponding to the monomer core area. In the same way, the average melting distance can also be calculated, as shown in formula (2);
$$H_{\text{avg\_melt}} = H_0 - \bar{H} \tag{2}$$
where $\bar{H}$ is the average elevation value in the ground area corresponding to the monomer core area.
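Formulas (1) and (2) can be illustrated with a small calculation. The sketch below assumes that the terrain elevations under the monomer core area are available as a list of values in the same unit as the 0 ℃ layer height; the function name `melting_distances` and the numbers are hypothetical and serve only as an example.

```python
def melting_distances(h0, elevations):
    """Return (shortest, average) melting distance for one monomer.

    h0:         height of the 0 degC layer
    elevations: terrain elevations under the monomer core area (same unit)
    """
    h_max = max(elevations)                    # highest ground point
    h_avg = sum(elevations) / len(elevations)  # mean ground elevation
    return h0 - h_max, h0 - h_avg              # formulas (1) and (2)


# Illustrative values only: 0 degC layer at 4500 m, terrain at 1000-1400 m
shortest, average = melting_distances(4500.0, [1000.0, 1200.0, 1400.0])
print(shortest, average)  # 3100.0 3300.0
```

The higher the terrain, the shorter the distance over which falling hail can melt, which is why these two values are used as elevation features.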
4. The method for constructing a dual prediction model of hailstone weather in plateau areas according to claim 1, wherein said step (3) of constructing a classification recognition model based on random forests comprises the steps of:
3-1) the 8 features, namely the 6 mechanistic features and the 2 elevation features, are selected to form a feature vector, and a hail recognition model based on a random forest is constructed by taking short-time strong precipitation monomers as counterexamples;
3-2) in the random forest model, the adjustable parameters are the number of base classifiers (decision trees) $C_1$, the number of features assigned to each decision tree $C_2$, the maximum depth of a tree $C_3$, the minimum number of samples required to split a node $C_4$, and the minimum number of samples of a leaf node $C_5$; considering that the positive and negative samples of the present problem number in the thousands while the number of features is less than 10, $C_2$ is fixed to the full set of features and $C_3$ and $C_4$ are left unrestricted, i.e. only $C_1$ and $C_5$ are tuned, and the critical success index (CSI) is adopted as the tuning index:
$$CSI = \frac{A}{A + B + C} \tag{3}$$
where A is the number of correctly predicted hail monomers, B is the number of short-time strong precipitation monomers falsely reported as hail, and C is the number of missed hail monomers;
using a grid search method, the optimal solution $C_1 = 17$, $C_5 = 27$ is obtained on the verification set, at which the CSI is 72.14%.
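The tuning procedure in step 3-2 can be sketched as a grid search that scores each parameter pair by CSI. This is only an illustration under stated assumptions: the random forest itself is not trained here, and `toy_evaluate` is a hypothetical stand-in that returns invented counts (A, B, C) for each pair; in practice it would train a forest with the given number of trees and minimum leaf size and count hits, false alarms and misses on the verification set.

```python
from itertools import product


def csi(hits, false_alarms, misses):
    """Critical success index: A / (A + B + C)."""
    total = hits + false_alarms + misses
    return hits / total if total else 0.0


def grid_search(evaluate, c1_values, c5_values):
    """Return the (c1, c5) pair with the highest CSI.

    evaluate(c1, c5) must return the counts (A, B, C) on the verification set.
    """
    return max(product(c1_values, c5_values),
               key=lambda pair: csi(*evaluate(*pair)))


# Hypothetical placeholder: invented counts instead of a trained model
TOY_COUNTS = {
    (5, 1): (60, 30, 20),
    (5, 5): (70, 20, 10),
    (10, 1): (75, 20, 5),
    (10, 5): (80, 15, 5),
}


def toy_evaluate(c1, c5):
    return TOY_COUNTS[(c1, c5)]


print(csi(80, 15, 5))                              # 0.8
print(grid_search(toy_evaluate, [5, 10], [1, 5]))  # (10, 5)
```

CSI ignores correct rejections of strong precipitation, so it rewards hail hits while penalising both false alarms and misses.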
5. The method for constructing a dual prediction model of hailstone weather in highland as claimed in claim 1, wherein said step (4) of constructing a probability model based on bayes minimum error decision comprises the steps of:
4-1) normalizing the 8 features, namely the 6 mechanistic features and the 2 elevation features; first, the 8 features are denoted in turn as
$$Y = (y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8) \tag{4}$$
the mean $\bar{y}_i$ and the variance $\sigma_i^2$ of each of the 8 features are obtained from the training samples, and the 8 raw features are then normalized to
$$\tilde{Y} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_8)$$
where
$$\tilde{y}_i = \frac{y_i - \bar{y}_i}{\sigma_i}, \quad i = 1, 2, \ldots, 8$$
a new feature vector is obtained by principal component analysis
$$Pca = (pca_1, pca_2, pca_3, pca_4, pca_5, pca_6, pca_7, pca_8) \tag{7}$$
where the i-th principal component is
$$pca_i = \sum_{j=1}^{8} a_{ij}\,\tilde{y}_j$$
and $pca_i \perp pca_j$ for $i \neq j$; as can be seen, principal component analysis transforms the 8-dimensional vector $\tilde{Y}$ into a new 8-dimensional feature vector $Pca$ whose components are pairwise uncorrelated, and each component is a linear combination of all components of $\tilde{Y}$, the 8 weighting coefficients $a_{ij}$ being the eigenvector corresponding to the eigenvalue $\lambda_i$ ($i = 1, 2, \ldots, 8$; $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_7 \ge \lambda_8$) of the sample covariance matrix;
4-2) selecting the first principal component to describe hail monomers and short-time strong precipitation monomers;
4-3) a Bayesian classification model is selected to solve the classification problem of hail and short-time strong precipitation; to simplify the analysis, the frequency distribution histogram of the first principal component is converted into a percentage stacked chart, to which a continuous curve is fitted, and this curve is used as the probability recognition model of hail weather;
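4-4) let the total numbers of hail and strong precipitation samples be $N_1$ and $N_2$ respectively, the number of missed hail reports be $S_2$, and the number of false hail reports be $S_1$; $\omega_1$ and $\omega_2$ denote the hail and short-time strong precipitation samples respectively; taking α as the classification threshold, the total error rate E is calculated as:
$$E = P(\omega_1)\,\frac{S_2}{N_1} + P(\omega_2)\,\frac{S_1}{N_2} = \frac{S_1 + S_2}{N_1 + N_2} \tag{9}$$
where $P(\omega_1) = N_1/(N_1 + N_2)$ and $P(\omega_2) = N_2/(N_1 + N_2)$ are the sample proportions of the two classes;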
4-5) as can be seen from equation (9), determining the optimal threshold α is equivalent to finding the minimum value of $S_1 + S_2$; $pca_1$ is calculated for all verification-set samples by equation (9), different values of $S_1 + S_2$ are then obtained by setting different probability thresholds, and the optimal probability threshold is found to be 0.4.
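The normalization and first principal component of steps 4-1 and 4-2 can be sketched with the Python standard library alone. Several assumptions apply: the sketch standardizes with the sample standard deviation of the data it is given (the method uses statistics from the training samples), it finds the leading eigenvector by power iteration (a substitute technique, since the source does not say how the eigenvectors are computed), and it uses invented 3-feature toy data in place of the 8 features. All function names are hypothetical, and the sign of the resulting eigenvector is arbitrary.

```python
import math


def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def standardize(rows):
    """Z-score every column: subtract its mean, divide by its sample std."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of the columns."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def leading_eigenvector(matrix, iterations=500):
    """Power iteration: approaches the eigenvector of the largest eigenvalue."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def first_principal_component(rows):
    """Return (scores, weights) of the first principal component."""
    z = standardize(rows)
    weights = leading_eigenvector(covariance(z))
    scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
    return scores, weights


# Toy data with 3 positively correlated features, invented for illustration
toy = [[1, 2, 1], [2, 4, 3], [3, 6, 2], [4, 8, 5], [5, 10, 4]]
scores, weights = first_principal_component(toy)
print([round(w, 3) for w in weights])
print([round(s, 3) for s in scores])
```

Each score is the projection of one standardized sample onto the direction of largest variance, which is the single quantity the probability model is then fitted to.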
6. The method for constructing a dual forecast model of hailstone weather in plateau areas according to claim 1, wherein proposing the fusion strategy in step (5) comprises the following steps:
5-1) for convenience of expression, the single scanning recognition model based on a random forest and the probability recognition model based on Bayes minimum error decision are denoted as model 1 and model 2 respectively;
5-2) the following fusion strategy is adopted: if model 1 and model 2 both identify hail, a hail identification result is given; otherwise, short-time strong precipitation is given.
CN202110008969.3A 2021-01-05 2021-01-05 Construction method of double-forecast model of hail weather in plateau area Pending CN112651463A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110008969.3A CN112651463A (en) 2021-01-05 2021-01-05 Construction method of double-forecast model of hail weather in plateau area

Publications (1)

Publication Number Publication Date
CN112651463A true CN112651463A (en) 2021-04-13

Family

ID=75367407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110008969.3A Pending CN112651463A (en) 2021-01-05 2021-01-05 Construction method of double-forecast model of hail weather in plateau area

Country Status (1)

Country Link
CN (1) CN112651463A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20210413)