CN112508237B - Rain type region division method based on data analysis and real-time rain type prediction method - Google Patents

Rain type region division method based on data analysis and real-time rain type prediction method

Publication number
CN112508237B
Authority
CN
China
Prior art keywords
rainfall
rain
type
event
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011312636.1A
Other languages
Chinese (zh)
Other versions
CN112508237A (en)
Inventor
王瑛
李雨欣
张馨仁
王霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University
Priority to CN202011312636.1A
Publication of CN112508237A
Application granted
Publication of CN112508237B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a rain type region division method based on data analysis. The method comprises: acquiring historical rainfall data from each observation station in the area to be analyzed and dividing the data into individual rain events (field rain events), wherein the rainfall data comprise the time-varying rainfall amount and the rainfall time; constructing a rain type standard template; determining the rain type of each rain event with the DTW (dynamic time warping) algorithm based on the rain type standard template; and clustering the rain types of the rain events of all observation stations by K-means cluster analysis to obtain a rain type partition map of the area to be analyzed. The invention further provides a DTW-based real-time rain type prediction method. By mining the rainfall observation data of meteorological stations in depth, the method reveals the rain type patterns of regional rainfall, which is of significant value for flood disaster early warning and disaster reduction.

Description

Rain type region division method based on data analysis and real-time rain type prediction method
Technical Field
The invention relates to the technical field of rainfall pattern research for flood control and disaster reduction, and in particular to a rain type region division method based on data analysis and a real-time rain type prediction method.
Background
Research on rainfall patterns plays an important role in flood control and disaster reduction. Which indices describe rainfall characteristics most objectively and accurately has long been a concern in the field of meteorological disasters. In China's current meteorological standards, rainstorm grades are generally assigned according to the 24 h total rainfall, counted on a 20-20 basis, that is, accumulated from 20:00 on one day to 20:00 on the next. However, the floods, landslides, flash floods, and debris flows triggered by rainfall are often caused by a single rain event (a "field rain"), which may be 1-2 hours of intense rainfall or several days of continuous light rain. The rain event is therefore the more meaningful unit for natural disaster research.
The rain type is one of the main tools for studying the rainfall process. In the 1940s, Gamazowa et al. of the Soviet Union statistically analyzed rainfall data from Ukraine and other places and classified 7 rain types. On this basis, Cen Guoping summarized rainfall into 7 patterns from an analysis of domestic rainfall processes. Keifer et al. proposed a statistical method for the rainfall peak time and the Chicago rain type formula for calculating instantaneous rainstorm intensity before and after the peak, and applied them to rain type classification. Huff et al. divided the rainfall process equally into 4 stages and distinguished rain types according to the stage in which the peak occurs. In 1975, Pilgrim et al. proposed a rank-order averaging method for estimating rainstorm patterns based on mathematical statistics. In China, Zhang Xingqi et al. used the Huff rain type curve to divide rainfall in Bijie, Guizhou Province into front-loaded, middle, late, and uniform types according to the change of cumulative rainfall over time. Yishuqing et al. combined the Huff rain type curve with Cen Guoping's 7 rain type definitions to study rain type characteristics using data from 14 weather stations in China. Yin Lei et al. used a fuzzy recognition method, with data from the Longxi gate station as representative, to classify the 24 h rainstorm types of Guangzhou and compile statistics on the results.
The above studies were limited, on the one hand, by incomplete observation networks and, on the other hand, by the computing technology of the time: they relied on data from a few representative stations, involving daily observations from at most tens of stations. As the national meteorological observation system has improved, observation stations have multiplied and the volume of meteorological data has grown enormously. In Hebei Province, for example, the number of available meteorological stations has increased from just over 20 to 3189, and the data have changed from daily to hourly rainfall. These data exhibit the "5V" characteristics of big data: large volume, diverse types and sources, relatively low value density, rapid growth, and accuracy and reliability. Traditional methods that perform trend regression analysis only on individual data are no longer applicable.
Disclosure of Invention
The invention aims to provide a data-analysis-based method for studying regional rainfall patterns in a big data setting.
According to one aspect of the invention, a rain type region division method based on data analysis is provided, comprising the following steps:
acquiring historical rainfall data of each observation station in an area to be analyzed, and dividing the data into field rain events, wherein the rainfall data comprise the time-varying rainfall amount and the rainfall time;
constructing a rain type standard template A;
determining the rain type of each rain event by using a DTW algorithm based on the rain type standard template;
and performing K-means cluster analysis on the rain types of the field rain events of all observation stations to obtain a rain type partition map of the area to be analyzed.
Preferably, determining the rain type of each rain event by using the DTW algorithm based on the rain type standard template A further comprises:
constructing the rain type standard template vectors $A_i$,
$A_i = \{a_1, a_2, \ldots, a_J\}, \quad i = 1, 2, \ldots, I; \; j = 1, 2, \ldots, J$
where I is a positive integer, the number of rain types in the rain type standard template; $a_j$ is the proportion of the total rainfall that falls in the j-th stage of the rain type; and J is a positive integer, the number of stages of each rain type;
representing the process rainfall vector of each rain event as
$T = \{t_1, t_2, \ldots, t_G\}, \quad g = 1, 2, \ldots, G$
where $t_g$ is the proportion of the total rainfall that falls in the g-th unit duration, and G is the duration of the rain event;
calculating $DTW_i(J, G)$ between the process rainfall vector of each rain event and each of the I standard template vectors, wherein the rain type of the standard template vector with the minimum DTW, i.e., the shortest cumulative path from point (1,1) to point (J, G), is the rain type of the rain event.
Preferably, clustering the rain types of the rain events of all observation stations by K-means cluster analysis further comprises:
obtaining the data points $m_p$ of the area from the rain types of the rain events of each observation station in the area to be analyzed, wherein $m_{pi}$ is the percentage of the number of rain events of the i-th rain type in the total number of rain events at data point $m_p$, $p = 1, 2, \ldots, P$, and P is a positive integer, the number of data points in the area to be analyzed;
setting a category number k, and randomly selecting k data points as the initial cluster centers $B = \{b_1, b_2, \ldots, b_k\}$; for each data point $m_p$, calculating its similarity to each cluster center with the following formula:
Figure GDA0003517683750000031
where $b_{ki}$ is the percentage of the number of rain events of the i-th rain type in the total number of rain events at the k-th cluster center;
assigning each data point $m_p$ to the category of the cluster center with which it has the highest similarity;
recalculating new cluster centers based on the classification result;
and stopping the cluster analysis when the moving distance of the cluster centers is smaller than a preset value, thereby obtaining the rain type partition map of the area to be analyzed.
Preferably, the method further comprises:
calculating, for each observation station, the percentage of the number of rain events of each rain type in the total number of rain events at that station,
and spatially interpolating the resulting data to obtain proportion distribution maps of the various rain types in the area to be analyzed.
Preferably, the method further comprises:
counting, according to the obtained rain type partition map, the number of rain events of each rain type at each observation station in each rain type partition,
and defining the attributes of each rain type partition based on the proportion of each rain type within it.
Preferably, the rain events are sorted by total cumulative rainfall, and the rain type region division uses the rain events of each observation station ranked in the top 10%-20% by total cumulative rainfall.
Preferably, the rain type standard template is
Figure GDA0003517683750000032
According to another aspect of the invention, a DTW-based real-time rain type prediction method is provided, comprising: obtaining, by the rain type region division method described above, the rain type partition of the area where the current rainfall is located;
acquiring rainfall data of the current rainfall event, including the time-varying rainfall amount, the predicted total rainfall duration R, and the elapsed rainfall duration r;
constructing the real-time process rainfall vector of the current rainfall event;
taking the process rainfall vector of each historical rain event in the area where rainfall is being predicted as a template, and calculating $DTW_R(g, r)$ between the real-time process rainfall vector of the current rainfall event and the first g unit durations of the process rainfall vector of each historical rain event, where $g = G \cdot r / R$, G is the duration of the historical rain event, r is the elapsed rainfall duration, and R is the predicted total rainfall duration; the rain type of the historical rain event with the minimum DTW, i.e., the shortest cumulative path from point (1,1) to point (g, r), is predicted in real time as the rain type of the current rainfall event;
and acquiring rainfall data of the current rainfall event in real time, and repeating the step of constructing the real-time process rainfall vector and the step of real-time rain type prediction.
Preferably, the real-time process rainfall vector of the current rainfall event is constructed as
$T_R = \{t_1, t_2, \ldots, t_r\}, \quad r = 1, 2, \ldots, R$
where $t_r$ is the rainfall in the r-th unit duration;
and the process rainfall vector of each historical rain event is constructed as a template vector,
$H_G = \{h_1, h_2, \ldots, h_g\}$
where $h_g$ is the rainfall in the g-th unit duration.
Preferably, the method further comprises:
predicting the peak rainfall $R_{peak}$ of the current rainfall event in real time from the corresponding historical rain event, with the following formula:
Figure GDA0003517683750000041
in the formula: hpeakPeak rainfall, H, representing the corresponding historical rainfall eventaccThe cumulative rainfall, R, representing the first g unit durations of the corresponding historical rainfall eventaccRepresents the accumulated rainfall of the present rainfall event.
The rain type region division method of the invention mines the rainfall observation data of meteorological stations in depth to study the rain type patterns of regional rainfall, which is of significant value for flood disaster early warning and disaster reduction. The method delimits rain events from the historical hourly rainfall observed at each station and extracts the cumulative rainfall and duration of each historical rain event. It then classifies the rain type of each event automatically with the DTW similarity algorithm from data mining, assigning each event to one of several predefined rain types and analyzing how the rain types differ in their spatial distribution across the area; the predefined rain types distinguish single-peak rainfall with the peak in the early, middle, or late stage, double-peak rainfall, and uniform rainfall. K-means clustering then groups the analysis area into several rain type partitions, whose rainfall trend characteristics can be analyzed together with the terrain distribution to summarize rainfall patterns and to provide a theoretical basis and guidance for flood control and disaster reduction. Following the same analysis, historical data can also be applied to real-time prediction of the rain type of an ongoing rainfall event, thereby reducing or even avoiding the disasters it might cause. This combination of the DTW similarity algorithm and K-means clustering can find wider use in future analysis of meteorological big data.
Drawings
Embodiments of the invention are described in detail below with reference to the accompanying drawings:
FIG. 1 is a flowchart of a rain type region division method according to a first embodiment of the present invention;
FIG. 2 shows 7 exemplary major rain types;
FIG. 3 is a schematic diagram of rain type determination based on the DTW algorithm;
FIG. 4 is a flowchart of the K-means clustering method according to the present invention;
FIG. 5 is a flowchart of a real-time rain type prediction method according to a second embodiment of the present invention;
FIG. 6 is a schematic diagram of the locations of the meteorological stations in Hebei Province;
FIG. 7 shows examples of field rain division in Hebei Province;
FIG. 8 shows the distribution of rain types in Hebei Province;
FIG. 9 shows the K-means clustered partition map of three major rain type zones for field rain in Hebei Province.
Detailed Description
In order to more clearly illustrate the invention, the invention is further described below with reference to preferred embodiments and the accompanying drawings. Similar parts in the figures are denoted by the same reference numerals. It is to be understood by persons skilled in the art that the following detailed description is illustrative and not restrictive, and is not to be taken as limiting the scope of the invention.
Fig. 1 illustrates a flowchart of a rain type region division method according to an embodiment of the present invention.
The rain type region division method according to the present invention includes: S1, a rain event division step; S2, a rain type determination step based on DTW similarity calculation; S3, a rain type partition step based on K-means cluster analysis; and S4, a rain type partition attribute definition step. The method is described in detail below with reference to the accompanying drawings. For simplicity, a "field rain event" in this specification is sometimes referred to simply as a "field rain" or a "rain event".
S1 rain event division step
First, historical rainfall data of each observation station in the area to be analyzed are acquired, such as the start time, the end time, and the time-varying rainfall amount. Rain events must be delimited before the cumulative rainfall, the rainfall duration, and the rain type of an event can be determined. Field rain events can be delimited in various ways, and no limitation is imposed here. For example, Melillo et al., based on statistics, treat a rainfall process preceded and followed by at least 6 hours without detected rain as a single rainfall event. The description herein delimits field rain by the same criterion and takes the hourly rainfall as the time-varying rainfall amount.
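The 6-hour dry-gap criterion described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_rain_events`, the representation of a station record as a plain list of hourly totals in millimetres, and the treatment of any value above zero as "rain detected" are all assumptions made for the example.

```python
from typing import List, Tuple


def split_rain_events(hourly_mm: List[float],
                      dry_gap_hours: int = 6) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one per rain event.

    Wet hours belong to the same event unless they are separated by at
    least `dry_gap_hours` consecutive dry hours (assumed criterion).
    """
    events = []
    start = None      # index of the first wet hour of the open event
    last_wet = None   # index of the most recent wet hour
    for i, mm in enumerate(hourly_mm):
        if mm <= 0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= dry_gap_hours:
            # The dry gap is long enough: close the open event, start a new one.
            events.append((start, last_wet + 1))
            start = i
        last_wet = i
    if start is not None:
        events.append((start, last_wet + 1))
    return events


# Hypothetical hourly record: a short dry spell (2 h) stays inside one event,
# while a 6 h dry spell separates two events.
series = [0, 1.2, 0.4, 0, 0, 3.0] + [0] * 6 + [2.5, 0.1]
events = split_rain_events(series)
```

The cumulative rainfall and duration of each event then follow directly from the slice `series[start:end]`.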
S2 rain type determining step based on DTW similarity calculation
Once the rain events have been delimited, each event has its own duration and its own time-varying rainfall. To study regional rainfall patterns, the rain type of each event must first be determined. The rain events observed at all stations are classified into rain types with the DTW (Dynamic Time Warping) algorithm. DTW matches time series of different lengths by stretching and bending the time axis, and it is an important similarity measure in time series data mining. The rain types of the chosen rain type classification are set as standard templates, and each single rain event is constructed as a test template. The matching degree DTW between the test template and each standard template is computed from the Euclidean distance between the corresponding attributes of the two templates at each stage, and each event is assigned to the rain type with the shortest cumulative path.
Specifically, let A be the rain type standard template and $A_i$ a rain type standard template vector,
$A_i = \{a_1, a_2, \ldots, a_J\}, \quad i = 1, 2, \ldots, I; \; j = 1, 2, \ldots, J \qquad (1)$
where I is a positive integer, the number of rain types in the rain type standard template; $a_j$ is the proportion of the total rainfall that falls in the j-th stage; and J is a positive integer, the number of stages of each rain type.
The test vector T is constructed from the process rainfall of a single rain event,
$T = \{t_1, t_2, \ldots, t_G\}, \quad g = 1, 2, \ldots, G \qquad (2)$
where $t_g$ is the proportion of the total rainfall that falls in the g-th unit duration, and G is the duration of the rain event. In the example herein, the unit duration is 1 hour.
For example, FIG. 2 shows the 7 major rain types classified by the method of Gamazowa of the Soviet Union and Cen Guoping, each comprising 6 stages, i.e., I = 7 and J = 6. Types I, II, and III are unimodal rain types whose peaks lie in the front, rear, and middle of the event, respectively; type IV is a uniform rain type whose rainfall varies little over the course of the event; and types V, VI, and VII are bimodal rain types. From the proportion of the total event rainfall that falls in each stage, 7 standard template vectors are constructed, giving the standard template:
Figure GDA0003517683750000071
Fig. 3 is a schematic diagram of rain type determination based on the DTW algorithm. The following describes the rain type determination procedure based on the DTW algorithm with reference to fig. 2 and 3 as an example.
As shown in FIG. 3, the DTW algorithm can be summarized as finding the path with the minimum accumulated grid value from the start point (1,1) to the end point (J, G), where the grid value at point (j, g) is $d_{jg} = (a_j - t_g)^2$. The DTW is calculated as follows:
Figure GDA0003517683750000072
where DTW(0,0) = 0.
DTW(J, G) is calculated between the process rainfall vector of each rain event and each of the I standard template vectors. The standard template vector with the minimum DTW, i.e., the shortest cumulative path from point (1,1) to point (J, G), gives the rain type of that rain event.
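The classification step can be illustrated with a short sketch. It assumes the textbook DTW recurrence (each cell adds its grid value to the smallest of its left, lower, and diagonal neighbours), which is consistent with the grid value $(a_j - t_g)^2$ and the boundary condition DTW(0,0) = 0 stated above but is not copied from the patent's formula image. The three templates in `TEMPLATES` are hypothetical placeholders; the patent's actual 7 x 6 template values appear only in a figure and are not reproduced here.

```python
import math
from typing import Dict, List, Sequence


def to_proportions(hourly_mm: Sequence[float]) -> List[float]:
    """Convert hourly rainfall into shares of the event total (vector T)."""
    total = sum(hourly_mm)
    return [x / total for x in hourly_mm]


def dtw(a: Sequence[float], t: Sequence[float]) -> float:
    """Accumulated cost of the cheapest warping path from (1,1) to (J,G)."""
    J, G = len(a), len(t)
    acc = [[math.inf] * (G + 1) for _ in range(J + 1)]
    acc[0][0] = 0.0  # boundary condition DTW(0,0) = 0
    for j in range(1, J + 1):
        for g in range(1, G + 1):
            d = (a[j - 1] - t[g - 1]) ** 2  # grid value d_jg
            acc[j][g] = d + min(acc[j - 1][g], acc[j][g - 1], acc[j - 1][g - 1])
    return acc[J][G]


def classify(hourly_mm: Sequence[float],
             templates: Dict[str, Sequence[float]]) -> str:
    """Return the name of the template with the smallest DTW to the event."""
    t = to_proportions(hourly_mm)
    return min(templates, key=lambda name: dtw(templates[name], t))


# Hypothetical 6-stage templates (NOT the patent's values), each summing to 1.
TEMPLATES = {
    "front_peak": [0.40, 0.25, 0.15, 0.10, 0.06, 0.04],
    "rear_peak": [0.04, 0.06, 0.10, 0.15, 0.25, 0.40],
    "uniform": [1 / 6] * 6,
}

# An 8-hour event is matched against 6-stage templates without resampling.
event = [10, 8, 5, 3, 2, 1, 0.5, 0.5]
rain_type = classify(event, TEMPLATES)
```

Because DTW warps the time axis, the event and the template do not need the same length, which is the property the description relies on.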
In rainfall-induced disaster analysis, statistical data show that, taking Hebei as an example, disasters are caused only by field rain events ranked in the top 20% by cumulative rainfall in each year's flood season, and usually by those in the top 10%. To reduce the volume of data and to study accurately how rainfall patterns relate to disaster triggering, the top 20% of field rain events at each station in each year are preferably selected for analysis. After the selected historical rain events of each observation station in the area to be analyzed have been classified by rain type, the percentage of the number of rain events of each rain type in the total number of rain events analyzed at that station is calculated. Because the observation stations are not evenly distributed over the area, the rain type proportions of the stations are preferably interpolated spatially, for example by inverse distance weighting, so that the results apply more widely; this yields a proportion distribution map of each rain type for the area to be analyzed. The number of interpolated data points, which corresponds to the cell size of the grid laid over the area, can be set as required.
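The three operations in this paragraph (keeping the largest events, computing per-station rain type shares, and interpolating those shares) might look like the sketch below. The function names, the planar coordinates, and the distance exponent of 2 in the inverse distance weighting are assumptions for illustration; the text does not specify them.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def top_events(event_totals_mm: Sequence[float], fraction: float = 0.2) -> List[float]:
    """Keep the largest `fraction` of events by cumulative rainfall (at least one)."""
    k = max(1, math.ceil(len(event_totals_mm) * fraction))
    return sorted(event_totals_mm, reverse=True)[:k]


def rain_type_shares(event_types: Sequence[str],
                     all_types: Sequence[str]) -> Dict[str, float]:
    """Percentage of a station's events that fall into each rain type."""
    counts = Counter(event_types)
    n = len(event_types)
    return {t: 100.0 * counts[t] / n for t in all_types}


def idw(x: float, y: float,
        stations: Sequence[Tuple[float, float, float]],
        power: float = 2.0) -> float:
    """Inverse distance weighted estimate at (x, y).

    `stations` holds (sx, sy, value) triples; `power` is an assumed exponent.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # the grid point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical station: four classified events, three candidate rain types.
shares = rain_type_shares(["I", "I", "III", "IV"], ["I", "III", "IV"])
# Hypothetical share of type I at two stations, interpolated midway between them.
midpoint = idw(0.5, 0.0, [(0.0, 0.0, 30.0), (1.0, 0.0, 10.0)])
```

Evaluating `idw` once per rain type on every grid cell gives one proportion map per rain type; the feature vector of a grid cell is then the list of its interpolated shares.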
S3 rain type partition step based on K-means cluster analysis
K-means is a classic unsupervised clustering algorithm. Its advantages include its cluster initialization strategy and high computational efficiency, and it is widely used in fields such as image segmentation and social networks. Unsupervised clustering is a machine learning technique that divides data into several classes according to a given similarity measure, so that data points within a class are highly similar and data points in different classes are less similar.
The K-means clustering algorithm is used to analyze regional differences in rain type within the area to be analyzed. Clustering is performed without prior knowledge of the area's rain type partition characteristics, and the category number k is chosen to give the result that shows those characteristics most clearly, which reduces the partition errors for small regional samples that manual partitioning would introduce.
FIG. 4 shows a flow chart of the K-means clustering algorithm according to the present invention.
A data set $M = \{m_1, m_2, m_3, \ldots, m_P\}$ is constructed from the data to be clustered, where $p = 1, 2, \ldots, P$ and P is a positive integer, the number of data points in the area to be analyzed, i.e., the size of the data set. $m_p$ is the p-th data point, and each data point has I feature dimensions: $m_{pi}$, the i-th feature value of the p-th data point, is the percentage of the number of rain events of the i-th rain type in the total number of rain events at data point $m_p$.
The category number k of the cluster analysis is selected, and k data points are randomly selected as the cluster centers $B = \{b_1, b_2, b_3, \ldots, b_k\}$ (step S21). The category number k is preferably 3 to 6. After cluster analysis has been run with k = 3, 4, 5, and 6, the preferred category number can be determined from the results.
For each data point $m_p$, its similarity to each of the k cluster centers is calculated (step S22) with the following formula:
Figure GDA0003517683750000081
where $b_{ki}$ is the percentage of the number of rain events of the i-th rain type in the total number of rain events at the k-th cluster center. According to the result, data point $m_p$ is assigned to the category of the cluster center with which it has the greatest similarity (step S23). After steps S22 and S23 have been carried out for all data points, the differences between the data points within each of the k categories are calculated (step S24). The point with the minimum difference in each category is taken as the new cluster center B', and the moving distance of the cluster center is calculated. The moving distance is compared with a preset cluster center moving distance threshold ε: when it is smaller than the threshold, clustering stops; otherwise, the process returns to step S22. This yields the rain type partition clustering result, which determines the rain type partition of each data point and gives the rain type partition map.
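Steps S21 to S24 can be sketched in plain Python as below. Two points are assumptions rather than the patent's stated method: the similarity formula is only available as an image, so Euclidean distance is used (smaller distance meaning greater similarity), and the new center is taken as the arithmetic mean of each category, which is the standard K-means update, rather than the "minimum difference point" described above. The parameter names `eps`, `max_iter`, and `seed` are illustrative.

```python
import math
import random
from typing import List, Sequence, Tuple


def kmeans(points: Sequence[Sequence[float]], k: int, eps: float = 1e-6,
           max_iter: int = 100, seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Cluster rain type share vectors; returns (labels, centers)."""
    rng = random.Random(seed)
    # S21: pick k data points at random as the initial cluster centers.
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # S22/S23: assign every point to its nearest (most similar) center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # S24 (standard variant): move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:  # stop once the centers have essentially stopped moving
            break
    return labels, centers


# Hypothetical data points: percentage shares of three rain types per grid cell.
data = [[70, 20, 10], [68, 22, 10], [72, 18, 10],
        [10, 20, 70], [12, 18, 70], [8, 22, 70]]
labels, centers = kmeans(data, k=2)
```

Running the function for k = 3, 4, 5, and 6 and comparing the resulting maps mirrors the selection of the category number described in the text.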
S4 rain type partition attribute definition step
As analyzed above, each data point has several feature dimensions: its I feature values give, for each rain type i, the percentage of the number of field rain events of that type in the total number of field rain events at the data point. To study the rainfall patterns in the resulting rain type partition map, the attributes of the rain type partitions need to be further defined and interpreted by combining the terrain characteristics with the rain type proportions within each partition.
First, for each rain type partition, the number of rain events of each rain type at each observation station in the partition is counted, together with its percentage of the total number of rain events at those stations. Then, according to the proportion of each rain type, the attributes of the rain type partition are defined, such as the number of rainfall peaks and the time at which they occur, and these are used to predict future rainfall, particularly rainfall in the flood season.
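A minimal sketch of the counting step follows. It pools the classified events of all stations in a partition, computes the share of each rain type, and reports the dominant type as one possible basis for naming the partition. The function name `zone_profiles`, the dictionary layout, and the idea of summarizing a partition by its dominant type are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def zone_profiles(station_zone: Dict[str, int],
                  station_event_types: Dict[str, List[str]]) -> Dict[int, dict]:
    """Per-partition rain type shares (percent) and the dominant rain type."""
    counts = defaultdict(Counter)
    for station, types in station_event_types.items():
        # Pool the events of every station that belongs to the same partition.
        counts[station_zone[station]].update(types)
    profiles = {}
    for zone, counter in counts.items():
        total = sum(counter.values())
        shares = {t: 100.0 * n / total for t, n in counter.items()}
        profiles[zone] = {"shares": shares,
                          "dominant": max(shares, key=shares.get)}
    return profiles


# Hypothetical partition membership and classified events per station.
zones = {"s1": 0, "s2": 0, "s3": 1}
events_by_station = {"s1": ["I", "I", "III"], "s2": ["I", "IV"], "s3": ["V", "V", "I"]}
profiles = zone_profiles(zones, events_by_station)
```

A partition dominated by type I, for instance, could be described as a front-peaked unimodal zone, which is then interpreted together with the terrain.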
According to another embodiment of the present invention, the present invention further provides a real-time rain type prediction method based on DTW, and fig. 5 shows a flowchart of the method.
The rain type partition of the area for which rainfall is predicted in real time is obtained by the method according to the first embodiment of the present invention. Rainfall data of the current rainfall event are acquired in real time, including the time-varying rainfall amount, the predicted total rainfall duration R, and the elapsed rainfall duration r, and the real-time process rainfall vector of the current rainfall event is constructed,
$T_R = \{t_1, t_2, \ldots, t_r\}, \quad r = 1, 2, \ldots, R$
where $t_r$ is the rainfall in the r-th unit duration.
The process rainfall vector of each historical field rain event in the prediction area is constructed as a template vector,
$H_G = \{h_1, h_2, \ldots, h_g\}, \quad g = 1, 2, \ldots, G \cdot r / R$
where $h_g$ is the rainfall in the g-th unit duration.
$DTW_R(g, r)$ is calculated between the real-time process rainfall vector $T_R$ of the current rainfall event and the first g unit durations of the process rainfall vector of each historical field rain event,
Figure GDA0003517683750000091
where DTW(0,0) = 0.
The rain type of the most similar historical field rain, i.e., the one with the minimum DTW from point (1,1) to point (g, r), is predicted in real time as the rain type of the current rainfall event. Rainfall data are acquired in real time, the construction of the real-time process rainfall vector and the template vectors and the DTW calculation are repeated with the updated data, and the predicted rain type is updated in real time. In the method according to the present invention, when the predicted rainfall duration R changes, the updated R is substituted into the formula for $h_g$ in the template vector construction step, so the rain type prediction for the current rainfall event can be updated at any time as the weather forecast and the rainfall itself progress.
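The real-time matching can be sketched as follows, under stated assumptions: the DTW recurrence is the textbook one (the patent's formula is only available as an image), the prefix length $g = G \cdot r / R$ is rounded to the nearest whole unit and kept at 1 or more, and the historical events are supplied as hypothetical `(rain_type, hourly_mm)` pairs. Vectors hold rainfall amounts rather than proportions, following the definitions of $t_r$ and $h_g$ above.

```python
import math
from typing import List, Sequence, Tuple


def dtw(a: Sequence[float], b: Sequence[float]) -> float:
    """Accumulated squared-difference cost of the cheapest warping path."""
    n, m = len(a), len(b)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def predict_rain_type(observed_mm: Sequence[float], predicted_total_hours: int,
                      history: Sequence[Tuple[str, Sequence[float]]]
                      ) -> Tuple[str, List[float]]:
    """Return the rain type and record of the most similar historical event."""
    r, R = len(observed_mm), predicted_total_hours
    best = None
    for rain_type, hist in history:
        G = len(hist)
        g = max(1, round(G * r / R))  # compare against the first g hours only
        cost = dtw(hist[:g], observed_mm)
        if best is None or cost < best[0]:
            best = (cost, rain_type, list(hist))
    return best[1], best[2]


# Hypothetical history of two classified events, and 3 observed hours of an
# event that the forecast expects to last 6 hours in total.
history = [("I", [10, 6, 3, 1, 0.5, 0.5]), ("II", [0.5, 0.5, 1, 3, 6, 10])]
rain_type, match = predict_rain_type([9, 5, 2], predicted_total_hours=6, history=history)
```

Calling `predict_rain_type` again whenever a new hourly observation or a revised duration forecast arrives reproduces the repeated update described in the text.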
According to a preferred embodiment of the present invention, the peak rainfall $R_{peak}$ of the current rainfall event can be predicted from the rainfall data of the most similar historical rain event and the rainfall accumulated so far in the current event. The calculation formula is as follows:
Figure GDA0003517683750000092
in the formula: hpeakPeak rainfall, H, representing the corresponding historical rainfall eventaccThe accumulated rainfall represents the previous G/G process of the corresponding historical rainfall event, and G/G is R/R, RaccThe accumulated rainfall representing the rainfall event of the site gives an early warning in real time to the dangerous case that the rainfall may cause a disaster.
The rain type zoning method of the present invention is described in detail below, taking Hebei Province as an example. Hebei Province is located in North China, bordering Bohai Bay to the east and the Taihang Mountains to the west; its terrain is high in the northwest and low in the southeast, and it lies in a temperate continental climate zone. Its natural conditions vary markedly in space and time, and extreme rainfall and weather events are frequent. The rainfall data and the historical flood disaster database used in this embodiment were provided by the Hebei provincial bureau. The rainfall data are hourly records for June to August of 2017 from 3189 stations in total (142 reference stations and 3047 regional stations); the location of each station is shown in fig. 6.
The precipitation data are first divided into rain events, and the divided data are then checked. Fig. 7 shows examples of rain event division in Hebei Province: fig. 7(a) is a station in Hengshui (July 2006) and fig. 7(b) is a station in Chengde City (July 2011). Each gray bar in the figure represents one rain event divided according to the definition herein, and the tick interval of the horizontal axis is 6 h.
As shown in fig. 7(a), one rain event lasting 22 h with a total rainfall of 43 mm occurred from 02:00 to 23:00 on July 9, and another lasting 36 h with a total rainfall of 29.5 mm occurred from 08:00 on July 10 to 19:00 on July 11. If the division time threshold were set too large, these 2 events would be merged into 1 lasting as long as 58 hours. Hebei is a semi-humid region, and a rain event of such long duration clearly does not match general experience. As shown in fig. 7(b), one rainfall lasted 16 hours, from 21:00 on July 20 to 12:00 on July 21. Under the event-based division its total rainfall is 15.2 mm, which avoids splitting it across 2 calendar days and obtaining rainfall amounts that are too small. Dividing at a 6 h interval likewise prevents a single rainfall from being split into 2 or more events, which would underestimate its total rainfall. Dividing rain events in Hebei Province at a 6 h interval therefore matches the rainfall characteristics of Hebei and is accurate and feasible. Based on the rain event division, the start time, end time, duration, hourly accumulated rainfall and total accumulated rainfall of each rain event are counted. According to historical disaster statistics for Hebei Province, it is the top 20% of flood-season rainfall events, ranked by rainfall, that cause disasters and damage; therefore only the top 20% of rain events in June to August of 2005-2017 are selected for rain type analysis in this example. Following the rain event division step described above, and taking the 7 dominant rain types shown in fig. 2 as standard templates, the rain type of each rain event is determined by the rain type determination step based on DTW similarity calculation described above.
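The event division and the top-20% selection can be sketched as below. This is an illustration under stated assumptions: a gap of at least 6 consecutive dry hours is taken to end an event (the text gives only "6 h interval"), a dry hour is one with zero recorded rainfall, and the names `split_rain_events` and `top_fraction` are hypothetical.

```python
def split_rain_events(hourly, min_dry_hours=6):
    """Split an hourly rainfall series (mm) into rain events.

    Assumption: min_dry_hours or more consecutive dry hours end an event.
    Returns a list of (start_index, end_index, total_rainfall) tuples.
    """
    events = []
    start = last_wet = None
    for i, amount in enumerate(hourly):
        if amount > 0:
            if start is not None and i - last_wet - 1 >= min_dry_hours:
                # The dry gap is long enough: close the previous event.
                events.append((start, last_wet, sum(hourly[start:last_wet + 1])))
                start = None
            if start is None:
                start = i
            last_wet = i
    if start is not None:
        events.append((start, last_wet, sum(hourly[start:last_wet + 1])))
    return events

def top_fraction(events, fraction=0.2):
    """Keep the events with the largest total rainfall (top 20% by default)."""
    ranked = sorted(events, key=lambda e: e[2], reverse=True)
    return ranked[:max(1, int(len(ranked) * fraction))]
```

Duration follows directly as `end_index - start_index + 1` hours, so an event is never cut at a calendar-day boundary.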
The number of rain events of each of the 7 rain types at each station in Hebei is counted, and its proportion of the total number of rain events at that station is calculated. Counting over all 3189 stations in the province gives the statistics of the DTW analysis results for the 7 rain types in Hebei Province for June to August of 2005-2017, shown in Table 1. Inverse distance interpolation on a 3 km × 3 km grid gives the distribution maps of the 7 rain types, shown in figs. 8(I)-(VII) for types I-VII in sequence. Rainfall across Hebei is dominated by type III, the mid-period single-peak type, which accounts for 53.31% of all rainfall events and for more than 25% in most regions. Next come type I (single-peak in the early period), type VII (double-peak, with peaks in the middle and late periods) and type VI (double-peak, with peaks in the early and middle periods), which account for 24.10%, 9.78% and 8.30% of all rainfall events respectively. Types II and V occur less often, reaching 5-25% only in local areas. Type IV, uniform rainfall, occurs least, at less than 1% across the whole province.
TABLE 1 Number of rain events of the 7 rain types in Hebei Province (counted by station) (2005-2017)
Figure GDA0003517683750000111
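The inverse distance interpolation used to produce the rain type distribution maps can be illustrated with the sketch below. The power parameter of 2 is an assumption (the text does not state it), and the function name `idw` and the `(x, y, value)` station layout are hypothetical.

```python
def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    stations: list of (sx, sy, value), e.g. the share of one rain type
    at each station. power=2 is an assumed, commonly used exponent.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        if d2 == 0:
            return value            # the grid node coincides with a station
        w = 1.0 / d2 ** (power / 2.0)
        num += w * value
        den += w
    return num / den
```

Evaluating `idw` at the nodes of a 3 km × 3 km grid, once per rain type, would give one proportion map per type.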
To better study the regional characteristics of rainfall across Hebei Province, K-means clustering is carried out on the data presented in fig. 8. In the K-means analysis, this example tried clustering with the number of categories K set to 3, 4, 5 and 6 respectively. The results for 4, 5 and 6 categories differed little from the result for 3 categories. For simplicity, this example presents the rain type partition of Hebei Province obtained with K = 3, as shown in fig. 9. The number of rain events of each rain type at the stations in each rain type zone, and their proportion of the total number of rain events, are counted as shown in Table 2. The rain type characteristics of the three zones are as follows: in Zone I, types III and I predominate; in Zone II, types III, I, VI and VII are frequent; in Zone III, rainfall is mainly of type III.
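The clustering step can be sketched as follows, with each data point being the vector of rain type proportions at one location. This is a minimal illustration: Euclidean distance stands in for the similarity measure (the patent's formula is given only as a figure), the seeded random initialisation is for reproducibility, and the name `kmeans` and its parameters are hypothetical.

```python
import random
from math import dist

def kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Plain K-means; Euclidean distance is an assumed similarity measure."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # k random data points
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign every point to its nearest cluster center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers move less than the preset value
            break
    return labels, centers
```

Running this for K = 3, 4, 5 and 6 and comparing the resulting maps mirrors the comparison described above.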
Further analysis shows that rainfall processes in Hebei Province are dominated by type I (early single-peak) and type III (mid-period single-peak) rainfall, which together account for 77.41% of all rain events. In every city of Hebei, type III occurs most often, accounting for more than 25%; type I, whose peak lies in the early part, is next; type IV, the uniform type, occurs least, accounting for less than 5% in each city. Type II (late single-peak) and the double-peak types V, VI and VII each account for no more than 25% in most regions. Hebei Province can be divided into three rain type zones. In Zone I, types III and I predominate, i.e. mainly single-peak rainfall peaking in the middle and early periods, accounting for 51.86% and 26.37% of rain events respectively. In Zone II, types III, I, VI and VII are frequent, comprising single-peak rainfall peaking in the middle and early periods and double-peak rainfall peaking in the early and middle periods and in the middle and late periods, accounting for 49.30%, 23.55%, 9.68% and 11.68% of rain events respectively. In Zone III, type III rainfall with its single peak in the middle period predominates, accounting for 58.06% of rain events. Spatially, Zone I mainly covers the Yanshan mountain and hill climate area, the eastern Hebei (Jidong) plain climate area and the piedmont plain climate area. Zone II is mainly distributed in the northern Hebei plateau climate area and the south of Chengde City. Zone III is mainly distributed in Shijiazhuang, Handan and Xingtai in southern Hebei.
Among the rainstorm disasters in Hebei Province over the years, those causing large economic losses and casualties were of the locally common rain types: types I, III, VI and VII. When carrying out rainstorm disaster risk prevention in Hebei Province, the rain type characteristics obtained for each region can be fully taken into account, so that disaster early warning, emergency resource deployment and similar work can be targeted. The main rainfall of type I rain is concentrated at the beginning of the rainfall, so in rainstorm prevention the disaster warning time is short and rainfall forecasting is particularly important.
TABLE 2 rain type statistics for three rain type areas in Hebei (2005-2017)
Figure GDA0003517683750000121
The analysis results are compared with the rainstorm disaster record of Hebei Province. Severe rainstorm disaster cases from the regions of Hebei Province in 2005-2017 are selected for statistics; these events caused direct economic losses ranging from over 5 million yuan to 200 million yuan, with affected areas in the range of 300-25000 hm², and some even caused casualties. The rain types of the events that caused the disasters are shown in Table 3. These rainfalls all lasted less than 24 hours, and using the rain event division method with a 6 h interval avoids the underestimation of characteristics such as rainfall intensity that results from the 20:00-to-20:00 daily statistical method. The comparison shows that the rainstorms causing heavy losses in each region were of the locally most common rain types. For example: (I) in Zone I, the "7.22" rainfall of 2009 in Yutian County, Tangshan City, was type I rain, and the "7.26" rainfall of 2012 in a county of Qinhuangdao City was type III rain; (II) in Zone II, the "6.28" rainfall of 2006 in Fengning County, Chengde City, was type VII rain, and the "7.15" rainfall of 2013 in Linxi County, Xingtai City, was type VI rain; (III) in Zone III, the "8.26" rainfall of 2009 in Zanhuang County, Shijiazhuang City, and the "7.01" rainfall of 2013 in Wuqiang County, Hengshui City, were type III rain. Therefore, when carrying out rainstorm disaster risk prevention in Hebei Province, the rain type characteristics of the zones in fig. 9 should be fully considered, so that disaster early warning, emergency resource deployment and similar work can be targeted. For example, Zone I has a large share of type I rainfall, in which the rainfall peaks at the very beginning; the disaster warning time is therefore short and rainfall forecasting is particularly important.
TABLE 3 comparison of rain type and disaster situation of historical rainstorm in Hebei province
Figure GDA0003517683750000131
The invention combines the DTW similarity algorithm with K-means clustering from data mining to classify rain types and divide rain type regions, making deeper use of meteorological big data, which is currently growing exponentially. By analysing the temporal and spatial distribution of the rain type classification and clustering results for regional rainfall events, in combination with terrain and temporal characteristics, the rainfall distribution characteristics of the analysed area can be obtained, providing theoretical support for rainstorm prevention and disaster early warning. Applying the data-analysis-based rain type region division method to real-time rain type prediction of rainfall events further provides an accurate method for flood control and disaster reduction.
The method according to the invention can also be used for the analysis of other meteorological data.
It should be understood that the above embodiments are merely examples given to illustrate the present invention clearly and do not limit its implementation. Those skilled in the art may make other variations or modifications on the basis of the above description; it is not possible to enumerate all embodiments here, and all obvious variations or modifications derived from the technical solution of the present invention remain within its scope of protection.

Claims (9)

1. A DTW-based real-time rain type prediction method, characterized by comprising the following steps,
acquiring the rain type partition of the area where the predicted rainfall event is located,
acquiring rainfall data of the predicted rainfall event, including the rainfall changing over time, the predicted rainfall duration and the elapsed rainfall duration,
constructing the real-time process rainfall vector of the predicted rainfall event,
taking the process rainfall vector of each historical rain event in the area where the predicted rainfall event is located as a template, and calculating $DTW_R(g, r)$ between the real-time process rainfall vector of the predicted rainfall event and the first g unit durations of the process rainfall vector of each historical rain event, wherein g = G·r/R, r is the elapsed rainfall duration and R is the predicted rainfall duration, and predicting in real time the rain type of the historical rain event corresponding to the minimum DTW as the rain type of the rainfall event,
acquiring rainfall data of the predicted rainfall event in real time, and repeating the step of constructing the real-time process rainfall vector and the step of real-time prediction,
wherein the step of obtaining the rain type partition of the area where the predicted rainfall event is located comprises:
acquiring historical rainfall data of each observation station in an area to be analyzed, and dividing field rainfall events according to the rainfall data, wherein the rainfall data comprises rainfall amount and rainfall time which change along with time;
constructing a rain type standard template;
determining the rain type of each rain event by using a DTW algorithm based on the rain type standard template;
and carrying out cluster analysis on the rain types of the field rain events of all the observation sites by utilizing K-means cluster analysis to obtain a rain type partition map of the area to be analyzed.
2. The DTW-based real-time rain pattern prediction method of claim 1,
wherein determining the rain type of each rain event by using a DTW algorithm based on the rain type standard template A further comprises,
constructing the rain type standard template vectors $A_i$
$A_i = \{a_1, a_2, \ldots, a_j\}, \quad i = 1, 2, \ldots, I; \; j = 1, 2, \ldots, J$
where I is a positive integer, the number of rain types in the rain type standard template; $a_j$ denotes the proportion of the rainfall in the j-th stage of each rain type to the total rainfall; and J is a positive integer, the number of stages of each rain type,
the process rainfall vector of each rain event being expressed as
$T = \{t_1, t_2, \ldots, t_g\}, \quad g = 1, 2, \ldots, G$
where $t_g$ denotes the proportion of the rainfall in the g-th unit duration to the total rainfall, and G is the duration of the rain event,
calculating $DTW_i(j, g)$ between the process rainfall vector of each rain event and the I standard template vectors, wherein the rain type of the standard template vector corresponding to the minimum DTW is the rain type of the rain event.
3. The DTW-based real-time rain pattern prediction method of claim 1,
performing cluster analysis on the rain types of all rain events of all observation sites by using K-means cluster analysis, further comprising,
obtaining, based on the rain type of each rain event at each observation station in the area to be analyzed, for each data point $m_p$ of the area the percentage $m_{pi}$ of the number of rain events of the i-th rain type in the total number of rain events at that data point, where p = 1, 2, ..., P, and P is a positive integer, the number of data points in the area to be analyzed;
setting the number of categories k, and randomly selecting k data points as the initial cluster centers $B = \{b_1, b_2, \ldots, b_k\}$;
for each data point $m_p$, calculating its similarity to each cluster center, the calculation formula being as follows:
Figure FDA0003517683740000021
where $b_{ki}$ denotes the percentage of the number of rain events of the i-th rain type in the total number of rain events at the k-th cluster center,
assigning each data point $m_p$ to the category of the cluster center with the highest similarity,
recalculating a new clustering center based on the classification result;
and when the moving distance of the clustering center is smaller than a preset value, stopping clustering analysis to obtain a rain type partition map of the area to be analyzed.
4. The DTW-based real-time rain type prediction method of claim 3, wherein the method further comprises
calculating, for each observation station, the percentage of the number of rain events of each rain type in the total number of rain events at that station,
and carrying out spatial interpolation on the obtained data to obtain the proportion distribution map of each rain type in the area to be analyzed.
5. The DTW-based real-time rain type prediction method of claim 1, wherein the method further comprises
counting, according to the obtained rain type partition map, the number of rain events of each rain type at the observation stations in each rain type zone,
and defining the attribute of each rain type subarea based on each rain type proportion in each rain type subarea.
6. The DTW-based real-time rain type prediction method of claim 1, wherein the rainfall events are sorted by total accumulated rainfall, and the rain type region division is performed using the rainfall events whose total accumulated rainfall ranks in the top 10%-20% at each observation station.
7. The DTW-based real-time rain prediction method of claim 2, wherein the rain type standard template is
Figure FDA0003517683740000031
8. The DTW-based real-time rain type prediction method of claim 2,
constructing the real-time process rainfall vector of the predicted rainfall event as
$T_R = \{t_1, t_2, \ldots, t_r\}, \quad r = 1, 2, \ldots, R$
where $t_r$ denotes the rainfall in the r-th unit duration;
constructing the process rainfall vector of each historical rain event as a template vector,
$H_G = \{h_1, h_2, \ldots, h_g\},$
where $h_g$ denotes the rainfall in the g-th unit duration.
9. The DTW-based real-time rain type prediction method of claim 2, wherein the method further comprises
predicting in real time the peak rainfall $R_{peak}$ of the rainfall event by using the corresponding historical rain event, the calculation formula being as follows:
Figure FDA0003517683740000032
in the formula: hpeakPeak rainfall, H, representing the corresponding historical rainfall eventaccThe cumulative rainfall, R, representing the first g unit durations of the corresponding historical rainfall eventaccRepresents the accumulated amount of rainfall for the predicted rainfall event.
CN202011312636.1A 2020-11-20 2020-11-20 Rain type region division method based on data analysis and real-time rain type prediction method Active CN112508237B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011312636.1A CN112508237B (en) 2020-11-20 2020-11-20 Rain type region division method based on data analysis and real-time rain type prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011312636.1A CN112508237B (en) 2020-11-20 2020-11-20 Rain type region division method based on data analysis and real-time rain type prediction method

Publications (2)

Publication Number Publication Date
CN112508237A CN112508237A (en) 2021-03-16
CN112508237B true CN112508237B (en) 2022-07-08

Family

ID=74959199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011312636.1A Active CN112508237B (en) 2020-11-20 2020-11-20 Rain type region division method based on data analysis and real-time rain type prediction method

Country Status (1)

Country Link
CN (1) CN112508237B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113626999A (en) * 2021-07-27 2021-11-09 北京师范大学珠海校区 Multi-site rainfall random generation method
CN114997534B (en) * 2022-07-29 2022-10-21 长江水利委员会水文局 Similar rainfall forecasting method and equipment based on visual features
CN115271255B (en) * 2022-09-19 2022-12-09 长江水利委员会水文局 Rainfall flood similarity analysis method and system based on knowledge graph and machine learning
CN115730769B (en) * 2022-11-25 2023-08-22 武汉大学 Composite rain type construction method and device for evaluating drainage capacity of urban pipe network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376940A (en) * 2018-11-02 2019-02-22 中国水利水电科学研究院 The method and apparatus for obtaining the rainfall time space distribution in rainfall
CN110930282A (en) * 2019-12-06 2020-03-27 中国水利水电科学研究院 Local rainfall type analysis method based on machine learning
WO2020194642A1 (en) * 2019-03-27 2020-10-01 株式会社Singular Perturbations Event prediction device and event prediction method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376940A (en) * 2018-11-02 2019-02-22 中国水利水电科学研究院 The method and apparatus for obtaining the rainfall time space distribution in rainfall
WO2020194642A1 (en) * 2019-03-27 2020-10-01 株式会社Singular Perturbations Event prediction device and event prediction method
CN110930282A (en) * 2019-12-06 2020-03-27 中国水利水电科学研究院 Local rainfall type analysis method based on machine learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
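Research progress on design rainstorm patterns for small and medium-sized watersheds; Yan Zhengxiao et al.; Progress in Geography; 20200728 (No. 07); full text *
Retrieval of land precipitation from Tropical Rainfall Measuring Mission microwave imager observations; Li Wanbiao et al.; Acta Meteorologica Sinica; 20011020 (No. 05); full text *
Analysis of rainstorm patterns and their trends in the watersheds of Shenzhen; Chai Yuanyuan et al.; Technical Supervision in Water Resources; 20181128 (No. 06); full text *
Further study on classifying large-scale anomalous rain patterns in phase space; Ren Hongli et al.; Acta Meteorologica Sinica; 20050420 (No. 02); full text *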

Also Published As

Publication number Publication date
CN112508237A (en) 2021-03-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant