CN115453662B - Abnormal site screening method combining time dimension and space dimension - Google Patents


Info

Publication number
CN115453662B
Authority
CN
China
Prior art keywords
abnormal
station
stations
rainfall
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210984314.4A
Other languages
Chinese (zh)
Other versions
CN115453662A (en
Inventor
田济扬
刘荣华
刘含影
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Institute of Water Resources and Hydropower Research
Original Assignee
China Institute of Water Resources and Hydropower Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Institute of Water Resources and Hydropower Research filed Critical China Institute of Water Resources and Hydropower Research
Priority to CN202210984314.4A priority Critical patent/CN115453662B/en
Publication of CN115453662A publication Critical patent/CN115453662A/en
Application granted granted Critical
Publication of CN115453662B publication Critical patent/CN115453662B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/14 Rainfall or precipitation gauges
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hydrology & Water Resources (AREA)
  • Engineering & Computer Science (AREA)
  • Atmospheric Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Ecology (AREA)
  • Environmental Sciences (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention relates to an abnormal site screening method combining the time dimension and the space dimension, comprising the following steps: step 1, preliminarily determining reference stations from annual rainfall observation series using the Hampel method; step 2, re-evaluating the reference stations using the Grubbs criterion; step 3, once the reference stations are determined, identifying abnormal sites from hourly rainfall monitoring data using a peripheral station analysis method; and step 4, radar-assisted verification of the abnormal sites. The method enables anomaly identification and rapid processing of data from large-scale rainfall monitoring networks, with an anomaly identification rate above 95 percent, and provides an accurate and reliable basis for rainstorm and flood risk alerts and early warning.

Description

Abnormal site screening method combining time dimension and space dimension
Technical Field
This application is a divisional application of the patent application filed on November 26, 2021, with application number 202111419496.2, entitled "Screening method for abnormal sites in large-scale rainfall monitoring".
The invention relates to a method for screening abnormal sites in large-scale rainfall monitoring. It belongs to the field of hydrometeorology and is mainly used to provide accurate and reliable rainfall monitoring information for rainstorm and flood risk alerts and early warning.
Background
Rainfall monitoring is an important component of hydrological monitoring and serves as the eyes, ears, and adviser for defending against rainstorm and flood disasters. Since the beginning of the 21st century, water conservancy departments have increased their support for the construction of automatic monitoring stations. In particular, through mountain torrent disaster prevention and control projects, China now has 132,000 automatic monitoring stations for mountain torrent disasters, and the average density of the automatic rainfall station network is 38 km²/station. This is 22 times the number of stations in 2006 (6,000 stations). The minimum flood forecasting time step has been shortened to 10 min, the volume of data has increased more than 100-fold, blind areas in rainfall monitoring have been greatly reduced, and flood and drought disaster prevention work is strongly supported. However, because some stations were built to low standards and stations in hilly areas are difficult to operate and maintain, data quality is hard to guarantee. Abnormally large readings, missing measurements, and similar problems occur frequently. Because these problems arise randomly, it is impractical to discard a station entirely.
To make effective use of station monitoring data, the stations whose data are accurate in each time period need to be found among the many rainfall monitoring stations, and the stations with problematic data quality need to be excluded. The few anomaly identification methods that achieve high accuracy require a large number of distance calculations and comparisons and place extremely high demands on computing resources, whereas operational use places high demands on the stability, reliability, and efficiency of an abnormal site identification method. A large-scale abnormal site identification method that can be applied operationally is therefore urgently needed.
Disclosure of Invention
The invention provides a method for screening abnormal sites in large-scale rainfall monitoring. The technical problem it solves is as follows: a progressive abnormal site screening system is established based on the Hampel method, the Grubbs criterion, the peripheral station analysis method, and radar-assisted verification; computational efficiency is improved through the K-d tree (K-dimensional tree) data structure and parallel computing; and a reliable method is provided for identifying anomalies in large-scale rainfall monitoring data and making full use of the valid information from rainfall monitoring stations.
In order to solve the technical problems, the invention adopts the following scheme:
A method for screening abnormal sites in large-scale rainfall monitoring comprises the following steps:
step 1, using the Hampel method to find the abnormal years of a station from that station's rainfall time series, and preliminarily determining the reference stations in the time dimension;
step 2, using an improved Grubbs criterion for the spatial-dimension judgment: the rainfall of the peripheral stations is used to determine whether the station's rainfall in the abnormal year is really abnormal, and the reference stations are determined again;
step 3, after the reference stations are determined, identifying abnormal sites from hourly rainfall monitoring data using a peripheral station analysis method;
step 4, radar-assisted verification of the abnormal sites.
In this way a multi-scale spatio-temporal abnormal site screening system of "annual-scale preliminary screening, hourly-scale fine judgment, hourly-scale verification" is established.
Further, steps 1 and 2 provide a "space-time dimension + precedence rule" method for determining reference stations based on rainfall series: among the large number of stations, the abnormal years of a station are first found from that station's rainfall time series, and whether the station is actually abnormal is then determined from the rainfall of its peripheral stations. This approach suits the current situation in which most stations in China have short annual observation series, and it improves accuracy.
Further, in step 1 the Hampel method is used to preliminarily screen out stations with stable rainfall monitoring and relatively high data quality. First, based on the annual rainfall values of stations with long series, the Hampel method is used to identify the abnormal years in each individual station's monitoring data. The Hampel method can be used to detect abnormal extreme values. Its basic principle is to assume a distribution and probability model for a given data set and then apply a discordancy test to the data series under that assumption. The formula is as follows:
Figure GDA0004078859790000031
where $x_i$ is a value in the data series $X$, median is the median of $X$, and MAD (median absolute deviation) is the median of the data set $Y$; $X = \{x_1, x_2, \ldots, x_n\}$ is the station's annual rainfall data series, and $Y = \{y_1, y_2, \ldots, y_n\} = \{|x_1 - \mathrm{median}|, |x_2 - \mathrm{median}|, \ldots, |x_n - \mathrm{median}|\}$.
Further, in step 1 the Hampel method is used for the time-dimension judgment, and for stations whose annual series is shorter than 10 years the threshold is set to 2.24. When $Z_i$ ($i = 1, 2, \ldots, n$) is greater than 2.24, $x_i$ is judged to be an anomalous point and year $i$ is an abnormal year for the station.
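The time-dimension test can be sketched as follows. The formula image is not reproduced in this text, so the sketch assumes the score has the form $Z_i = |x_i - \mathrm{median}| / \mathrm{MAD}$ with no extra scaling constant; the function names and the sample data are hypothetical and are not taken from the patent.

```python
from statistics import median


def hampel_scores(values):
    """Return Z_i = |x_i - median| / MAD for each value (assumed form of the score)."""
    med = median(values)
    abs_dev = [abs(x - med) for x in values]
    mad = median(abs_dev)
    if mad == 0:
        # Degenerate series: at least half the values equal the median.
        return [0.0 if d == 0 else float("inf") for d in abs_dev]
    return [d / mad for d in abs_dev]


def anomalous_years(years, annual_rain_mm, threshold=2.24):
    """Years whose score exceeds the threshold (2.24 for series shorter than 10 years)."""
    scores = hampel_scores(annual_rain_mm)
    return [year for year, z in zip(years, scores) if z > threshold]


# Hypothetical 8-year series with one implausibly low annual total.
years = [2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
rain = [1510.0, 1620.0, 1580.0, 1490.0, 420.0, 1660.0, 1550.0, 1600.0]
print(anomalous_years(years, rain))  # [2017]
```

Because both the centre and the spread are medians, a single extreme year barely shifts either of them, which is why the test remains usable on short series.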
In step 1, taking into account the area of the study region and the distribution of stations, as well as the distance between the station being evaluated and its peripheral stations, the target region is partitioned by longitude and latitude. When searching for stations within a 20 km range, the computer program only needs to search within one partition, and all partitions of the target region are searched and compared simultaneously. A station that has been judged abnormal does not take part as a peripheral station in the subsequent evaluation of other stations, which reduces the adverse effect of abnormal values on the result.
Because a large number of stations were built at around the same time and their data series are short, judging a station's abnormal years from the time dimension with the Hampel method alone is not sufficiently reliable.
Further, in step 2 an improved Grubbs criterion is used for the spatial-dimension judgment: an influence area is defined with the station flagged for an abnormal year as its centre and 20 km as its radius, and the station's abnormal year is verified within that area.
Further, the Grubbs criterion in step 2 is applicable when the number of measurements is small ($3 \le n < 100$), and it can identify several abnormal values in one pass. Its basic principle is that the two most important parameters of the normal distribution, the mean and the variance, are introduced when judging whether a suspicious value should be retained, which improves the accuracy of the judgment. Because the time series of annual accumulated rainfall at a single station is short, the judgment needs to be made in the spatial dimension with the help of peripheral stations in order to improve the accuracy of abnormal value detection, and the Grubbs criterion is suitable for this.
The improved Grubbs criterion replaces the mean in the original formula with the median. This effectively eliminates the masking effect of abnormal values on the same side and makes the procedure more robust. The method is as follows:
First, a station that the Hampel method has preliminarily flagged as abnormal in a certain year is selected, and an area with a radius of 20 km is drawn around it; there are about 50 stations in the area. The annual rainfall values of all stations in the area form the sample, which is sorted in ascending order to give the sample sequence $X = (x_1, x_2, \ldots, x_n)$. The critical value $G_0 = G(a, n)$ is obtained from a table of critical values, where $a$ is the significance level, taken as 0.05. Then $G_1$ and $G_n$ are calculated:
$G_1 = (X_{\mathrm{med}} - x_1)/\sigma$
$G_n = (x_n - X_{\mathrm{med}})/\sigma$
where $n$ is the number of stations, $X_{\mathrm{med}}$ is the median of the sample, and $\sigma$ is the standard deviation.
If $G_1 \ge G_n$ and $G_1 > G_0$, then $x_1$ is judged to be an abnormal value and is removed; if $G_n \ge G_1$ and $G_n > G_0$, then $x_n$ is judged to be an abnormal value and is removed; if $G_1 < G_0$ and $G_n < G_0$, no abnormal value exists. If an abnormal value is found, it is removed, the statistics are recalculated from the annual rainfall values of the remaining stations, and the steps are repeated until no abnormal value remains.
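The iterative procedure above can be illustrated with a minimal sketch. Several points are assumptions rather than statements from the patent: $\sigma$ is taken as the sample standard deviation, the critical value is supplied by a caller-provided lookup function, and the small table in the example holds commonly tabulated one-sided values for a = 0.05 rounded to two decimals, which may differ from the table the inventors used. All names and data are hypothetical.

```python
from statistics import median, stdev


def modified_grubbs(samples, critical_value):
    """Iteratively remove outliers with the median-centred Grubbs statistic.

    samples: dict mapping station id -> annual rainfall (mm)
    critical_value: function n -> G0 = G(a, n), e.g. a table lookup (assumption)
    Returns the ids of removed stations in order of removal.
    """
    remaining = dict(samples)
    removed = []
    while len(remaining) >= 3:
        ordered = sorted(remaining.items(), key=lambda item: item[1])
        values = [v for _, v in ordered]
        sigma = stdev(values)  # sample standard deviation (assumption)
        if sigma == 0:
            break
        med = median(values)
        g0 = critical_value(len(values))
        g1 = (med - values[0]) / sigma    # statistic for the smallest value
        gn = (values[-1] - med) / sigma   # statistic for the largest value
        if g1 >= gn and g1 > g0:
            station = ordered[0][0]
        elif gn >= g1 and gn > g0:
            station = ordered[-1][0]
        else:
            break  # neither extreme exceeds G0: no outlier remains
        removed.append(station)
        del remaining[station]
    return removed


# Illustrative critical values for a = 0.05 (assumed table, two decimals).
G0_TABLE = {3: 1.15, 4: 1.46, 5: 1.67, 6: 1.82, 7: 1.94, 8: 2.03, 9: 2.11, 10: 2.18}

# Hypothetical annual totals for 8 stations inside one 20 km influence area.
area = {"A": 1500.0, "B": 1520.0, "C": 1480.0, "D": 1550.0,
        "E": 1510.0, "F": 1490.0, "G": 1530.0, "H": 300.0}
print(modified_grubbs(area, lambda n: G0_TABLE[n]))  # ['H']
```

With about 50 stations per influence area, as described above, the lookup function would need critical values for correspondingly larger n.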
To improve computational efficiency and shorten the time the program spends evaluating the distances between a station being evaluated and its peripheral stations, the target region is partitioned by longitude and latitude when the improved Grubbs criterion is applied in step 2. The partitioning takes into account the area of the study region, the distribution of stations, and the distance between the station being evaluated and its peripheral stations. When the computer program searches for stations within a 20 km range, it searches within one partition only, and all partitions are searched and compared simultaneously. A station that has been judged abnormal does not take part as a peripheral station in the subsequent evaluation of other stations, which reduces the adverse effect of abnormal values on the result.
For example, the target region, Fujian Province, is divided into 7 partitions by longitude and latitude. When the program searches for stations within a 20 km range, it searches within one partition only, and the 7 partitions covering the whole province are searched and compared simultaneously. A station that has been judged abnormal does not take part as a peripheral station in the subsequent evaluation of other stations, which reduces the adverse effect of abnormal values on the result.
After the joint judgment by the Hampel method and the Grubbs criterion, the preliminary determination of the reference stations is complete.
Further, in step 3 a peripheral station analysis method is used to identify anomalies in the hourly rainfall monitoring data. In the analysis, reference stations are preferentially selected for same-period rainfall comparison with the station being evaluated. When the reference stations are far from the station being evaluated (beyond a certain threshold), non-reference stations that have already been evaluated as normal are used for the same-period comparison instead. To guard against a preliminarily determined reference station reporting faulty rainfall at a particular moment, the preliminarily determined reference stations continue to serve as correct monitoring stations, but the stations to be evaluated also include the reference stations. Whether a station is abnormal is judged by comparing its rainfall with the average rainfall of the peripheral reference stations (or of non-reference stations that have passed evaluation) over the same period.
In step 3, when the hourly rainfall monitoring data are judged, the peripheral station analysis method is applied only to stations whose rainfall exceeds 10 mm in 1 h, 3 h, or 6 h, exceeds 15 mm in 12 h, or exceeds 25 mm in 24 h. During evaluation, the rainfall of the station being evaluated over each of the 1 h, 3 h, 6 h, 12 h, and 24 h periods is compared with the average rainfall of the peripheral stations over the corresponding period, and the station is judged abnormal when the two differ by more than one grade. The rainfall grades are shown in the following table:
Figure GDA0004078859790000051
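The trigger thresholds and the grade comparison described above can be sketched as follows. The grade table is an image that is not reproduced in this text, so the grade boundaries in `HYPOTHETICAL_GRADE_BOUNDS_MM` are invented placeholders and must be replaced with the values from the table; the function names are also hypothetical. Only the trigger thresholds (10 mm, 15 mm, 25 mm) and the "more than one grade" rule come from the description.

```python
from bisect import bisect_right
from statistics import mean

# Trigger thresholds from the description: duration (h) -> rainfall (mm).
TRIGGER_MM = {1: 10.0, 3: 10.0, 6: 10.0, 12: 15.0, 24: 25.0}

# PLACEHOLDER grade boundaries (mm) per duration; NOT the patent's table.
HYPOTHETICAL_GRADE_BOUNDS_MM = {
    1: [2.5, 8.0, 16.0, 30.0],
    3: [3.0, 10.0, 20.0, 50.0],
    6: [4.0, 13.0, 25.0, 60.0],
    12: [5.0, 15.0, 30.0, 70.0],
    24: [10.0, 25.0, 50.0, 100.0],
}


def needs_evaluation(station_mm):
    """True if any accumulation exceeds its trigger threshold."""
    return any(station_mm.get(h, 0.0) > limit for h, limit in TRIGGER_MM.items())


def grade(value_mm, bounds):
    """Index of the grade that contains value_mm (0 = lowest grade)."""
    return bisect_right(bounds, value_mm)


def is_abnormal(station_mm, peers_mm, bounds_by_hours=HYPOTHETICAL_GRADE_BOUNDS_MM):
    """Compare the station with the peer average for every duration present.

    station_mm: dict duration (h) -> accumulated rainfall at the station
    peers_mm: list of such dicts for peripheral reference stations
    """
    if not needs_evaluation(station_mm):
        return False
    for hours, bounds in bounds_by_hours.items():
        if hours not in station_mm:
            continue
        peer_values = [p[hours] for p in peers_mm if hours in p]
        if not peer_values:
            continue
        gap = abs(grade(station_mm[hours], bounds) - grade(mean(peer_values), bounds))
        if gap > 1:  # differs by more than one grade
            return True
    return False


peers = [{1: 3.0}, {1: 4.0}, {1: 2.0}]
print(is_abnormal({1: 35.0}, peers))  # True
print(is_abnormal({1: 5.0}, peers))   # False
```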
To select a suitable evaluation range, ranges from 5 km to 30 km around the station being evaluated were tested. For radii of 5 km, 10 km, 15 km, 20 km, 25 km, and 30 km, the average number of peripheral stations is 4, 13, 30, 50, 78, and 115 respectively. The identification accuracy for abnormal sites and the computation time were recorded for each radius.
In step 3, when the hourly rainfall monitoring data are judged, the relationship between the optimal radius $R$ of the analysis range of the peripheral station analysis method and the station network density $p$ is $p = 0.0267R^2 + 0.4667R + 12$, where $R$ lies in [5 km, 30 km] and $p$ lies in [15 km²/station, 50 km²/station]. Since the density of the rainfall monitoring network in Fujian Province is about 25 km²/station, the radius of the analysis range is set to 15 km, and the average number of stations within the analysis range is 30.
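As a worked check of the relationship above, the quadratic can be inverted to obtain the radius for a given network density. This is a small sketch with hypothetical function names; clamping the result to the stated range of $R$ is an added assumption.

```python
from math import sqrt

# Coefficients of p = A*R^2 + B*R + C as given in the description.
A, B, C = 0.0267, 0.4667, 12.0


def density_for_radius(radius_km):
    """Network density p (km^2 per station) associated with radius R (km)."""
    return A * radius_km ** 2 + B * radius_km + C


def optimal_radius_km(density_km2_per_station):
    """Positive root of A*R^2 + B*R + (C - p) = 0, clamped to [5, 30] km."""
    p = density_km2_per_station
    if not 15.0 <= p <= 50.0:
        raise ValueError("density outside the stated range of 15 to 50 km^2/station")
    discriminant = B * B - 4.0 * A * (C - p)
    radius = (-B + sqrt(discriminant)) / (2.0 * A)
    return min(30.0, max(5.0, radius))


print(round(optimal_radius_km(25.0), 1))   # 15.0
print(round(density_for_radius(15.0), 1))  # 25.0
```

For a density of 25 km²/station the root is about 14.99 km, which agrees with the 15 km radius chosen for Fujian Province.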
Fujian Province has a large number of rainfall monitoring stations at high density, so the volume of rainfall monitoring data is very large. To identify abnormal sites in real time, the K-d tree (K-dimensional tree) data structure and parallel computing were adopted, which greatly improves computational efficiency. In testing, one round of anomaly identification for all stations in the province takes about 5 to 8 min.
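The patent does not disclose its K-d tree implementation, so the following is only a minimal sketch of how a two-dimensional K-d tree supports a fixed-radius neighbour search. It assumes station positions have already been projected to planar coordinates in kilometres; the function names and the sample stations are hypothetical.

```python
from math import hypot


def build_kdtree(points, depth=0):
    """Build a 2-d tree from (x_km, y_km, station_id) tuples.

    Each node is (point, left_subtree, right_subtree); an empty subtree is None.
    """
    if not points:
        return None
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (ordered[mid],
            build_kdtree(ordered[:mid], depth + 1),
            build_kdtree(ordered[mid + 1:], depth + 1))


def query_radius(node, centre, radius_km, depth=0, found=None):
    """Collect ids of all stations within radius_km of centre (x_km, y_km)."""
    if found is None:
        found = []
    if node is None:
        return found
    point, left, right = node
    axis = depth % 2
    if hypot(point[0] - centre[0], point[1] - centre[1]) <= radius_km:
        found.append(point[2])
    offset = centre[axis] - point[axis]
    # Descend into a side only if the search circle can reach it.
    if offset <= radius_km:
        query_radius(left, centre, radius_km, depth + 1, found)
    if offset >= -radius_km:
        query_radius(right, centre, radius_km, depth + 1, found)
    return found


# Hypothetical stations in planar km coordinates.
stations = [(0.0, 0.0, "A"), (10.0, 0.0, "B"), (0.0, 14.0, "C"),
            (21.0, 0.0, "D"), (15.0, 15.0, "E"), (-12.0, -9.0, "F")]
tree = build_kdtree(stations)
print(sorted(query_radius(tree, (0.0, 0.0), 20.0)))  # ['A', 'B', 'C', 'F']
```

The pruning test on `offset` is what avoids comparing every pair of stations. One tree per longitude and latitude partition could then be queried concurrently, for example with Python's `concurrent.futures`, which would correspond to the partitioned parallel search described in the text.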
In steps 1 to 3, the "space-time dimension + precedence rule" method for determining reference stations based on annual rainfall series first finds the abnormal years of a station from that station's rainfall time series and then determines from the rainfall of the peripheral stations whether the station is actually abnormal. By partitioning the target region and using the K-d tree (K-dimensional tree) data structure and parallel computing within each partition, computational efficiency is greatly improved.
In step 4, the screened abnormal sites are further verified by radar-assisted verification to confirm them. After the preliminary determination of reference stations and the peripheral station analysis, the screening of abnormal sites is essentially complete. However, manual verification showed that stations located at the boundary between rain and no-rain areas, or at rain-area boundaries where rain intensity differs greatly, are easily judged abnormal even though they are reporting normally, so the screening result needs further verification. Although the accuracy of radar rainfall retrieval is affected by factors such as the retrieval algorithm, reflectivity can be used to judge whether it is raining within the radar coverage area and the magnitude of the rainfall, and it reflects the spatial distribution of rainfall in a given period well. Radar-assisted verification of the preliminary abnormal site screening result is therefore reasonable.
Further, the radar-assisted verification in step 4 screens for normally reporting stations located at the boundary between rain and no-rain areas or at rain-area boundaries with large differences in rain intensity. There are three screening conditions, all of which are indispensable. First, whether the radar reflectivity at a low elevation angle exceeds a threshold of 20 dBZ is used to verify the result for stations at the boundary between rain and no-rain areas. Second, rain intensity is retrieved from radar base data and compared with the rainfall magnitude at the station to verify whether the station is abnormal. Third, the spatial gradient of reflectivity is compared with the gradient observed by the rainfall stations to verify the result for stations at rain-area boundaries with large differences in rain intensity: the station is judged abnormal when the reflectivity gradient exceeds the station-observed gradient by a factor of 1, and also when it is smaller than the station-observed gradient by a factor of 1.
Joint judgment based on the Hampel method, the Grubbs criterion, the peripheral station analysis method, and other methods gives high accuracy in abnormal site identification. By establishing a progressive abnormal site screening system, the invention improves the identification accuracy for large-scale abnormal sites, which is important for making full use of the valid information from rainfall monitoring stations and eliminating invalid information.
The method for screening abnormal sites in large-scale rainfall monitoring has the following beneficial effects:
(1) The Hampel method judges the abnormal years of a station's monitoring data in the time dimension, and the improved Grubbs criterion makes the judgment in the spatial dimension with the help of peripheral stations. This "space-time dimension + precedence rule" approach based on annual rainfall series suits the current situation in which most stations in China have short annual observation series, and it improves accuracy.
(2) When the improved Grubbs criterion is used for the spatial-dimension judgment, the study region is partitioned according to its area and the distribution of stations. Only one partition is searched when looking for peripheral stations, and a station that has been judged abnormal does not take part as a peripheral station in the subsequent evaluation of other stations. This improves both computational efficiency and accuracy.
(3) The invention establishes a progressive abnormal site screening system based on the Hampel method, the Grubbs criterion, the peripheral station analysis method, radar-assisted verification, and other methods. This improves the accuracy and stability of anomaly identification for data from large-scale rainfall monitoring stations and provides an accurate and reliable basis for rainstorm and flood risk alerts and early warning.
(4) The radar-assisted verification adopted in the invention screens out normal stations that were misjudged as abnormal, namely normally reporting stations located at the boundary between rain and no-rain areas or at rain-area boundaries with large differences in rain intensity. There are three screening conditions, all of which are indispensable, and this improves the accuracy of anomaly identification.
Description of the figures (tables)
FIG. 1: flow chart of the large-scale rainfall monitoring abnormal site screening method of the invention.
FIG. 2: relationship between distance and computation time/accuracy.
FIG. 3: number and proportion of abnormal sites in 2015-2020 as determined by the invention.
FIG. 4: the invention relates to the number and proportion of abnormal sites at 27 days and 4 days in 6 months and 28 days.
Detailed Description
The technical scheme adopted by the invention establishes a progressive abnormal site screening system based on the Hampel method, the Grubbs criterion, the peripheral station analysis method, and radar-assisted verification. Computational efficiency is improved through the K-d tree (K-dimensional tree) data structure and parallel computing, and a reliable method is provided for identifying anomalies in large-scale rainfall monitoring data and making full use of the valid information from rainfall monitoring stations.
A method for screening abnormal sites in large-scale rainfall monitoring comprises the following steps:
step 1, using the Hampel method to find the abnormal years of a station from that station's rainfall time series, and preliminarily determining the reference stations in the time dimension;
step 2, using an improved Grubbs criterion for the spatial-dimension judgment: the rainfall of the peripheral stations is used to determine whether the station's rainfall in the abnormal year is really abnormal, and the reference stations are determined again;
step 3, after the reference stations are determined, identifying abnormal sites from hourly rainfall monitoring data using a peripheral station analysis method;
step 4, radar-assisted verification of the abnormal sites.
In this way a multi-scale spatio-temporal abnormal site screening system of "annual-scale preliminary screening, hourly-scale fine judgment, hourly-scale verification" is established.
Further, steps 1 and 2 provide a "space-time dimension + precedence rule" method for determining reference stations based on rainfall series: among the large number of stations, the abnormal years of a station are first found from that station's rainfall time series, and whether the station is actually abnormal is then determined from the rainfall of its peripheral stations. This approach suits the current situation in which most stations in China have short annual observation series, and it improves accuracy.
Further, in step 1 the Hampel method is used to preliminarily screen out stations with stable rainfall monitoring and relatively high data quality. First, based on the annual rainfall values of stations with long series, the Hampel method is used to identify the abnormal years in each individual station's monitoring data. The Hampel method can be used to detect abnormal extreme values. Its basic principle is to assume a distribution and probability model for a given data set and then apply a discordancy test to the data series under that assumption. The formula is as follows:
Figure GDA0004078859790000091
where $x_i$ is a value in the data series $X$, median is the median of $X$, and MAD (median absolute deviation) is the median of the data set $Y$; $X = \{x_1, x_2, \ldots, x_n\}$ is the station's annual rainfall data series, and $Y = \{y_1, y_2, \ldots, y_n\} = \{|x_1 - \mathrm{median}|, |x_2 - \mathrm{median}|, \ldots, |x_n - \mathrm{median}|\}$.
In step 1 the Hampel method is used for the time-dimension judgment, and for stations whose annual series is shorter than 10 years the threshold is set to 2.24. When $Z_i$ ($i = 1, 2, \ldots, n$) is greater than 2.24, $x_i$ is judged to be an anomalous point and year $i$ is an abnormal year for the station.
Step 1, according to the factors of the area of a research area and the distribution condition of the measuring stations, on the basis of considering the distance between a station to be measured and a peripheral station, a target is partitioned through longitude and latitude, when the measuring stations within the range of 20km are searched, a computer program only needs to search in one area, the areas of the target partition are searched and compared at the same time, the measuring stations which are judged to be abnormal do not participate in comparison as the peripheral measuring stations when the measuring stations are judged in succession, and the adverse effect of partial abnormal values on the judgment effect is reduced.
Considering that the construction age of a large number of stations is relatively close, the data sequence is short, the abnormal year of the monitoring data of the stations is judged from the time dimension only by the Hampel method, and the reliability is not enough.
Further, step 2, adopting an improved Grubbs criterion to carry out spatial dimension judgment, using the sites with abnormal years as a center, using 20km as a radius to define an influence area, and verifying the abnormal years of the stations.
Furthermore, the Grabbs criterion in step 2 is applied to the case of a small number of measurements (n <100 > is not less than 3), and a plurality of abnormal values can be obtained at one time. The basic principle is that two most important parameters, namely the average value and the variance in normal distribution, are introduced in the process of judging whether suspicious values are acceptable, so that the judgment accuracy is improved. Considering that the time sequence of the annual accumulated rainfall of a single station is short, in order to improve the accuracy of abnormal value judgment, judgment needs to be made from the space dimension by means of peripheral stations, and the Grabbs criterion judgment method is applicable. The improved Grabbs criterion is to replace the average value in the original criterion formula by a median, can effectively eliminate the shielding effect of the ipsilateral abnormal value, and is a more stable processing method, and the method comprises the following steps:
First, a station preliminarily determined by the Hampel method to be abnormal in a certain year is selected, and an area with a radius of 20 km is delimited around it, containing about 50 stations. The annual rainfall values of all stations in the area form the sample, which is sorted in ascending order to obtain the sample sequence $X = (x_1, x_2, \ldots, x_n)$. The value $G_0$ of the critical coefficient $G(a, n)$ is obtained from a critical value table, where $a$ is the significance level, taken as 0.05. $G_1$ and $G_n$ are then calculated:
$G_1 = (X_{med} - x_1)/\sigma$
$G_n = (x_n - X_{med})/\sigma$
where $n$ is the number of stations, $X_{med}$ is the median of the sample, and $\sigma$ is the standard deviation.
If $G_1 \ge G_n$ and $G_1 > G_0$, then $x_1$ is judged to be an abnormal value and is removed; if $G_n \ge G_1$ and $G_n > G_0$, then $x_n$ is an abnormal value and is removed; if $G_1 < G_0$ and $G_n < G_0$, no abnormal value exists. If an abnormal value exists, it is removed, the calculation is repeated on the annual rainfall values of the remaining stations, and the steps are repeated until no abnormal value remains.
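The iteration described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the critical value is supplied by the caller as a function of n (the text takes it from a table at a significance level of 0.05, which is not reproduced here), the sample standard deviation is used because the text does not say which form is meant, and the names `modified_grubbs` and `critical_value` are hypothetical.

```python
import statistics


def modified_grubbs(values, critical_value):
    """Iteratively remove outliers using the median-based Grubbs statistics.

    values:         annual rainfall of all stations in the 20 km area.
    critical_value: callable n -> G0 (assumed to come from a lookup table).
    Returns (outliers, remaining), with `remaining` sorted ascending.
    """
    remaining = sorted(values)
    outliers = []
    while len(remaining) >= 3:
        med = statistics.median(remaining)
        sigma = statistics.stdev(remaining)  # sample form: an assumption
        if sigma == 0:
            break
        g0 = critical_value(len(remaining))
        g1 = (med - remaining[0]) / sigma    # smallest value vs. median
        gn = (remaining[-1] - med) / sigma   # largest value vs. median
        if g1 >= gn and g1 > g0:
            outliers.append(remaining.pop(0))
        elif gn >= g1 and gn > g0:
            outliers.append(remaining.pop())
        else:
            break                            # no further outlier
    return outliers, remaining


sample = [1000, 1010, 1020, 1030, 1040, 3000]
# The constant 1.8 is a placeholder, not a value taken from a Grubbs table.
print(modified_grubbs(sample, lambda n: 1.8))
```

Because the median and standard deviation are recomputed after each removal, a second outlier that was masked by the first can still be detected on the next pass.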
To improve computational efficiency and shorten the time the program spends determining the distance between a station to be judged and its peripheral stations, the target area is partitioned by longitude and latitude when the improved Grubbs criterion is applied for the spatial-dimension judgment in step 2. The partition is based on the size of the study area and the distribution of the stations, and takes into account the distance between the station to be judged and its peripheral stations. When the computer program searches for stations within 20 km, it searches within one partition only, and all partitions of the target area are searched and compared simultaneously. A station that has been judged abnormal does not take part in the comparison as a peripheral station when other stations are subsequently judged, which reduces the adverse effect of abnormal values on the judgment.
For example, the target area, Fujian Province, is divided into 7 partitions by longitude and latitude. When the program searches for stations within 20 km, it searches within one partition only, and the 7 partitions covering the whole province are searched and compared simultaneously. A station that has been judged abnormal does not take part in the comparison as a peripheral station when other stations are subsequently judged, which reduces the adverse effect of abnormal values on the judgment.
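A partition-restricted neighbour search of this kind might look like the sketch below. It is illustrative only: the station record layout `(lat, lon, region)`, the function names, and the use of the haversine great-circle distance are assumptions, and the parallel processing of partitions is not shown.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def group_by_region(stations):
    """stations: dict id -> (lat, lon, region). Returns region -> [ids]."""
    regions = defaultdict(list)
    for sid, (_, _, region) in stations.items():
        regions[region].append(sid)
    return regions


def peripheral_stations(stations, regions, target, radius_km=20.0,
                        abnormal=frozenset()):
    """Stations in the target's own partition within radius_km,
    skipping stations already judged abnormal."""
    lat0, lon0, region0 = stations[target]
    found = []
    for sid in regions[region0]:          # only one partition is scanned
        if sid == target or sid in abnormal:
            continue
        lat, lon, _ = stations[sid]
        if haversine_km(lat0, lon0, lat, lon) <= radius_km:
            found.append(sid)
    return sorted(found)


stations = {
    "A": (26.00, 119.0, 1), "B": (26.10, 119.0, 1), "C": (26.50, 119.0, 1),
    "D": (26.05, 119.0, 2), "E": (26.00, 119.1, 1),
}
regions = group_by_region(stations)
print(peripheral_stations(stations, regions, "A", abnormal={"E"}))
```

One consequence of searching a single partition is that a station near a partition edge ignores close neighbours on the other side (station "D" here), which is the trade-off accepted for speed.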
After the joint judgment by the Hampel method and the Grubbs criterion, the preliminary determination of the reference stations is complete.
Further, in step 3, a peripheral station analysis method is adopted to identify abnormalities in the hourly rainfall monitoring data. When peripheral stations are used for analysis, reference stations are preferred for comparing rainfall over the same period with the station to be evaluated. When the reference stations are far from the station to be evaluated (beyond a certain threshold), non-reference stations that have already been evaluated as normal are used for the same-period comparison instead. Because a preliminarily determined reference station may still have rainfall monitoring problems at a particular moment even though it is treated as a correct station, the stations to be evaluated also include the reference stations themselves. Whether a station is abnormal is judged by comparing its rainfall with the average rainfall of the peripheral reference stations (or of the non-reference stations evaluated as normal) over the same period.
In step 3, when the hourly rainfall monitoring data are judged, the peripheral station analysis method evaluates only stations whose rainfall exceeds 10 mm in 1 h, 3 h, or 6 h, exceeds 15 mm in 12 h, or exceeds 25 mm in 24 h. During evaluation, the rainfall of the station to be evaluated over the 1 h, 3 h, 6 h, 12 h, and 24 h periods is compared with the average rainfall of the peripheral stations over the corresponding period, and the station is judged abnormal when the difference in rainfall exceeds one grade. The rainfall grades are shown in the following table:
Figure GDA0004078859790000121
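The screening and grade comparison can be sketched as below. The trigger thresholds (10 mm, 15 mm, 25 mm) come from the text; the grade boundaries do not, because the grade table is an image that is not reproduced here, so `upper_bounds` is a caller-supplied placeholder. Reading "exceeds one grade" as a grade difference greater than 1 is also an assumption, as are all function names.

```python
from bisect import bisect_right

# Accumulation period (h) -> rainfall (mm) that triggers evaluation.
TRIGGERS_MM = {1: 10.0, 3: 10.0, 6: 10.0, 12: 15.0, 24: 25.0}


def needs_evaluation(accum_mm):
    """accum_mm: dict period in hours -> accumulated rainfall in mm."""
    return any(accum_mm.get(hours, 0.0) > limit
               for hours, limit in TRIGGERS_MM.items())


def grade(rain_mm, upper_bounds):
    """Index of the rainfall grade, given ascending grade boundaries."""
    return bisect_right(upper_bounds, rain_mm)


def is_abnormal(station_mm, neighbour_mm, upper_bounds):
    """Compare the station's grade with the grade of the neighbour mean."""
    mean = sum(neighbour_mm) / len(neighbour_mm)
    return abs(grade(station_mm, upper_bounds) - grade(mean, upper_bounds)) > 1


bounds = [10.0, 25.0, 50.0, 100.0]  # placeholder boundaries, not the patent's
print(needs_evaluation({1: 12.0}), is_abnormal(60.0, [5.0, 8.0, 6.0], bounds))
```

In practice one set of boundaries per accumulation period would be passed in, since rainfall grades are normally defined separately for each period.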
To select a suitable evaluation range, radii from 5 km to 30 km around the station to be evaluated were tested. At radii of 5 km, 10 km, 15 km, 20 km, 25 km, and 30 km, the average number of peripheral stations of the station to be evaluated is 4, 13, 30, 50, 78, and 115, respectively. The identification accuracy for abnormal stations and the computation time were recorded for each radius.
In step 3, when the hourly rainfall monitoring data are judged, the relation between the optimal radius R of the analysis range of the peripheral station analysis method and the monitoring network density p is $p = 0.0267R^2 + 0.4667R + 12$, where R ranges over [5 km, 30 km] and p ranges over [15 km²/station, 50 km²/station]. The network density of rainfall monitoring stations in Fujian Province is about 25 km²/station; at a radius of 15 km the accuracy reaches its maximum and the computation time is moderate (Fig. 2). Therefore, when peripheral stations are used for analysis, the radius of the analysis range is set to 15 km, and the average number of stations within the range is 30.
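As a quick check of this relation, the quadratic can be inverted to obtain R from a given density p. The sketch below takes the positive root; the function name `optimal_radius_km` is hypothetical, and the coefficients are those stated in the text.

```python
import math

A, B, C = 0.0267, 0.4667, 12.0  # coefficients of p = A*R^2 + B*R + C


def optimal_radius_km(density):
    """Solve A*R^2 + B*R + (C - density) = 0 for the positive root R.

    density: network density p in km^2 per station, within [15, 50].
    """
    if not 15.0 <= density <= 50.0:
        raise ValueError("density outside the stated range [15, 50]")
    disc = B * B - 4.0 * A * (C - density)
    return (-B + math.sqrt(disc)) / (2.0 * A)


print(round(optimal_radius_km(25.0), 1))
```

For p = 25 km²/station the root is approximately 15 km, which agrees with the radius chosen for Fujian Province, and the endpoints p = 15 and p = 50 map to roughly 5 km and 30 km.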
Fujian Province has a large number of densely distributed rainfall monitoring stations, and the volume of rainfall monitoring data is huge. To calculate and judge abnormal stations in real time, a k-d tree (k-dimensional tree) data structure and a parallel computation method are adopted, which greatly improves computational efficiency. In tests, one round of abnormality identification for all stations in the province takes about 5 to 8 min.
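To illustrate why a k-d tree helps with the radius search, here is a minimal two-dimensional version in pure Python. It is a sketch, not the implementation used in the study: it assumes station coordinates already projected to planar kilometres, the function names are hypothetical, and parallel computation is not shown.

```python
def build_kdtree(points, depth=0):
    """Build a 2-D k-d tree from a list of (x_km, y_km) tuples."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def query_radius(node, centre, radius, found=None):
    """Collect every point within `radius` of `centre`."""
    if found is None:
        found = []
    if node is None:
        return found
    point, axis = node["point"], node["axis"]
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    if dx * dx + dy * dy <= radius * radius:
        found.append(point)
    diff = centre[axis] - point[axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    query_radius(near, centre, radius, found)
    # The far side can only contain hits if the splitting line is in reach.
    if abs(diff) <= radius:
        query_radius(far, centre, radius, found)
    return found


grid = [(x * 5.0, y * 5.0) for x in range(10) for y in range(10)]
tree = build_kdtree(grid)
print(len(query_radius(tree, (22.0, 22.0), 15.0)))
```

The pruning step is what saves time: whole subtrees are skipped when the splitting line lies farther from the query point than the search radius, so most stations are never compared.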
Further, in step 4, the screened abnormal stations are further checked by radar-assisted verification to confirm the abnormal stations. After the preliminary determination of the reference stations and the peripheral station analysis, the screening of abnormal stations is preliminarily complete. However, manual verification shows that normally reporting stations located at the boundary between rain and no-rain areas, or at rain-area boundaries with large differences in rain intensity, are easily judged abnormal, so the screening result still needs further verification. Although the accuracy of radar rainfall inversion is affected by factors such as the inversion algorithm, reflectivity can indicate whether rain is falling within the radar coverage area and how heavy it is, and it reflects well the spatial distribution of rainfall over a given period. Using radar to assist in verifying the preliminary screening result is therefore reasonable.
Further, step 4 adopts a radar-assisted verification method to screen out normally reporting stations located at the boundary between rain and no-rain areas and at rain-area boundaries with large differences in rain intensity. There are three screening conditions, all of which are indispensable. First, whether the low-elevation radar reflectivity exceeds the threshold of 20 dBZ is used to verify the judgment for stations at the boundary between rain and no-rain areas. Second, rain intensity is inverted from radar base data and compared with the rainfall magnitude observed at the station to verify whether the station is abnormal. Third, the judgment for stations at rain-area boundaries with large differences in rain intensity is verified by comparing the spatial gradient of reflectivity with the gradient observed by the rainfall stations: the station is judged abnormal when the reflectivity gradient exceeds the station-observed gradient by 1 time, and also when it is smaller than the station-observed gradient by 1 time.
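The first of the three conditions can be illustrated with a very small sketch. Only the 20 dBZ threshold comes from the text; how the comparison is combined with the gauge report is an assumption made here for illustration, and the names `radar_agrees` and `wet_mm` are hypothetical. The second and third conditions are not sketched because the text does not give the inversion relation or the gradient definitions.

```python
RAIN_DBZ_THRESHOLD = 20.0  # low-elevation reflectivity threshold from the text


def radar_agrees(low_elevation_dbz, gauge_mm, wet_mm=0.0):
    """True when radar and gauge agree on rain versus no rain.

    low_elevation_dbz: reflectivity over the station at a low elevation angle.
    gauge_mm:          rainfall reported by the station for the same period.
    wet_mm:            assumed gauge amount above which the station is "wet".
    """
    radar_wet = low_elevation_dbz > RAIN_DBZ_THRESHOLD
    gauge_wet = gauge_mm > wet_mm
    return radar_wet == gauge_wet


print(radar_agrees(35.0, 12.0), radar_agrees(5.0, 12.0))
```

Under this reading, a station flagged as abnormal near a rain/no-rain boundary would be cleared when the radar agrees with its report, and kept as abnormal otherwise.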
In this study, measured rainfall data from 5234 rainfall stations in Fujian Province for 2015 to 2021 were used to build a progressive abnormal station screening system based on the Hampel method, the Grubbs criterion, the peripheral station analysis method, and radar-assisted verification, and abnormality identification was carried out on the rainfall monitoring data. Abnormal stations for 2015 to 2020 were identified by the Hampel method and the Grubbs criterion. The proportion of abnormal stations was highest in 2015, at 11.5%, and then decreased year by year to only 5.18% in 2020 (Fig. 3).
From the abnormality results obtained by the peripheral station analysis method, the identification results at 08:00 on the 1st, 14:00 on the 10th, and 20:00 on the 30th of June, July, and August in each year from 2016 to 2020 were selected for manual verification, and the abnormality identification accuracy rates are all 90% (Table 1).
Table 1. Accuracy (%) of rainfall abnormality identification based on peripheral station analysis
Figure GDA0004078859790000141
To further verify the usability of the method, abnormal stations were judged in real time for the rainfall monitoring stations of Fujian Province on June 27-28, 2021 (Fig. 4). Because rainfall was widespread in Fujian Province and many stations were reporting from 19:00 on June 27 to 00:00 on June 28 and from 09:00 to 15:00 on June 28, the abnormality results at 6 moments were selected for verification against radar echoes. Normal stations misjudged as abnormal were screened out by the radar-assisted verification method: the average accuracy of the abnormality identification results was 89% before radar verification and rose to 95% after radar verification (Table 2). This shows that radar-assisted verification is well suited to cases in which normal stations located at the boundary between rain and no-rain areas, or at rain-area boundaries with large differences in rain intensity, are misjudged as abnormal.
Table 2. Accuracy (%) of abnormality identification for Fujian Province rainfall monitoring stations before and after radar verification at 6 moments
Figure GDA0004078859790000142
Joint judgment based on the Hampel method, the Grubbs criterion, and the peripheral station analysis method achieves high accuracy in identifying abnormal stations. By establishing a progressive abnormal station screening system, the invention improves the identification accuracy for abnormal stations at large scale, which is of great significance for making full use of the valid information from rainfall monitoring stations and eliminating invalid information.
The invention has been described above by way of example with reference to the accompanying drawings. Obviously, the implementation of the invention is not limited to the manner described above; various modifications that adopt the method concept and technical solution of the invention, or direct application of that concept and technical solution to other occasions without modification, fall within the protection scope of the invention.

Claims (3)

1. An abnormal site screening method combining the time dimension and the space dimension, comprising the following steps:
step 1, using the Hampel method, finding the abnormal years of a station from its rainfall time series, and preliminarily determining the reference stations in the time dimension;
in step 1, the Hampel method is adopted for the time-dimension judgment, and the threshold is set to 2.24 for stations whose sequence is shorter than 10 years; when
Figure QLYQS_1
is greater than 2.24, X_i is judged to be an abnormal point and i is the abnormal year of the station, as shown in the following formula:
Figure QLYQS_2
where X_i is a value of the data sequence X, Median is the median of X, and MAD (median absolute deviation) is the median of the data set Y;
Figure QLYQS_3
is the data sequence of annual rainfall of the station,
Figure QLYQS_4
Figure QLYQS_5
step 2, adopting an improved Grubbs criterion for the spatial-dimension judgment, determining, from the rainfall of the stations in the abnormal years, whether the peripheral stations are abnormal, and judging the reference stations again;
in step 2, when the improved Grubbs criterion is adopted for the spatial-dimension judgment, the target area is partitioned by longitude and latitude according to the size of the study area and the distribution of the stations, taking into account the distance between the station to be judged and its peripheral stations; when the computer program searches for stations within 20 km, it searches within one partition only, and the partitions of the target area are searched and compared simultaneously; a station that has been judged abnormal does not take part in the comparison as a peripheral station when other stations are subsequently judged, which reduces the adverse effect of abnormal values on the judgment; first, a station preliminarily judged by the Hampel method to be abnormal in a certain year is selected, an area with a radius of 20 km is delimited around it, containing about 50 stations, the annual rainfall values of all stations in the area form the sample, and the sample is sorted in ascending order to form the sample sequence
Figure QLYQS_6
the value $G_0$ of the critical coefficient $G(a, n)$ is obtained from a critical value table, where $a$ is the significance level, taken as 0.05, and $G_1$ and $G_n$ are calculated:
Figure QLYQS_7
where $n$ is the number of stations, $X_{med}$ is the median of the sample, and $\sigma$ is the standard deviation;
if $G_1 \ge G_n$ and $G_1 > G_0$, then $x_1$ is judged to be an abnormal value and is removed; if $G_n \ge G_1$ and $G_n > G_0$, then $x_n$ is an abnormal value and is removed; if $G_1 < G_0$ and $G_n < G_0$, no abnormal value exists; if an abnormal value exists, it is removed, the calculation is repeated on the annual rainfall values of the remaining stations, and the steps are repeated until no abnormal value remains;
step 3, after the reference stations are determined, judging abnormal sites by the peripheral station analysis method based on hourly rainfall monitoring data;
in step 3, when the hourly rainfall monitoring data are judged, the relation between the optimal radius R of the analysis range of the peripheral station analysis method and the monitoring network density p is $p = 0.0267R^2 + 0.4667R + 12$, where R ranges over [5 km, 30 km] and p ranges over [15 km²/station, 50 km²/station];
Step 4, radar-assisted checking of abnormal sites;
step 1 adopts the Hampel method to judge the abnormal years of the station monitoring data from the time dimension, and step 2 adopts the improved Grubbs criterion to judge from the space dimension with the help of peripheral stations;
the radar-assisted verification method in step 4 has three screening conditions, all of which are indispensable: first, whether the low-elevation radar reflectivity exceeds the threshold of 20 dBZ is used to verify the judgment for stations at the boundary between rain and no-rain areas; second, rain intensity is inverted from radar base data and compared with the rainfall magnitude observed at the station to verify whether the station is abnormal; third, the judgment for stations at rain-area boundaries with large differences in rain intensity is verified by comparing the spatial gradient of reflectivity with the gradient observed by the rainfall stations, and the station is judged abnormal when the reflectivity gradient is smaller than the station-observed gradient by 1 time.
2. The abnormal site screening method combining the time dimension and the space dimension according to claim 1, wherein: when the hourly rainfall monitoring data are judged in step 3, the peripheral station analysis method evaluates only stations whose rainfall exceeds 10 mm in 1 h, 3 h, or 6 h, exceeds 15 mm in 12 h, or exceeds 25 mm in 24 h; during evaluation, the rainfall of the station to be evaluated over the 1 h, 3 h, 6 h, 12 h, and 24 h periods is compared with the average rainfall of the peripheral stations over the corresponding period, and the station is judged abnormal when the difference in rainfall exceeds one grade; the rainfall grades are shown in the following table:
Figure QLYQS_8
3. The abnormal site screening method combining the time dimension and the space dimension according to claim 1, wherein: in step 4, a radar-assisted verification method is adopted to screen out normally reporting stations located at the boundary between rain and no-rain areas and at rain-area boundaries with large differences in rain intensity.
CN202210984314.4A 2021-11-26 2021-11-26 Abnormal site screening method combining time dimension and space dimension Active CN115453662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210984314.4A CN115453662B (en) 2021-11-26 2021-11-26 Abnormal site screening method combining time dimension and space dimension

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210984314.4A CN115453662B (en) 2021-11-26 2021-11-26 Abnormal site screening method combining time dimension and space dimension
CN202111419496.2A CN114236645B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring abnormal site screening method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202111419496.2A Division CN114236645B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring abnormal site screening method

Publications (2)

Publication Number Publication Date
CN115453662A CN115453662A (en) 2022-12-09
CN115453662B true CN115453662B (en) 2023-03-28

Family

ID=80751226

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202111419496.2A Active CN114236645B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring abnormal site screening method
CN202210984314.4A Active CN115453662B (en) 2021-11-26 2021-11-26 Abnormal site screening method combining time dimension and space dimension
CN202210984313.XA Active CN115327674B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring radar calibration method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111419496.2A Active CN114236645B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring abnormal site screening method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210984313.XA Active CN115327674B (en) 2021-11-26 2021-11-26 Large-scale rainfall monitoring radar calibration method

Country Status (1)

Country Link
CN (3) CN114236645B (en)

Also Published As

Publication number Publication date
CN114236645A (en) 2022-03-25
CN114236645B (en) 2022-07-26
CN115327674A (en) 2022-11-11
CN115453662A (en) 2022-12-09
CN115327674B (en) 2023-04-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant