CN113159141A - Remote sensing estimation method for high-resolution near-real-time PM2.5 concentration - Google Patents
- Publication number
- CN113159141A (application number CN202110362632.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- concentration
- time
- remote sensing
- satellite
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention relates to a remote sensing estimation method for high-resolution, near-real-time PM2.5 concentration. The method uses hourly aerosol optical depth (AOD) data and real-time meteorological data simulated by the Global Forecast System (GFS), and applies a random forest algorithm to generate near-real-time PM2.5 data efficiently and quickly, so that the latency of PM2.5 remote sensing monitoring is shortened from several days to several hours and the timeliness of atmospheric pollution remote sensing monitoring is greatly improved. The inversion result is fused with the station PM2.5 concentrations, which addresses missing remote sensing data and limited inversion accuracy and improves the accuracy, spatial coverage and continuous observation capability of near-surface PM2.5 remote sensing estimation.
Description
Technical Field
The invention relates to the field of application of atmospheric pollution monitoring and satellite remote sensing technologies, in particular to a high-resolution near-real-time PM2.5 concentration remote sensing estimation method.
Background
PM2.5 refers to fine particulate matter and is generally defined as particulate matter in the ambient atmosphere with an aerodynamic equivalent diameter of less than 2.5 μm. PM2.5 is the primary air pollutant in most cities in China. It is characterized by small particle size, high activity, a tendency to carry toxic and harmful substances, long residence time in the atmosphere, long transport distance and a large area of influence, and it is highly harmful to human health. PM2.5 is also an important component of atmospheric aerosol, has an important influence on many physical and chemical processes in the atmosphere, and can directly or indirectly affect climate change. At present, PM2.5 in China comes mainly from anthropogenic sources, chiefly fossil fuel combustion, industrial production, vehicle exhaust emissions and straw burning. Strengthening PM2.5 monitoring research therefore has important practical significance.
Currently, there are three main technical means of monitoring PM2.5: ground monitoring, air quality model forecasting and satellite remote sensing monitoring. Ground monitoring acquires hourly data with high accuracy and high temporal continuity, but the number of stations is limited and their distribution is uneven, so the regional distribution of atmospheric pollution is difficult to obtain. Air quality model forecasting can simulate and predict the distribution of pollutants, but its low resolution and poor accuracy make it unsuitable for monitoring small and medium-sized areas such as provinces and cities. Estimating near-surface PM2.5 concentration from the aerosol optical depth (AOD) observed by satellite remote sensing offers a large monitoring range, high accuracy and low cost, and has become an important means of PM2.5 monitoring.
In recent years, PM2.5 inversion using the Moderate Resolution Imaging Spectroradiometer (MODIS), the Multi-angle Imaging SpectroRadiometer (MISR) and similar instruments has developed considerably. However, polar-orbiting satellites pass overhead only 1-2 times a day and cannot meet the requirement of real-time monitoring of atmospheric pollution, whereas geostationary satellites (such as Japan's Himawari-8) can provide hourly continuous PM2.5 monitoring. Existing methods mostly focus on improving monthly and annual PM2.5 inversion accuracy, and operational research on producing high-resolution near-real-time PM2.5 data sets, as needed for highly timely monitoring of atmospheric pollution, is lacking. In addition, AOD retrieved by remote sensing is affected by cloud cover and similar factors, so the data may contain local gaps or be missing entirely, and the inversion accuracy fluctuates considerably with changes in particle composition and properties. How to use ground observation data to improve the accuracy and spatial continuity of PM2.5 remote sensing inversion is another important problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to provide a remote sensing estimation method for high-resolution near-real-time PM2.5 concentration that improves the timeliness of atmospheric pollution remote sensing monitoring, improves the accuracy, spatial coverage and continuous observation capability of near-surface PM2.5 remote sensing estimation, yields hourly, high-accuracy, area-wide PM2.5 monitoring data efficiently and quickly, and supports data product push services for atmospheric pollution prevention, control and monitoring.
In order to achieve the aim, the invention provides a remote sensing estimation method of high-resolution near-real-time PM2.5 concentration, which comprises the following steps:
A. automatically acquiring hourly AOD satellite data, GFS meteorological data, station PM2.5 concentration data and DEM elevation data through a Python program;
B. converting the collected multi-source data into a data set with a unified spatiotemporal scale through preprocessing comprising spatial resampling, image clipping, Kriging interpolation, format conversion, spatiotemporal matching and missing-value elimination, thereby constructing a sample data set and a near-surface PM2.5 concentration spatial interpolation result;
C. randomly dividing the spatiotemporally matched sample data set into a training data set and a test data set, wherein the training data set is used to train a random forest model and the test data set is used to verify model accuracy;
D. constructing a regression model that takes as input the hourly raster values of the AOD satellite data, GFS meteorological data and DEM elevation data together with month information and longitude and latitude information, and computing the hourly PM2.5 satellite inversion result automatically and in real time with a Python program;
E. setting weight values with a weighted-average fusion algorithm, calculating the weighted average of the hourly PM2.5 satellite inversion result and the near-surface PM2.5 spatial interpolation result, and fusing the two images to obtain a high-accuracy, spatially continuous near-real-time PM2.5 concentration data set;
F. evaluating the estimation accuracy of the near-surface PM2.5 concentration, using the coefficient of determination R² between predicted and observed values, the root mean square error (RMSE) and the slope of the line fitted to the scatter plot as the evaluation criteria of the model.
Preferably, in step B, the preprocessing includes:
setting a spatial sampling resolution, uniformly resampling the AOD satellite data, GFS meteorological data and DEM elevation data to that resolution by nearest-neighbor sampling, and then interpolating the station PM2.5 concentration data to the same resolution by Kriging interpolation to obtain the near-surface PM2.5 concentration spatial interpolation result.
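The nearest-neighbor resampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `resample_nearest`, the north-up grid layout with an upper-left origin, and the toy 2 x 2 grid are all hypothetical.

```python
import numpy as np

def resample_nearest(src, src_origin, src_res, dst_origin, dst_res, dst_shape):
    """Resample a north-up grid onto a target grid by nearest neighbour.

    src_origin / dst_origin are (x_min, y_max) of the upper-left corner;
    src_res / dst_res are pixel sizes in the same units (assumed square pixels).
    """
    rows, cols = dst_shape
    # Centre coordinates of every target pixel.
    x = dst_origin[0] + (np.arange(cols) + 0.5) * dst_res
    y = dst_origin[1] - (np.arange(rows) + 0.5) * dst_res
    # Index of the source pixel that contains each target centre.
    src_col = np.floor((x - src_origin[0]) / src_res).astype(int)
    src_row = np.floor((src_origin[1] - y) / src_res).astype(int)
    inside_c = (src_col >= 0) & (src_col < src.shape[1])
    inside_r = (src_row >= 0) & (src_row < src.shape[0])
    valid = inside_r[:, None] & inside_c[None, :]
    rr, cc = np.meshgrid(src_row, src_col, indexing="ij")
    # Target pixels outside the source extent stay NaN (treated as missing).
    out = np.full(dst_shape, np.nan)
    out[valid] = src[rr[valid], cc[valid]]
    return out

# Toy example: a 2 x 2 grid with 5-unit pixels resampled to 1-unit pixels.
coarse = np.array([[1.0, 2.0], [3.0, 4.0]])
fine = resample_nearest(coarse, (0.0, 10.0), 5.0, (0.0, 10.0), 1.0, (10, 10))
```

Each coarse pixel is simply replicated into a 5 x 5 block of fine pixels, so no new values are invented; this is why nearest-neighbor sampling is a common choice when data with different native resolutions must share one grid.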
Preferably, the spatiotemporal matching comprises:
spatial matching of the data: converting the longitude and latitude of each ground station into raster row and column numbers, extracting the corresponding raster pixel values of the AOD satellite data, GFS meteorological data and DEM elevation data according to those row and column numbers, and associating them with the PM2.5 concentration, longitude and latitude of the station to form a data set;
temporal matching of the data: arranging the station PM2.5 concentration data, AOD satellite data, GFS meteorological data and DEM elevation data in time order, treating data at the same time point as matched data, and recording the corresponding time information in the data set.
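A minimal sketch of this spatiotemporal matching is given below. The helper names `lonlat_to_rowcol` and `match_samples`, the grid parameters, the station coordinates and all values are made up for illustration; only AOD and DEM are matched here, and the meteorological rasters would be handled the same way.

```python
import numpy as np

def lonlat_to_rowcol(lon, lat, lon_min, lat_max, res):
    """Convert station longitude/latitude to row/column of a north-up grid."""
    col = int(np.floor((lon - lon_min) / res))
    row = int(np.floor((lat_max - lat) / res))
    return row, col

def match_samples(stations, station_pm25, aod_by_hour, dem, grid):
    """Build one sample per (station, hour) for which PM2.5 and AOD both exist."""
    samples = []
    for name, (lon, lat) in stations.items():
        row, col = lonlat_to_rowcol(lon, lat, *grid)
        for hour, pm25 in station_pm25[name].items():
            aod_grid = aod_by_hour.get(hour)   # temporal match: same hour only
            if aod_grid is None:
                continue
            aod = aod_grid[row, col]           # spatial match: same pixel
            if np.isnan(aod):                  # missing-value elimination
                continue
            samples.append({"station": name, "hour": hour, "lon": lon, "lat": lat,
                            "aod": float(aod), "dem": float(dem[row, col]),
                            "pm25": pm25})
    return samples

grid = (100.0, 40.0, 0.5)  # lon_min, lat_max, pixel size in degrees (hypothetical)
stations = {"S1": (100.2, 39.8), "S2": (100.8, 39.2), "S3": (100.2, 39.2)}
station_pm25 = {"S1": {"2021-01-01T08": 35.0, "2021-01-01T09": 40.0},
                "S2": {"2021-01-01T08": 55.0},
                "S3": {"2021-01-01T08": 20.0}}
aod_by_hour = {"2021-01-01T08": np.array([[0.3, 0.4], [np.nan, 0.6]])}
dem = np.array([[50.0, 60.0], [70.0, 80.0]])
samples = match_samples(stations, station_pm25, aod_by_hour, dem, grid)
```

In this toy run the 09:00 record of S1 is dropped because no AOD image exists for that hour, and S3 is dropped because its pixel is NaN, leaving two matched samples.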
Preferably, ten-fold cross-validation is applied to the spatiotemporally matched sample data set: the data set is randomly divided into 10 parts, of which 9 are used for model training and 1 for verifying model accuracy.
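Ten-fold cross-validation of a random forest regressor can be sketched with scikit-learn as below. The predictors and target are synthetic stand-ins for the matched sample set, and the choice of 50 trees is an arbitrary assumption, not a value taken from the patent.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))   # synthetic predictors (e.g. AOD, weather, DEM)
y = 50 * X[:, 0] + 10 * X[:, 1] + rng.normal(scale=1.0, size=200)  # synthetic PM2.5

# Shuffle once, then split into 10 parts: 9 for training, 1 for validation.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
fold_rmse = []
for train_idx, test_idx in kfold.split(X):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    fold_rmse.append(float(np.sqrt(np.mean((pred - y[test_idx]) ** 2))))

mean_rmse = float(np.mean(fold_rmse))
```

Every sample is used for validation exactly once, so the averaged error is less sensitive to one lucky or unlucky split than a single train/test partition.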
Preferably, in step E:
when the raster pixel values of both the PM2.5 satellite inversion result and the near-surface station interpolation result are greater than 0, the weights are set to 1/2 and 1/2, and the mean of the two values is taken as the final fused value;
when the raster pixel value of the PM2.5 satellite inversion result is missing, the weights are set to 0 and 1, and the station-interpolated pixel value is taken as the final fused value.
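The two weighting cases can be expressed compactly with NumPy masks. The sketch below assumes that missing satellite pixels are stored as NaN (or as non-positive fill values); the function name `fuse` and the sample arrays are hypothetical.

```python
import numpy as np

def fuse(satellite, interpolated):
    """Weighted-average fusion of satellite-inverted and station-interpolated PM2.5."""
    sat_ok = np.isfinite(satellite) & (satellite > 0)
    itp_ok = np.isfinite(interpolated) & (interpolated > 0)
    fused = np.full(satellite.shape, np.nan)
    # Case 1: both values valid -> weights 1/2 and 1/2 (plain mean).
    both = sat_ok & itp_ok
    fused[both] = 0.5 * satellite[both] + 0.5 * interpolated[both]
    # Case 2: satellite value missing -> weights 0 and 1 (interpolated value).
    only_itp = ~sat_ok & itp_ok
    fused[only_itp] = interpolated[only_itp]
    # Pixels with no valid interpolated value are not covered by the two
    # cases in the text and are left as NaN here.
    return fused

satellite = np.array([[40.0, np.nan], [60.0, np.nan]])
interpolated = np.array([[50.0, 30.0], [80.0, 20.0]])
fused = fuse(satellite, interpolated)
```

Because the Kriging surface covers the whole study area, the second case is what fills cloud-induced gaps in the satellite product.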
Based on the above technical solution, the invention has the following advantages:
The remote sensing estimation method for high-resolution near-real-time PM2.5 concentration uses hourly AOD data and GFS-simulated real-time meteorological data and applies a random forest algorithm to generate near-real-time PM2.5 data efficiently and quickly, so that the latency of PM2.5 remote sensing monitoring is shortened from several days to several hours and the timeliness of atmospheric pollution remote sensing monitoring is greatly improved. The inversion result is fused with the station PM2.5 concentrations, which addresses missing remote sensing data and limited inversion accuracy and improves the accuracy, spatial coverage and continuous observation capability of near-surface PM2.5 remote sensing estimation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a diagram of the steps of a high resolution near real-time PM2.5 concentration remote sensing estimation method;
FIG. 2 is a flow chart of a high resolution near real-time PM2.5 concentration remote sensing estimation method;
FIG. 3 is a result diagram of an embodiment of a high-resolution near real-time PM2.5 concentration remote sensing estimation method;
FIG. 4 is a scatter plot between predicted values and observed values for the test data set in an example embodiment;
FIG. 5 is a comparative scatter plot of hourly PM2.5 fusion data versus site PM2.5 observations in the examples.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
The invention provides a remote sensing estimation method for high-resolution near-real-time PM2.5 concentration. A preferred embodiment of the invention is shown in FIGS. 1-5.
The remote sensing estimation method of the invention takes the existing random forest machine learning algorithm as its base model. It uses hourly AOD data from a geostationary satellite (such as Japan's Himawari-8, H8), introduces simulated real-time meteorological data provided by the Global Forecast System (GFS) of the U.S. National Centers for Environmental Prediction (NCEP), and combines them with hourly ground-based PM2.5 observations from the China National Environmental Monitoring Centre and SRTM digital elevation model (DEM) data from the U.S. National Aeronautics and Space Administration (NASA). After data preprocessing and spatiotemporal matching, a sample data set is constructed, and the hourly near-surface PM2.5 concentration is estimated by building a regression model. A weighted-average fusion algorithm then fuses the satellite-inverted PM2.5 concentration with the station-interpolated PM2.5 concentration, which establishes a high-resolution near-real-time PM2.5 concentration remote sensing estimation model.
Specifically, as shown in fig. 1 and fig. 2, the remote sensing estimation method includes the following steps:
A. automatically acquiring hourly AOD satellite data, GFS meteorological data, station PM2.5 concentration data and DEM elevation data through a Python program;
B. converting the collected multi-source data into a data set with a unified spatiotemporal scale through preprocessing comprising spatial resampling, image clipping, Kriging interpolation, format conversion, spatiotemporal matching and missing-value elimination, thereby constructing a sample data set and a near-surface PM2.5 concentration spatial interpolation result;
C. randomly dividing the spatiotemporally matched sample data set into a training data set and a test data set, wherein the training data set is used to train a random forest model and the test data set is used to verify model accuracy;
D. constructing a regression model that takes as input the hourly raster values of the AOD satellite data, GFS meteorological data and DEM elevation data together with month information and longitude and latitude information, and computing the hourly PM2.5 satellite inversion result automatically and in real time with a Python program;
E. setting weight values with a weighted-average fusion algorithm, calculating the weighted average of the hourly PM2.5 satellite inversion result and the near-surface PM2.5 spatial interpolation result, and fusing the two images to obtain a high-accuracy, spatially continuous near-real-time PM2.5 concentration data set;
F. evaluating the estimation accuracy of the near-surface PM2.5 concentration, using the coefficient of determination R² between predicted and observed values, the root mean square error (RMSE) and the slope of the line fitted to the scatter plot as the evaluation criteria of the model.
The hourly AOD satellite data, the GFS meteorological data and the station PM2.5 data are downloaded and preprocessed automatically and in real time by a Python program. The matched data set comprises the preprocessed hourly AOD data, meteorological data and station PM2.5 data, the time information (hour, day, month and year), the spatial information (longitude and latitude) and the DEM elevation information. The meteorological data include 2 m air temperature, surface pressure, relative humidity, boundary layer height, visibility, and 10 m wind direction and wind speed.
Preferably, in step B, the preprocessing includes: setting a spatial sampling resolution, uniformly resampling the AOD satellite data, GFS meteorological data and DEM elevation data to that resolution by nearest-neighbor sampling, and then interpolating the station PM2.5 concentration data to the same resolution by Kriging interpolation to obtain the near-surface PM2.5 concentration spatial interpolation result.
Further, the spatiotemporal matching comprises spatial matching and temporal matching. Spatial matching converts the longitude and latitude of each ground station into raster row and column numbers, extracts the corresponding raster pixel values of the AOD satellite data, GFS meteorological data and DEM elevation data according to those row and column numbers, and associates them with the PM2.5 concentration, longitude and latitude of the station to form a data set. Temporal matching arranges the station PM2.5 concentration data, AOD satellite data, GFS meteorological data and DEM elevation data in time order, treats data at the same time point as matched data, and records the corresponding time information in the data set.
Further, the random forest model is implemented with the Python-based scikit-learn (Sklearn) machine learning library, and the hyperparameters are tuned and determined by grid search so that the accuracy on both the training data and the validation data set is the best. Preferably, ten-fold cross-validation is applied to the spatiotemporally matched sample data set: the data set is randomly divided into 10 parts, of which 9 are used for model training and 1 for verifying model accuracy.
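Grid search combined with ten-fold cross-validation might look like the following scikit-learn sketch. The parameter grid, the synthetic data and the RMSE-based scoring are illustrative assumptions; the patent does not state which hyperparameters or ranges were searched.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(1)
X = rng.uniform(size=(150, 5))   # synthetic stand-in for the matched predictors
y = 40 * X[:, 0] + 15 * X[:, 2] + rng.normal(scale=1.0, size=150)

# Hypothetical search space; real ranges depend on the data set.
param_grid = {"n_estimators": [20, 50], "max_depth": [4, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",   # higher (less negative) is better
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_   # convert back to a positive RMSE
```

After fitting, `search.best_estimator_` is the model refit on the full sample set with the selected hyperparameters, which is the object that would then be applied to the hourly rasters.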
Further, the weighted average fusion algorithm is implemented by the following steps: and setting a weight value, and calculating a weighted average of the hourly PM2.5 satellite inversion result and the site interpolation result to obtain a high-precision near-real-time PM2.5 concentration data set with high spatial continuity.
The remote sensing estimation method for high-resolution near-real-time PM2.5 concentration uses hourly AOD data and GFS-simulated real-time meteorological data and applies a random forest algorithm to generate near-real-time PM2.5 data efficiently and quickly, so that the latency of PM2.5 remote sensing monitoring is shortened from several days to several hours and the timeliness of atmospheric pollution remote sensing monitoring is greatly improved. The inversion result is fused with the station PM2.5 concentrations, which addresses missing remote sensing data and limited inversion accuracy and improves the accuracy, spatial coverage and continuous observation capability of near-surface PM2.5 remote sensing estimation.
To further illustrate the remote sensing estimation method of the high-resolution near real-time PM2.5 concentration according to the present invention, an example thereof is described below.
In the embodiment, multisource data such as hourly AOD satellite data, GFS meteorological data, station PM2.5 concentration, DEM elevation and the like are acquired fully automatically through a Python program.
The hourly AOD data come from the Advanced Himawari Imager (AHI) on Himawari-8 (H8), Japan's third-generation meteorological satellite, and are obtained from its data website at a spatial resolution of 5 km. To obtain near-real-time PM2.5 remote sensing inversion results, hourly meteorological data simulated by the Global Forecast System (GFS) of the U.S. National Centers for Environmental Prediction are used, comprising 2 m air temperature (TMP), surface pressure (PRES), the 10 m wind U component (UGRD), the 10 m wind V component (VGRD), 2 m relative humidity (RH), visibility (VIS) and boundary layer height (HPBL); these data are obtained from the data website at a spatial resolution of 0.25°. The DEM elevation data come from the Shuttle Radar Topography Mission (SRTM) of the U.S. National Aeronautics and Space Administration (NASA) and are obtained from the data website at a spatial resolution of 90 m. The near-surface PM2.5 concentration data are measured at ground stations and published by the China National Environmental Monitoring Centre.
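The description lists 10 m wind direction and wind speed among the meteorological variables, while GFS supplies the wind as U and V components. Whether and how the embodiment converts between the two is not stated; the sketch below shows only the standard meteorological conversion, with the function name `wind_from_uv` being hypothetical.

```python
import numpy as np

def wind_from_uv(u, v):
    """Convert U (eastward) and V (northward) wind components to speed and direction.

    Direction follows the meteorological convention: the direction the wind
    blows FROM, in degrees clockwise from north.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return speed, direction

# Northerly wind, westerly wind, and a wind from the north-east quadrant.
u = np.array([0.0, 5.0, -3.0])
v = np.array([-5.0, 0.0, -4.0])
speed, direction = wind_from_uv(u, v)
```

Using `arctan2` rather than a plain arctangent keeps the quadrant correct for all sign combinations of U and V.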
The collected multi-source data are converted into a data set with a unified spatiotemporal scale through preprocessing such as spatial resampling, image clipping, Kriging interpolation, format conversion, spatiotemporal matching and missing-value elimination, and a sample data set is constructed. The spatial sampling resolution is set to 1 km, and the AOD, meteorological and DEM data are uniformly resampled to 1 km resolution by nearest-neighbor sampling. The station PM2.5 concentration data are interpolated to the same resolution by Kriging interpolation to obtain the near-surface PM2.5 concentration spatial interpolation result.
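The patent does not specify the Kriging variant or variogram model. As a minimal sketch, ordinary Kriging with an assumed exponential variogram (no nugget) can be written in NumPy as follows; the sill, the correlation range, the station layout and the function names are all hypothetical.

```python
import numpy as np

def exp_variogram(h, sill=1.0, corr_range=50.0):
    """Exponential variogram without nugget (assumed model, not from the patent)."""
    return sill * (1.0 - np.exp(-h / corr_range))

def ordinary_kriging(xy, z, targets, variogram=exp_variogram):
    """Interpolate station values z at locations xy onto target points."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Kriging system: variogram matrix bordered by the unbiasedness constraint
    # (weights must sum to 1), enforced through a Lagrange multiplier.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[n, n] = 0.0
    d0 = np.linalg.norm(targets[:, None, :] - xy[None, :, :], axis=2)
    b = np.ones((n + 1, len(targets)))
    b[:n, :] = variogram(d0).T
    weights = np.linalg.solve(a, b)[:n, :]   # one column of weights per target
    return weights.T @ z, weights

# Four stations at the corners of a 10 x 10 km square (made-up values).
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
z = np.array([30.0, 50.0, 40.0, 80.0])
targets = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
pred, weights = ordinary_kriging(xy, z, targets)
```

With no nugget the interpolator reproduces the station values exactly at station locations, and at the centre of the square all four stations receive equal weight. In practice the variogram parameters would be fitted to the station data, for example with a dedicated geostatistics package.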
Spatial matching of the data converts the longitude and latitude of each ground station into raster row and column numbers, extracts the corresponding raster pixel values of the AOD, meteorological and DEM data according to those row and column numbers, and associates them with the PM2.5 concentration, longitude (Lon) and latitude (Lat) of the station in the data set. Temporal matching of the data arranges the station PM2.5 concentration, AOD data, meteorological data and DEM data in time order, treats data at the same time point as matched data, and records the corresponding time information, including year (Year), month (Month), day (Day) and hour (Hour), in the data set.
The spatiotemporally matched sample data set is randomly divided into a training data set and a test data set; the training data set is used to train the random forest model, and the test data set is used to verify model accuracy.
The PM2.5 concentration in the sample data set serves as the output parameter of the model and as the true PM2.5 value, while the AOD data, meteorological data, DEM data, time information and spatial information in the sample data set are the input parameters of the model. The random forest model is implemented with the Python-based scikit-learn (Sklearn) machine learning library, and the hyperparameters are tuned and determined by grid search; the optimal AOD-PM2.5 estimation model is obtained when the accuracy on both the training data and the validation data set is the best.
A weighted-average fusion algorithm fuses the images of the hourly PM2.5 satellite inversion result and the near-surface PM2.5 spatial interpolation result to obtain a high-accuracy, spatially continuous near-real-time PM2.5 concentration data set. The weights are set for two different cases. When the raster pixel values of both the PM2.5 satellite inversion result and the near-surface station interpolation result are greater than 0, the weights are set to 1/2 and 1/2, that is, the mean of the two values is taken as the final fused value. When the raster pixel value of the PM2.5 satellite inversion result is missing, the weights are set to 0 and 1, that is, the station-interpolated pixel value is taken as the final fused value.
The result of the embodiment of the high-resolution near-real-time PM2.5 concentration remote sensing estimation method is shown in FIG. 3. FIG. 4 is a scatter plot of predicted versus observed values for the test data set; the fit gives an R² of 0.8 and an RMSE of 14.58 μg/m³, indicating good model accuracy. FIG. 5 is a comparative scatter plot of the hourly PM2.5 fusion data against the station PM2.5 observations; the fit gives an R² of 0.87 and an RMSE of 13.24 μg/m³, showing that fusion improves the accuracy of the PM2.5 estimate.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the invention and not to limit them. Although the invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that the specific embodiments may be modified, or some technical features replaced by equivalents, without departing from the spirit of the invention, and all such modifications and substitutions are intended to fall within the scope of the technical solutions claimed by the invention.
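The three evaluation criteria (R², RMSE and the slope of the fitted line) can be computed as in the sketch below. R² is taken here as the coefficient of determination, 1 - SS_res / SS_tot, and the slope comes from regressing predicted on observed values; both choices are assumptions, since the text does not give the formulas. The sample values are made up and are not the embodiment's data.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R^2, RMSE, and slope/intercept of the predicted-vs-observed fit."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = float(1.0 - ss_res / ss_tot)
    # Least-squares line through the scatter plot: predicted = slope * observed + intercept.
    slope, intercept = np.polyfit(observed, predicted, 1)
    return r2, rmse, float(slope), float(intercept)

observed = [10.0, 20.0, 30.0, 40.0]    # made-up station PM2.5 values
predicted = [12.0, 18.0, 33.0, 37.0]   # made-up model estimates
r2, rmse, slope, intercept = evaluate(observed, predicted)
```

A slope close to 1 with a small intercept indicates that the model neither systematically underestimates high concentrations nor overestimates low ones, which R² and RMSE alone do not reveal.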
Claims (5)
1. A remote sensing estimation method for high-resolution near-real-time PM2.5 concentration is characterized by comprising the following steps: the remote sensing estimation method comprises the following steps:
A. fully automatically acquiring hourly AOD satellite data, GFS meteorological data, station PM2.5 concentration data and DEM elevation data through a Python program;
B. converting the collected multi-source data into a data set with a unified space-time scale through the preprocessing steps of spatial resampling, image clipping, kriging interpolation, format conversion, space-time matching and missing-value removal, and constructing a sample data set and a near-ground PM2.5 concentration spatial interpolation result;
C. randomly dividing a sample data set subjected to space-time matching into a training data set and a testing data set, wherein the training data set is used for training a random forest model, and the testing data set is used for verifying model precision;
D. by constructing a regression relation model, taking the grid image hour values of AOD satellite data, GFS meteorological data, DEM elevation data, month information and longitude and latitude information as input, and fully automatically calculating in real time by a Python program to obtain an hourly PM2.5 satellite inversion result;
E. setting a weight value by adopting a weighted average fusion algorithm, calculating a weighted average of hourly PM2.5 satellite inversion results and near-ground PM2.5 spatial interpolation results, and fusing images of the hourly PM2.5 satellite inversion results and the near-ground PM2.5 spatial interpolation results to obtain a high-precision near-real-time PM2.5 concentration data set with high spatial continuity;
F. using the coefficient of determination R² between predicted and observed values, the root mean square error RMSE, and the slope of the fitted scatter-plot equation as the model evaluation criteria, and calculating the estimation accuracy of the near-surface PM2.5 concentration.
2. The remote sensing estimation method according to claim 1, characterized in that: in step B, the preprocessing process includes:
setting a spatial sampling rate, uniformly resampling the AOD satellite data, GFS meteorological data and DEM elevation data by the nearest-neighbor sampling method, and then interpolating the station PM2.5 concentration data by kriging interpolation at the set spatial sampling rate to obtain the near-ground PM2.5 concentration spatial interpolation result.
3. The remote sensing estimation method according to claim 2, characterized in that: the step of spatiotemporal matching comprises:
spatial matching of the data: converting the longitude and latitude of each ground station into grid pixel row and column numbers, extracting the corresponding grid pixel values of the AOD satellite data, GFS meteorological data and DEM elevation data according to the row and column numbers, and associating them with the station's PM2.5 concentration data, longitude data and latitude data to form a data set;
temporal matching of the data: arranging the station PM2.5 concentration data, AOD satellite data, GFS meteorological data and DEM elevation data in time order, taking the data at the same time point as matched data, and recording the corresponding time information in the data set.
4. The remote sensing estimation method according to claim 1, characterized in that: the space-time-matched sample data set is evaluated by ten-fold cross-validation, in which it is randomly divided into 10 parts, 9 of which are used for model training and 1 for model accuracy validation.
5. The remote sensing estimation method according to claim 1, characterized in that: in the step D:
when grid pixel values corresponding to the PM2.5 satellite inversion result and the near-ground station interpolation result are both greater than 0, the weight values are set to be 1/2 and 1/2, and the average value of the two values is calculated to serve as final fusion data;
when the grid pixel value of the PM2.5 satellite inversion result is missing, the weight values are set to be 0 and 1, and the pixel value interpolated by the site is taken as final fusion data.
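As an illustration of the space-time matching recited in claim 3, the sketch below converts station coordinates to grid row and column numbers and pairs each station record with the grid values of the same hour. It is a sketch under stated assumptions, not the claimed implementation: the names `latlon_to_rowcol` and `match_samples`, the record layout, and the grid geometry (a north-up regular latitude/longitude grid defined by its upper-left corner `lat_max`, `lon_min` and a square `cell_size` in degrees) are all hypothetical.

```python
import math


def latlon_to_rowcol(lat, lon, lat_max, lon_min, cell_size):
    """Convert a station position to (row, col) on a north-up regular grid."""
    row = int(math.floor((lat_max - lat) / cell_size))
    col = int(math.floor((lon - lon_min) / cell_size))
    return row, col


def match_samples(station_records, hourly_grids, lat_max, lon_min, cell_size):
    """Pair each station record with the grid pixel values of the same hour.

    station_records: list of dicts with keys "time", "lat", "lon", "pm25".
    hourly_grids: dict mapping a time key to {layer_name: 2-D list of values}.
    Records with no grid for their hour, or with a missing or out-of-grid
    pixel, are dropped (missing-value removal).
    """
    samples = []
    for rec in station_records:
        layers = hourly_grids.get(rec["time"])  # temporal matching
        if layers is None:
            continue
        row, col = latlon_to_rowcol(rec["lat"], rec["lon"],
                                    lat_max, lon_min, cell_size)
        values = {}
        for name, grid in layers.items():       # spatial matching
            inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
            values[name] = grid[row][col] if inside else None
        if any(v is None for v in values.values()):
            continue
        samples.append({"time": rec["time"], "lat": rec["lat"],
                        "lon": rec["lon"], "pm25": rec["pm25"], **values})
    return samples


if __name__ == "__main__":
    grids = {"2021-04-02T08": {"aod": [[0.1, 0.2], [0.3, None]],
                               "dem": [[50, 60], [70, 80]]}}
    stations = [
        {"time": "2021-04-02T08", "lat": 39.8, "lon": 116.2, "pm25": 42.0},
        {"time": "2021-04-02T08", "lat": 39.3, "lon": 116.8, "pm25": 55.0},
        {"time": "2021-04-02T09", "lat": 39.8, "lon": 116.2, "pm25": 38.0},
    ]
    print(match_samples(stations, grids, lat_max=40.0, lon_min=116.0,
                        cell_size=0.5))
```

Each surviving record holds the station PM2.5 value, its coordinates and time, and the co-located predictor values, which is the layout a regression model such as a random forest would take as training samples.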
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110362632.2A CN113159141A (en) | 2021-04-02 | 2021-04-02 | Remote sensing estimation method for high-resolution near-real-time PM2.5 concentration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110362632.2A CN113159141A (en) | 2021-04-02 | 2021-04-02 | Remote sensing estimation method for high-resolution near-real-time PM2.5 concentration |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113159141A (en) | 2021-07-23 |
Family
ID=76886420
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110362632.2A Pending CN113159141A (en) | 2021-04-02 | 2021-04-02 | Remote sensing estimation method for high-resolution near-real-time PM2.5 concentration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113159141A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114018773A (en) * | 2021-11-03 | 2022-02-08 | 中科三清科技有限公司 | Method, device and equipment for acquiring PM2.5 concentration spatial distribution data and storage medium |
CN115112586A (en) * | 2022-07-01 | 2022-09-27 | 行星数据科技(苏州)有限公司 | Pasture methane emission estimation method under multi-source data fusion |
CN117592004A (en) * | 2024-01-19 | 2024-02-23 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite monitoring method, device, equipment and medium |
CN117592005A (en) * | 2024-01-19 | 2024-02-23 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite remote sensing estimation method, device, equipment and medium |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114018773A (en) * | 2021-11-03 | 2022-02-08 | 中科三清科技有限公司 | Method, device and equipment for acquiring PM2.5 concentration spatial distribution data and storage medium |
CN115112586A (en) * | 2022-07-01 | 2022-09-27 | 行星数据科技(苏州)有限公司 | Pasture methane emission estimation method under multi-source data fusion |
CN117592004A (en) * | 2024-01-19 | 2024-02-23 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite monitoring method, device, equipment and medium |
CN117592005A (en) * | 2024-01-19 | 2024-02-23 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite remote sensing estimation method, device, equipment and medium |
CN117592004B (en) * | 2024-01-19 | 2024-04-12 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite monitoring method, device, equipment and medium |
CN117592005B (en) * | 2024-01-19 | 2024-04-26 | 中国科学院空天信息创新研究院 | PM2.5 concentration satellite remote sensing estimation method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113159141A (en) | Remote sensing estimation method for high-resolution near-real-time PM2.5 concentration | |
CN110954482B (en) | Atmospheric pollution gridding monitoring method based on static satellite and polar orbit satellite | |
CN112905560B (en) | Air pollution prediction method based on multi-source time-space big data deep fusion | |
Mekis et al. | An overview of surface-based precipitation observations at environment and climate change Canada | |
Cersosimo et al. | TROPOMI NO2 tropospheric column data: regridding to 1 km grid-resolution and assessment of their consistency with in situ surface observations | |
Li et al. | The impact of observation nudging on simulated meteorology and ozone concentrations during DISCOVER-AQ 2013 Texas campaign | |
Liu et al. | Characteristics and performance of vertical winds as observed by the radar wind profiler network of China | |
CN111323352B (en) | Regional PM2.5 remote sensing inversion model fusing fine particulate matter concentration data | |
Masoom et al. | Solar energy estimations in india using remote sensing technologies and validation with sun photometers in urban areas | |
CN111210483B (en) | Simulated satellite cloud picture generation method based on generation of countermeasure network and numerical mode product | |
CN109407177B (en) | Machine learning and conventional meteorological observation-based fog identification system and application method | |
Randriamampianina et al. | Exploring the assimilation of IASI radiances in forecasting polar lows | |
He et al. | A review of datasets and methods for deriving spatiotemporal distributions of atmospheric CO2 | |
Rui-xia et al. | Quality assessment of FY-4A lightning data in inland China | |
Callewaert et al. | Analysis of CO 2, CH 4, and CO surface and column concentrations observed at Réunion Island by assessing WRF-Chem simulations | |
Verma et al. | Urban Air Quality Monitoring and Modelling Using Ground Monitoring, Remote Sensing, and GIS | |
CN108931797B (en) | Method for finely quantifying exposure crowd of toxic metal in flying ash in sparse area of base station | |
Bu et al. | Joint retrieval of sea surface rainfall intensity, wind speed, and wave height based on spaceborne GNSS-R: a case study of the oceans near China | |
Bellini et al. | Exploiting satellite data in the context of smart city applications | |
Rickerby et al. | Big data for innovative air-pollution assessments in the era of verifiable regulatory decisions | |
CN111881590A (en) | Spatial analysis method for concentration of atmospheric particulate matter | |
Bain et al. | The Met Office winter testbed 2020/2021: Experimenting with an on‐demand 300‐m ensemble in a real‐time environment | |
Mues et al. | Air quality in the Kathmandu Valley: WRF and WRF-Chem simulations of meteorology and black carbon concentrations | |
Saha et al. | Development of Real-time Quality Monitoring Module for ARG network over Mumbai: Results from Monsoon 2020-2021 | |
Zhao et al. | A comparative study of ground-gridded and satellite-derived formaldehyde during ozone episodes in the Chinese Greater Bay Area |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||