CN112016696B - PM1 concentration inversion method and system integrating satellite observation and ground observation - Google Patents

PM1 concentration inversion method and system integrating satellite observation and ground observation

Info

Publication number
CN112016696B
CN112016696B CN202010817931.6A CN202010817931A CN112016696B CN 112016696 B CN112016696 B CN 112016696B CN 202010817931 A CN202010817931 A CN 202010817931A CN 112016696 B CN112016696 B CN 112016696B
Authority
CN
China
Prior art keywords
model
observation
concentration
satellite
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010817931.6A
Other languages
Chinese (zh)
Other versions
CN112016696A (en)
Inventor
臧琳
黄立
毛飞跃
潘增新
卢昕
龚威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202010817931.6A
Publication of CN112016696A
Application granted
Publication of CN112016696B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 - Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/06 - Investigating concentration of particle suspensions

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Dispersion Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a PM1 concentration inversion method and system integrating satellite observation and ground observation. The method comprises: data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, using a space-time window with a preset spatial radius and a preset temporal radius, calculating the mean of each input feature within the window, and matching the means with the PM1 concentration measured at the corresponding station to form a training sample set; constructing an initial RF model, and optimizing the number of decision trees and the number of variables used in constructing each binary tree according to the change in the model's prediction residual; constructing an initial geo-RF model, including defining the spatially adjacent observation S-PM1, the forward temporally adjacent observation T-PM1 and the adjacent spatial distance constraint, and inputting the spatio-temporally adjacent observations as explanatory variables into the constructed initial RF model to obtain the geo-RF model; and performing geo-RF model training and PM1 concentration estimation.

Description

PM1 concentration inversion method and system integrating satellite observation and ground observation
Technical Field
The invention belongs to the technical field of space observation and analysis, and particularly relates to a PM1 concentration inversion method and system integrating satellite observation and ground observation.
Background
In recent decades, the Chinese economy has developed rapidly, and at the same time the emission of large quantities of pollutants, especially particulate matter, has resulted in serious air pollution. Research has shown that aerosol particles have significant environmental, climatic and health effects. First, aerosols can directly scatter or absorb sunlight, thereby reducing atmospheric visibility. Because the particle size of PM1 is closer to the dominant wavelengths of sunlight, PM1 attenuates sunlight more strongly. Second, when water vapor saturation reaches a certain level, aerosols can be activated into cloud condensation nuclei, changing the radiative properties of clouds and indirectly influencing the global radiation balance. In addition, prolonged exposure to fine particulate matter in the atmosphere may cause asthma, hypertension and even lung cancer. Compared with larger particles, PM1 can stay in the air longer and, owing to its small particle size, can penetrate deep into the human body, carrying harmful substances to various parts of the body through the blood circulation and thus doing greater harm. In view of this, PM1 has attracted increasing attention.
PM1 observation with high accuracy and high spatio-temporal resolution is crucial for in-depth study of the spatio-temporal distribution of aerosols and their health effects. At present, urban air quality monitoring in China relies mainly on ground-based observation, but the number of stations is limited and their distribution is severely uneven. Taking the PM1 monitoring network of the China Meteorological Administration as an example, there are only 73 PM1 monitoring stations nationwide, distributed mainly in eastern China. It is therefore difficult to characterize the temporal variation of particulate matter over a wide area by relying solely on ground-based observations. Satellite remote sensing offers wide coverage and fast imaging, and can effectively make up for the shortcomings of ground-based observation. Developing PM1 concentration inversion methods based on satellites, particularly geostationary satellites, is therefore especially necessary.
The theoretical basis for satellite-based inversion of particulate matter concentration is the significant correlation between particulate matter concentration and the aerosol optical thickness (AOD) observed by the satellite. The AOD data currently used for inversion mostly come from polar-orbiting satellites, whose temporal resolution is relatively low (1 day or coarser), so the evolution of particulate matter cannot be tracked effectively. Inversion models can be roughly classified into three categories: physical models, statistical models and machine learning models. Their results are not ideal because of the limitations of the data and of the algorithms themselves. Specifically:
The basic principle of a physical model is to parameterize known or assumed physical mechanisms and establish a relationship between particulate matter concentration and the explanatory variables. The theoretical foundation of such a model is relatively solid, but its data requirements are high, the data are difficult to acquire, and assumed parameters derived from small samples limit the model's generality.
A statistical model describes the linear relationship between the dependent variable and the independent variables. The process is simple and flexible, but the model cannot capture the complex nonlinear relationships between variables, which limits its accuracy.
A machine learning model has self-organizing and self-learning capabilities and clear advantages in handling nonlinear problems, so such algorithms are gradually being applied to particulate matter concentration inversion. However, model accuracy depends heavily on the samples, and the accuracy of large-scale inversion still needs further improvement.
More importantly, compared with PM2.5 and PM10, current research on PM1 is much scarcer and is mostly regional.
Interpretation of terms:
Aerosol: a colloidal dispersion system, also called a gas dispersion system, formed by small solid or liquid particles dispersed and suspended in a gaseous medium; the particle size is 1-100 nanometers.
PM1: particles in the atmosphere with an aerodynamic diameter less than or equal to 1 micron.
Aerosol optical thickness (AOD): the integral of the extinction coefficient of the aerosol in the vertical direction can be used to evaluate the attenuation of light by the aerosol.
Random Forest (RF): a machine learning model that utilizes multiple decision trees to train and predict samples.
Disclosure of Invention
Based on the above analysis, the present invention aims to solve the problems of low temporal resolution and low accuracy in large-scale PM1 concentration inversion, and to establish a more accurate and more widely applicable PM1 inversion scheme.
The technical scheme of the invention provides a PM1 concentration inversion method fusing satellite and ground-based observation, which comprises the following steps,
step 1, data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, using a space-time window with a preset spatial radius and a preset temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the PM1 concentration measured at the corresponding station to form a training sample set;
step 2, constructing an initial RF model, wherein the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the time of day, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees, ntree, and the number of variables used in constructing each binary tree, mtree, are optimized according to the change in the model's prediction residual;
step 3, initial geo-RF model construction, including defining the spatially adjacent observation S-PM1, the forward temporally adjacent observation T-PM1 and the adjacent spatial distance constraint DIS as follows,
[Equations (1)-(3): formula images GDA0003801215620000021 to GDA0003801215620000023, defining S-PM1, T-PM1 and DIS]
where,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the target point at the j-th nearest measurement time before the current time, wt_j is the temporal weight of the PM1 measured at that time, and dt_j is the temporal distance between the current time at the target point and that j-th preceding measurement time;
i = 1, 2, …, n, where n is the number of nearest neighbor stations; j = 1, 2, …, m, where m is the number of hours of forward adjacent observation;
S-PM1, T-PM1 and DIS, as spatio-temporally adjacent observations, are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
step 4, geo-RF model training and PM1 concentration estimation, carried out as follows,
S-PM1, T-PM1 and DIS are calculated according to equations (1)-(3) to obtain the best adjacent-observation input features of geo-RF, which are combined with the satellite AOD data and the related meteorological and geographic parameters to construct a complete training sample set; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and finally large-scale PM1 concentration inversion is realized based on the trained model.
Furthermore, the satellite AOD data come from the Himawari-8 L3 hourly dataset; only observation samples whose confidence is marked "very good" are used, and the spatial resolution is 0.05°.
Moreover, ntree and mtree are optimized by first setting an initial value for each and then adjusting them continuously; the model structure is fixed once the change in the model's prediction residual becomes relatively flat.
Furthermore, in step 3, the best proximity observations are determined, including n, the number of spatially nearest neighbor sites and m, the number of forward proximity observations, for constraining the geo-RF model input features and improving the model computation efficiency.
Furthermore, by calculating the correlation between the predicted values of the model and the observed values for all samples, the model performance is evaluated, and the best adjacent observation is determined.
Moreover, the related meteorological and geographic parameters comprise near-surface temperature TEMP, near-surface pressure SP, relative humidity RH, horizontal wind speed, boundary layer height BLH, normalized vegetation index NDVI and surface elevation DEM; the horizontal wind speed comprises the zonal wind uw and the meridional wind vw;
when the initial RF model is constructed, the input feature parameters of the model comprise AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon and Lat;
the spatio-temporally adjacent observations are also input as explanatory variables into the constructed initial RF model, and the resulting geo-RF model is expressed as follows,
PM1 = f(AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon, Lat, S-PM1, T-PM1, DIS)   (4)
where f () is the corresponding function representation that expresses the geo-RF model.
The invention also provides a PM1 concentration inversion system fusing satellite and ground-based observation, which is used for implementing the above PM1 concentration inversion method fusing satellite and ground-based observation.
Moreover, the system comprises the following modules,
a first module for data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, using a space-time window with a preset spatial radius and a preset temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the PM1 concentration measured at the corresponding station to form a training sample set;
a second module for constructing an initial RF model, wherein the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the time of day, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees, ntree, and the number of variables used in constructing each binary tree, mtree, are optimized according to the change in the model's prediction residual;
a third module for initial geo-RF model construction, including defining the spatially adjacent observation S-PM1, the forward temporally adjacent observation T-PM1 and the adjacent spatial distance constraint DIS as follows,
[Equations (1)-(3): formula images GDA0003801215620000041 to GDA0003801215620000043, defining S-PM1, T-PM1 and DIS]
where,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the target point at the j-th nearest measurement time before the current time, wt_j is the temporal weight of the PM1 measured at that time, and dt_j is the temporal distance between the current time at the target point and that j-th preceding measurement time;
i = 1, 2, …, n, where n is the number of nearest neighbor stations; j = 1, 2, …, m, where m is the number of hours of forward adjacent observation;
S-PM1, T-PM1 and DIS, as spatio-temporally adjacent observations, are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, implemented as follows,
S-PM1, T-PM1 and DIS are calculated according to equations (1)-(3) to obtain the best adjacent-observation input features of geo-RF, which are combined with the satellite AOD data and the related meteorological and geographic parameters to construct a complete training sample set; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and finally large-scale PM1 concentration inversion is realized based on the trained model.
Alternatively, the system comprises a processor and a memory, wherein the memory is used for storing program instructions, and the processor is used for calling the instructions stored in the memory to execute the above PM1 concentration inversion method fusing satellite and ground-based observation.
Alternatively, the system comprises a readable storage medium, wherein a computer program is stored on the readable storage medium, and when the computer program is executed, the PM1 concentration inversion method for fusing satellite observation and ground observation is realized.
Based on the random forest (RF) machine learning model, the invention combines high-temporal-resolution Himawari-8 AOD products with meteorological and geographic auxiliary data while taking into account the spatio-temporal autocorrelation of PM1 concentration, fuses satellite and ground-based observations, and constructs a geo-RF model. The scheme shows strong stability across different times of day, seasons and regions, and can realize high-accuracy inversion of hourly PM1 concentration over large areas.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is a schematic diagram of the correlation between model predicted values and measured values according to an embodiment of the present invention, wherein FIG. 2(a) shows the correlation R² between model predicted and measured values obtained at different spatial distances, and FIG. 2(b) shows the correlation R² between model predicted and measured values obtained at different temporal distances.
Detailed Description
In order to more clearly understand the present invention, the technical solutions of the present invention are specifically described below with reference to the accompanying drawings and examples.
The present invention notes that Himawari-8 is a second-generation geostationary meteorological satellite launched by the Japan Meteorological Agency (JMA); it was successfully launched in October 2014 and formally entered service in July 2015. The main sensor carried by the satellite is the Advanced Himawari Imager (AHI), a multispectral imager with 3 visible channels, 3 near-infrared channels and 10 infrared channels. The spatial resolution of the visible channels is 500 m, and that of the other channels is 1000 m to 2000 m. Himawari-8 is positioned at 140.7°E, its observation range covers East Asia and the Western Pacific (80°E-200°E, 60°S-60°N), and it completes a full-disk observation every ten minutes. To date, Himawari-8 has released two AOD datasets (L2 and L3), both with a spatial resolution of 0.05°. The temporal resolution of L2 is 10 minutes, and L3 is available at temporal resolutions of 1 hour, 1 day and 1 month. Data confidence is divided into four levels: "very good", "good", "general" and "untrusted". Compared with polar-orbiting satellites, Himawari-8 AOD has a high temporal resolution and can be used for hourly PM1 concentration estimation, enabling large-scale dynamic monitoring of fine particulate pollution and providing data support for urban air pollution monitoring and management and for the formulation of air pollution response policies.
Based on the random forest (RF) machine learning method, the invention fuses satellite and ground-based observations to construct a geo-RF model for high-accuracy inversion of hourly PM1 concentration.
In the embodiment, based on a machine learning model, namely the random forest model, the high-temporal-resolution Himawari-8 AOD product is combined with auxiliary data such as meteorological and geographic data, the spatio-temporal autocorrelation of PM1 concentration is taken into account, and satellite and ground-based observations are fused into the model training data that are input into the random forest model. A random forest consists of multiple decision trees: datasets are constructed by random sampling with replacement and input into different decision trees for training, each decision tree yields its own result, and the average of all results is the model prediction. The processes of data matching, spatio-temporal autocorrelation information calculation and random forest training are integrated to construct the geo-RF model and realize hourly PM1 concentration inversion over eastern China.
Referring to FIG. 1, the embodiment provides a PM1 concentration inversion method fusing satellite and ground-based observation, comprising the following steps:
(1) Data acquisition and matching
The ground-based PM1 data used in this embodiment come from the hourly mass concentration observations provided by the aerosol observation network of the China Meteorological Administration. The satellite AOD data come from the Himawari-8 L3 hourly dataset; only observation samples whose confidence is marked "very good" are used, and the spatial resolution is 0.05°. Since the diffusion and accumulation of pollutants are controlled mainly by meteorological conditions, and the underlying surface and topography also have some influence on this process, the embodiment considers, in addition to AOD, a series of related meteorological and geographic parameters when estimating PM1, including near-surface temperature (TEMP, unit: K), near-surface pressure (SP, unit: Pa), relative humidity (RH, unit: %), horizontal wind speed (uw: zonal wind, vw: meridional wind, unit: m/s), boundary layer height (BLH, unit: m), normalized vegetation index (NDVI, dimensionless) and surface elevation (DEM, digital elevation model, unit: m).
The meteorological parameters come from the European Centre for Medium-Range Weather Forecasts (ECMWF) at a spatial resolution of 0.125°; the normalized vegetation index (NDVI) comes from MODIS satellite observations at a spatial resolution of 1 km; the surface elevation (DEM) data were obtained from the United States Geological Survey at a resolution of 90 m.
Because the spatial resolutions of the variables are inconsistent, the meteorological and geographic data are first resampled with the Himawari-8 AOD spatial resolution as reference. Then, centered on each PM1 observation station, a space-time window with a spatial radius of 0.05° and a temporal radius of 30 min is used, the mean of each input feature within this fixed space-time window is calculated, and the series of means is matched with the PM1 concentration measured at the corresponding station to form a training sample set.
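By way of illustration only, the window matching described above may be sketched as follows. The record layout (longitude, latitude, minutes, value), the function names and the use of a square window in longitude and latitude are assumptions made for this sketch; the text fixes only the 0.05° spatial radius and the 30 min temporal radius.

```python
from statistics import mean

def window_mean(grid_obs, station, space_radius_deg=0.05, time_radius_min=30):
    """Average one gridded feature inside a space-time window around a station record.

    grid_obs: iterable of (lon, lat, minutes, value) tuples (hypothetical layout).
    station:  (lon, lat, minutes) of the ground-based PM1 measurement.
    Returns None when no valid observation falls inside the window.
    """
    s_lon, s_lat, s_t = station
    inside = [
        value
        for lon, lat, t, value in grid_obs
        if value is not None
        and abs(lon - s_lon) <= space_radius_deg   # square window: an assumption
        and abs(lat - s_lat) <= space_radius_deg
        and abs(t - s_t) <= time_radius_min
    ]
    return mean(inside) if inside else None

def build_sample(feature_grids, station, pm1_value):
    """Pair the window means of every feature with the measured PM1 value.

    feature_grids: dict mapping a feature name (e.g. "AOD") to its grid_obs list.
    Returns None if any feature has no valid data in the window.
    """
    row = {name: window_mean(obs, station) for name, obs in feature_grids.items()}
    if any(v is None for v in row.values()):
        return None
    row["PM1"] = pm1_value
    return row
```

Dropping a sample when any feature is missing is one simple design choice; the source does not state how gaps are handled.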
(2) Initial RF model building
A random forest (RF) model combines multiple decision trees (e.g. decision trees 1 to L in FIG. 1) and takes the average of the results of all decision trees as the final model output. The two keys to constructing a random forest are the number of decision trees, ntree, and the structure of each decision tree. The former can be determined empirically, and the latter depends on the number of variables, mtree, used in constructing each binary tree. Both can be optimized through parameter tuning.
Specifically, suppose there is a dataset D = {x_i1, x_i2, …, x_iP, y_i} (i ∈ [1, Q]) with Q samples and P features, where x_i1, x_i2, …, x_iP are the P independent variables, i.e. the P input feature parameters of the model, and y_i is the model output. First, the feature number mtree, used to determine the decision at each node of a decision tree, is specified; mtree should be much smaller than P. Then, Q samples are drawn from the Q training samples by sampling with replacement; the drawn samples form the training set, and the samples that were not drawn are used to estimate the prediction error. For each node, mtree features are randomly selected, and the decision at that node is determined from these features: the optimal split is calculated over the mtree features, and each decision tree is built in this way.
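The two random ingredients described above, sampling with replacement and per-node feature selection, can be sketched in a few lines. This is a minimal illustration and not the embodiment's implementation; the function names are hypothetical, and the tree-growing step itself is omitted.

```python
import random

def bootstrap_split(q, rng):
    """Draw q sample indices with replacement; indices never drawn are out-of-bag."""
    in_bag = [rng.randrange(q) for _ in range(q)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(q) if i not in drawn]
    return in_bag, out_of_bag

def candidate_features(p, mtree, rng):
    """Pick mtree distinct feature indices out of p for one node split."""
    return rng.sample(range(p), mtree)

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(1000, rng)  # Q = 1000 is an arbitrary example size
features = candidate_features(13, 4, rng) # P = 13 and mtree = 4 as in the embodiment
```

For large Q, a fraction of about 1/e (roughly 37%) of the samples is expected to be left out of each bootstrap draw, which is why the undrawn samples give a usable error estimate without a separate validation set.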
Corresponding to the above description, the initial RF model in this embodiment is constructed as follows: the model has 13 input feature parameters, i.e. P = 13, comprising AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour (time of day), Month (month), Lon (longitude) and Lat (latitude); the output parameter of the model is the PM1 concentration, i.e. y is the PM1 concentration. Based on experience, the initial values of ntree and mtree are set to 50 and 2 respectively, and the two parameters are adjusted continuously; the model structure is fixed once the model's prediction residual changes relatively gently. This embodiment finally sets ntree = 100 and mtree = 4.
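The source does not quantify when the residual "changes relatively gently". One possible way to make the stopping rule concrete, offered only as an assumption, is to stop at the first parameter value beyond which the relative improvement in prediction error falls below a tolerance. The function name and the default tolerance below are hypothetical.

```python
def first_flat_setting(settings, errors, rel_tol=0.01):
    """Return the setting after which the error stops improving noticeably.

    settings: candidate parameter values in increasing order (e.g. ntree values).
    errors:   prediction error measured for each setting, same order.
    rel_tol:  assumed threshold; an improvement smaller than rel_tol * previous
              error is treated as "relatively gentle".
    """
    if len(settings) != len(errors) or not settings:
        raise ValueError("settings and errors must be non-empty and equally long")
    for k in range(1, len(settings)):
        previous, current = errors[k - 1], errors[k]
        if previous - current < rel_tol * previous:
            return settings[k - 1]
    return settings[-1]  # the error was still improving at the last candidate
```

The same helper could be applied first to ntree with mtree held fixed and then to mtree, which mirrors the one-parameter-at-a-time adjustment implied by the text.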
(3) geo-RF model construction
PM1 is autocorrelated, i.e. observations that are adjacent in space or time are significantly correlated. In theory, therefore, integrating spatio-temporally adjacent ground-based observation information into the model helps improve the accuracy of the PM1 concentration estimate at the target point. To express the spatio-temporally adjacent observations, this patent defines the spatially adjacent observation (S-PM1, unit: μg/m³), the forward temporally adjacent observation (T-PM1, unit: μg/m³) and the adjacent spatial distance constraint (DIS), calculated as follows:
[Equations (1)-(3): formula images GDA0003801215620000071 to GDA0003801215620000073, defining S-PM1, T-PM1 and DIS]
where,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station, described here by the Euclidean distance;
PM1,j is the PM1 concentration measured at the j-th nearest time before the current time at the target point, wt_j is the temporal weight of the PM1 measured at that time at the station, and dt_j is the temporal distance between the current time and the j-th measurement time at the station; i = 1, 2, …, n, where n is the number of nearest neighbor stations, and j = 1, 2, …, m, where m is the number of hours of forward adjacent observation.
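The formula images for equations (1)-(3) are not reproduced in this text, so the exact weight functions cannot be recovered here. The sketch below is therefore an assumption throughout: it treats S-PM1 and T-PM1 as normalized weighted means with inverse-squared-distance weights (a common choice for this kind of feature) and uses the smallest inverse station distance as a stand-in for DIS. All function names are hypothetical.

```python
def weighted_mean(values, distances, power=2):
    """Normalized inverse-distance weighted mean: sum(w * v) / sum(w).

    The weight form w = 1 / d**power is an assumption, not taken from the source.
    Distances must be strictly positive.
    """
    weights = [1.0 / d ** power for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def proximity_features(station_pm1, station_dist, past_pm1, past_lag_hours):
    """Compute (S-PM1, T-PM1, DIS) for one target point.

    station_pm1 / station_dist:  PM1 and distance ds_i for the n nearest stations.
    past_pm1 / past_lag_hours:   PM1 and time lag dt_j for the m preceding hours.
    """
    s_pm1 = weighted_mean(station_pm1, station_dist)
    t_pm1 = weighted_mean(past_pm1, past_lag_hours)
    dis = min(1.0 / d for d in station_dist)  # placeholder form for DIS (assumed)
    return s_pm1, t_pm1, dis

# Two neighboring stations at distances 1 and 2, and two preceding hours.
s_pm1, t_pm1, dis = proximity_features([10.0, 20.0], [1.0, 2.0], [30.0, 50.0], [1.0, 2.0])
```

With these assumed weights, the nearer station and the more recent hour dominate the averages, which matches the autocorrelation argument in the text; a different weight form would only require changing `weighted_mean`.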
The patent inputs spatio-temporal proximity observations as explanatory variables into the initial RF model constructed, and defines the model obtained at this time as a geo-RF model. The geo-RF model can be expressed as follows:
PM1 = f(AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon, Lat, S-PM1, T-PM1, DIS)   (4)
where f () is the corresponding function expression that expresses the geo-RF model.
According to the first law of geography, nearby objects are more closely related than distant ones. Once the distance reaches a certain level, introducing further adjacent observations no longer improves the estimate at the target point and may even degrade the estimation accuracy. The best adjacent observations (i.e. the number n of spatially nearest stations and the number m of forward adjacent observations) therefore need to be determined, both to constrain the input features of the geo-RF model and to improve its computational efficiency.
To determine the best adjacent observations (i.e. n and m), this embodiment compares the estimation accuracy of 10-fold cross-validation of the model for different values of n and m. First, values of n and m are fixed, e.g. n = 1, m = 1. The sample set is then randomly divided into ten subsets; in turn, 9 subsets are used for training and the remaining 1 for validation, and this is repeated 10 times until every sample has been predicted once and only once. Finally, the correlation between the model predicted values and the observed values of all samples, namely R², is calculated to evaluate model performance, as shown in FIG. 2. FIG. 2(a) shows the correlation between model predicted and measured values obtained at different spatial distances; the correlation between model predictions and station observations is relatively high at n = 8 (R² = 0.617), and, taking computational complexity into account, n is set to 8 in this patent. FIG. 2(b) shows the correlation between model predicted and measured values obtained at different temporal distances; as m increases, R² fluctuates slightly but generally decreases, so m is set to 2 in this patent, at which the model performs relatively well (R² = 0.611).
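The cross-validation procedure above can be sketched without any particular model library. In this illustration the model is passed in as a callable, `fit_predict(train_x, train_y, test_x)`, which is a hypothetical interface; R² is computed as the squared Pearson correlation, following the text's description of it as the correlation between predicted and observed values.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def cross_val_predict(xs, ys, fit_predict, k=10, seed=0):
    """Predict every sample exactly once, each time from a model trained on the other folds."""
    preds = [None] * len(ys)
    for held_out in k_fold_indices(len(ys), k, seed):
        held = set(held_out)
        train = [i for i in range(len(ys)) if i not in held]
        out = fit_predict([xs[i] for i in train],
                          [ys[i] for i in train],
                          [xs[i] for i in held_out])
        for i, p in zip(held_out, out):
            preds[i] = p
    return preds

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)  # undefined if either series is constant
```

To choose n and m, the proximity features would be recomputed for each candidate pair, `cross_val_predict` run with the geo-RF model as `fit_predict`, and the resulting `r_squared` values compared.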
(4) geo-RF model training with PM 1 Concentration estimation
Calculating S-PM according to equations (1) - (3) 1 、T-PM 1 And DIS (n =8,m = 2), obtaining the best adjacent observation input features of geo-RF, and combining with AOD, weather, geography and other parameters to construct a complete training sample set. Then all training samples are input into the model to be trained, the trained model is obtained according to the principle of minimum residual error, and the process can adopt computer softwareThe technology realizes automatic operation. Finally, large-range PM can be realized based on the trained model 1 And (3) performing concentration inversion, specifically respectively matching parameters such as AOD, weather, geography and the like according to the step (1) and the step (3), and calculating and matching S-PM 1 、T-PM 1 And DIS to form an inversion data set, which is then input into a trained geo-RF model to achieve a wide range of PM 1 And (4) concentration inversion.
Results of a study using eastern China (100°E-130°E, 20°N-44°N) as an example study area show that: (1) compared with other inversion models, the method proposed in this patent has relatively high accuracy (geo-RF vs. LME-BT vs. GTWR vs. GAM: R² = 0.83 vs. 0.80 vs. 0.74 vs. 0.59); (2) during the daytime (hours 9 to 16), the model R² ranges from 0.68 to 0.87, with higher accuracy around noon; (3) over the year, the model R² ranges from 0.64 to 0.86, with better performance in winter than in summer, and the difference between the monthly mean estimates and the measured values is about 1 μg/m³; (4) of the 66 stations in the study area, about 80% have R² greater than 0.6.
In specific implementation, a person skilled in the art can implement the automatic operation process by using a computer software technology, and a system device for implementing the method, such as a computer readable storage medium storing a corresponding computer program according to the technical solution of the present invention and a computer device including the corresponding computer program, should also be within the scope of the present invention.
In some possible embodiments, a PM1 concentration inversion system fusing satellite and ground observation is provided, including the following modules,
a first module for data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling them with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the PM1 concentration measured at the corresponding station to form a training sample set;
a second module for constructing an initial RF model, wherein the input feature parameters of the model include the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees, ntree, and the number of variables used in constructing each binary tree, mtree, are optimized according to the variation of the model's prediction residual;
a third module for constructing the initial geo-RF model, including defining the spatial neighboring observation S-PM1, the forward temporal neighboring observation T-PM1 and the neighboring spatial distance constraint DIS as follows,
Figure GDA0003801215620000081
Figure GDA0003801215620000082
Figure GDA0003801215620000091
where
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the target point at the j-th nearest measurement time before the current time, wt_j is the temporal weight of the PM1 measured at that time, and dt_j is the time interval between the current time at the target point and the j-th nearest measurement time;
i = 1, 2, …, n, where n is the number of nearest neighboring stations; j = 1, 2, …, m, where m is the number of forward neighboring observation hours;
the spatiotemporal neighboring observations S-PM1, T-PM1 and DIS are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, implemented as follows,
S-PM1, T-PM1 and DIS are calculated according to equations (1)-(3) to obtain the best neighboring-observation input features of geo-RF, and a complete training sample set is constructed by combining them with the satellite AOD data and the related meteorological and geographic parameters; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
The implementation of each module may refer to the corresponding steps in the method embodiments, which are not repeated herein.
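The neighboring-observation features handled by the third module can be illustrated with the Python sketch below. The formula images for equations (1)-(3) are not reproduced in this text, so the exact weights are not known here: the sketch assumes inverse-square-distance weights (ws_i = 1/ds_i², wt_j = 1/dt_j²) with normalized weighted means for S-PM1 and T-PM1, and takes DIS as the mean distance to the n nearest stations. All three choices, and the function names weighted_neighbor_mean and spatiotemporal_features, are hypothetical rather than the patent's definitions.

```python
def weighted_neighbor_mean(values, distances, power=2.0):
    """Weighted mean with weights 1 / distance**power (distances must be > 0)."""
    weights = [1.0 / (d ** power) for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def spatiotemporal_features(station_obs, past_obs, n=8, m=2):
    """Return (S-PM1, T-PM1, DIS) for one target point.

    station_obs: list of (ds_i, PM1) pairs, distance to each other station.
    past_obs:    list of (dt_j, PM1) pairs, hours before the current time.
    """
    nearest = sorted(station_obs)[:n]   # n spatially nearest stations
    recent = sorted(past_obs)[:m]       # m most recent earlier measurements
    s_pm1 = weighted_neighbor_mean([pm for _, pm in nearest],
                                   [ds for ds, _ in nearest])
    t_pm1 = weighted_neighbor_mean([pm for _, pm in recent],
                                   [dt for dt, _ in recent])
    # Assumed form of DIS: mean distance to the selected stations.
    dis = sum(ds for ds, _ in nearest) / len(nearest)
    return s_pm1, t_pm1, dis
```

With n = 8 and m = 2 as chosen in this patent, each target point contributes three extra explanatory variables on top of the AOD, meteorological and geographic inputs.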
In some possible embodiments, a PM1 concentration inversion system fusing satellite and ground observation is provided, including a processor and a memory, the memory storing program instructions, and the processor being configured to call the instructions stored in the memory to execute the PM1 concentration inversion method fusing satellite and ground observation described above.
In some possible embodiments, a system for PM1 concentration inversion fusing satellite observation and ground observation is provided, which includes a readable storage medium, on which a computer program is stored, and when the computer program is executed, the method for PM1 concentration inversion fusing satellite observation and ground observation is implemented.
It should be understood that the above embodiments are described in some detail and are not intended to limit the scope of the invention; those skilled in the art may make alterations and modifications without departing from the scope of the invention as defined by the appended claims.

Claims (10)

1. A PM1 concentration inversion method fusing satellite and ground-based observation, characterized by comprising the following steps,
step 1, data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling them with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the PM1 concentration measured at the corresponding station to form a training sample set;
step 2, constructing an initial RF model, wherein the input feature parameters of the model include the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees, ntree, and the number of variables used in constructing each binary tree, mtree, are optimized according to the variation of the model's prediction residual;
step 3, constructing the initial geo-RF model, including defining the spatial neighboring observation S-PM1, the forward temporal neighboring observation T-PM1 and the neighboring spatial distance constraint DIS as follows,
Figure FDA0003801215610000011
Figure FDA0003801215610000012
Figure FDA0003801215610000013
where
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the target point at the j-th nearest measurement time before the current time, wt_j is the temporal weight of the PM1 measured at that time, and dt_j is the time interval between the current time at the target point and the j-th nearest measurement time;
i = 1, 2, …, n, where n is the number of nearest neighboring stations; j = 1, 2, …, m, where m is the number of forward neighboring observation hours;
the spatiotemporal neighboring observations S-PM1, T-PM1 and DIS are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
step 4, geo-RF model training and PM1 concentration estimation, implemented as follows,
S-PM1, T-PM1 and DIS are calculated according to equations (1)-(3) to obtain the best neighboring-observation input features of geo-RF, and a complete training sample set is constructed by combining them with the satellite AOD data and the related meteorological and geographic parameters; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
2. The PM1 concentration inversion method fusing satellite and ground-based observation according to claim 1, characterized in that: the satellite AOD data come from the Himawari-8 L3 hourly dataset, observation samples whose confidence is flagged as "very good" are selected, and the spatial resolution is 0.05°.
3. The PM1 concentration inversion method fusing satellite and ground-based observation according to claim 1, characterized in that: ntree and mtree are optimized by first setting an initial value for each and then adjusting them continuously until the prediction residual of the model varies relatively smoothly, at which point the model structure is determined.
4. The PM1 concentration inversion method fusing satellite and ground-based observation according to claim 1, characterized in that: in step 3, the best neighboring observations are determined, including the number n of nearest spatial neighboring stations and the number m of forward neighboring observations, so as to constrain the input features of the geo-RF model and improve its computational efficiency.
5. The PM1 concentration inversion method fusing satellite and ground-based observation according to claim 4, characterized in that: model performance is evaluated by calculating the correlation between the model predictions and the observations of all samples, and the best neighboring observations are determined accordingly.
6. The PM1 concentration inversion method fusing satellite and ground-based observation according to claim 1, 2, 3, 4 or 5, characterized in that: the related meteorological and geographic parameters include near-surface temperature TEMP, near-surface pressure SP, relative humidity RH, horizontal wind speed, boundary layer height BLH, normalized difference vegetation index NDVI and surface elevation DEM; the horizontal wind speed includes the zonal wind uw and the meridional wind vw;
when the initial RF model is constructed, the input feature parameters of the model include AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon and Lat;
the spatiotemporal neighboring observations are also input as explanatory variables into the constructed initial RF model, and the resulting geo-RF model is expressed as follows,
PM1 = f(AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon, Lat, S-PM1, T-PM1, DIS) (4)
where f() is the functional representation of the geo-RF model.
7. A PM1 concentration inversion system fusing satellite and ground-based observation, characterized in that: it is used for implementing the PM1 concentration inversion method fusing satellite and ground-based observation according to any one of claims 1 to 6.
8. The PM1 concentration inversion system fusing satellite and ground-based observation according to claim 7, characterized by comprising the following modules,
a first module for data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling them with the spatial resolution of the satellite AOD data as reference; then, centered on each PM1 observation station, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the PM1 concentration measured at the corresponding station to form a training sample set;
a second module for constructing an initial RF model, wherein the input feature parameters of the model include the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees, ntree, and the number of variables used in constructing each binary tree, mtree, are optimized according to the variation of the model's prediction residual;
a third module for constructing the initial geo-RF model, including defining the spatial neighboring observation S-PM1, the forward temporal neighboring observation T-PM1 and the neighboring spatial distance constraint DIS as follows,
Figure FDA0003801215610000031
Figure FDA0003801215610000032
Figure FDA0003801215610000033
where
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the target point at the j-th nearest measurement time before the current time, wt_j is the temporal weight of the PM1 measured at that time, and dt_j is the time interval between the current time at the target point and the j-th nearest measurement time;
i = 1, 2, …, n, where n is the number of nearest neighboring stations; j = 1, 2, …, m, where m is the number of forward neighboring observation hours;
the spatiotemporal neighboring observations S-PM1, T-PM1 and DIS are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, implemented as follows,
S-PM1, T-PM1 and DIS are calculated according to equations (1)-(3) to obtain the best neighboring-observation input features of geo-RF, and a complete training sample set is constructed by combining them with the satellite AOD data and the related meteorological and geographic parameters; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
9. The PM1 concentration inversion system fusing satellite and ground-based observation according to claim 7, characterized by comprising a processor and a memory, the memory being used for storing program instructions, and the processor being used for calling the instructions stored in the memory to execute the PM1 concentration inversion method fusing satellite and ground-based observation according to any one of claims 1 to 6.
10. The PM1 concentration inversion system fusing satellite and ground-based observation according to claim 7, characterized by comprising a readable storage medium on which a computer program is stored, the computer program, when executed, implementing the PM1 concentration inversion method fusing satellite and ground-based observation according to any one of claims 1 to 6.
CN202010817931.6A 2020-08-14 2020-08-14 PM1 concentration inversion method and system integrating satellite observation and ground observation Active CN112016696B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010817931.6A CN112016696B (en) 2020-08-14 2020-08-14 PM1 concentration inversion method and system integrating satellite observation and ground observation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010817931.6A CN112016696B (en) 2020-08-14 2020-08-14 PM1 concentration inversion method and system integrating satellite observation and ground observation

Publications (2)

Publication Number Publication Date
CN112016696A CN112016696A (en) 2020-12-01
CN112016696B true CN112016696B (en) 2022-10-04

Family

ID=73504482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010817931.6A Active CN112016696B (en) 2020-08-14 2020-08-14 PM1 concentration inversion method and system integrating satellite observation and ground observation

Country Status (1)

Country Link
CN (1) CN112016696B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant