CN115598027A - PM2.5 inversion method based on remote sensing and machine learning technology - Google Patents

Info

Publication number
CN115598027A
Authority
CN
China
Prior art keywords
data
model
remote sensing
machine learning
meteorological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210686671.2A
Other languages
Chinese (zh)
Inventor
王伟民
熊向陨
余良
何伟彪
梁鸿
刘凯
曾清怀
许旺
余欣繁
尹淳阳
文雯
公莉
张志刚
李会亚
俞兆康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Shenzhen Ecological Environment Monitoring Center Station Guangdong Dongjiang River Basin Ecological Environment Monitoring Center
Original Assignee
Guangdong Shenzhen Ecological Environment Monitoring Center Station Guangdong Dongjiang River Basin Ecological Environment Monitoring Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Shenzhen Ecological Environment Monitoring Center Station Guangdong Dongjiang River Basin Ecological Environment Monitoring Center
Priority to CN202210686671.2A
Publication of CN115598027A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/06 Investigating concentration of particle suspensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Dispersion Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of remote sensing, and in particular to a PM2.5 inversion method based on remote sensing and machine learning technology, comprising the following steps: preprocessing meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data; measuring and analyzing surface parameters, obtaining PM2.5 station data, analyzing spectral characteristics and searching for modeling features; building the model, selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model; and evaluating model performance and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5. With this technical solution, machine learning is used to establish a relational model between measured PM2.5 station data and multi-source data such as remote sensing data, meteorological data and terrain data, which improves spatial coverage.

Description

PM2.5 inversion method based on remote sensing and machine learning technology
Technical Field
The invention relates to the technical field of remote sensing, and in particular to a PM2.5 inversion method based on remote sensing and machine learning technology.
Background
Global Burden of Disease studies have identified air pollution as the fifth leading risk factor for all-cause mortality, and PM2.5 particles can have adverse effects on human health. With rapid economic development and accelerating urbanization, widespread PM2.5 pollution has attracted considerable attention worldwide.
Accurately estimating PM2.5 concentration is therefore especially important for pollution prevention and control and for protecting public health. However, most regions and countries still have few or no PM2.5 monitors: about 60% of countries have no regular PM2.5 monitoring, and only 10% of countries have more than 3 monitors per million residents, which limits our ability to accurately assess the health effects of PM2.5 pollution.
Aerosol optical thickness (AOD), defined as the integral of the extinction coefficient in the vertical direction, is an important dimensionless parameter that measures the attenuation of solar radiation by aerosol particles. PM2.5, chemical particles, suspended water vapor and the like are basic components of aerosols. AOD is continuous in space and time and is related to the number of atmospheric aerosol particles in the vertical column above the observation point, and statistical descriptions in multiple studies also show that the distribution of AOD data is similar to that of PM2.5. Therefore, to characterize air pollution over a wider spatial extent, scholars in China and abroad have proposed evaluation methods that use AOD products derived from satellite images. These can compensate for the spatial and temporal gaps in ground monitoring station data, but the resulting products still suffer from low temporal and spatial resolution.
Early studies used simple linear regression models or global- or continental-scale chemical transport models (CTMs) to derive the relationship between AOD and PM2.5. Later scholars used advanced statistical models to show that the relationship between AOD and PM2.5 changes with weather and land cover type, and used these models to improve the accuracy of PM2.5 estimation. The mixed-effects model that was subsequently proposed provides an effective tool for estimating PM2.5 concentration in urban areas where ground monitoring points are not dense enough. However, a dense ground monitoring network and high-resolution satellite data products are the key factors in improving the accuracy of PM2.5 estimation, and it is difficult to obtain PM2.5 concentration data with high spatial resolution from sparse national weather monitoring station data and low-resolution remote sensing satellite products.
Therefore, in addition to the sparse national weather monitoring station data, the present method adds regional weather monitoring station data, combines them with meteorological element data, terrain data, road network data and satellite remote sensing data, and builds a mixed-effects model using linear regression, support vector machine and random forest machine learning techniques, thereby improving the accuracy of PM2.5 estimation.
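The definition of AOD given above can be written out as a worked formula. This is a sketch in standard notation rather than a formula taken from the patent: the symbols τ (AOD), σ_ext (extinction coefficient), λ (wavelength), z (height) and z_TOA (top of the atmosphere) are assumptions introduced here for illustration.

```latex
% AOD as the vertical integral of the aerosol extinction coefficient.
% sigma_ext has units of inverse length, so tau is dimensionless.
\tau(\lambda) = \int_{0}^{z_{\mathrm{TOA}}} \sigma_{\mathrm{ext}}(\lambda, z)\,\mathrm{d}z
```

Because the integral covers the whole column, AOD alone does not say how much of the aerosol sits near the surface, which is one reason auxiliary meteorological and terrain predictors are combined with it when estimating ground-level PM2.5.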
Disclosure of Invention
(I) Technical problem to be solved
The invention provides a PM2.5 inversion method based on remote sensing and machine learning technology, which uses machine learning to establish a relational model between measured PM2.5 station data and multi-source data such as remote sensing data, meteorological data and terrain data, improving the spatial coverage of the measurements while maintaining the original measurement accuracy. The technical solution takes PM2.5 fully into account and provides an important and highly operable reference for evaluating PM2.5 data over a wide area.
(II) Technical solution
To solve the above problems, the present invention provides a PM2.5 inversion method based on remote sensing and machine learning technology, comprising the following steps:
step 1, preprocessing meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data;
step 2, measuring and analyzing surface parameters, obtaining PM2.5 station data, analyzing spectral characteristics, searching for modeling features, and converting the raw data into features so that they more accurately represent the actual problem handled by the prediction model;
step 3, building the model: selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model;
step 4, evaluating model performance and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5;
step 5, using the PM2.5 inversion model and various observation data to achieve large-area, real-time estimation of PM2.5.
Preferably, the specific method of step 1 is:
step 1.1, screening the meteorological station data to obtain the data of the meteorological stations in the Bay Area, and cleaning the data;
step 1.2, acquiring meteorological data such as wind speed, humidity, pressure and temperature from the fifth-generation ECMWF atmospheric reanalysis of the global climate (ERA5), and extracting and cleaning the data;
step 1.3, acquiring Digital Elevation Model (DEM) data, and clipping and cleaning the data;
step 1.4, acquiring road network data (OSM), and clipping and cleaning the data;
step 1.5, acquiring the MODIS 16-day composite normalized difference vegetation index (NDVI) product and the daily aerosol optical thickness (AOD) product, and mosaicking, clipping and cleaning the data.
Preferably, the specific method of step 2 is:
step 2.1, performing a simple preliminary analysis of the samples and studying the differences between physical quantities: in addition to presenting the statistics in tables, placing the sampling points on a vector map, displaying the spatial variation of concentration on the vector data with bar charts or pie charts, and further dividing the study area from multiple perspectives to describe how the surface parameters vary between sampling points;
step 2.2, acquiring the meteorological element data, terrain data, road network data and satellite remote sensing data corresponding to the PM2.5 station data, and fusing the data;
step 2.3, cleaning the acquired multi-source data a second time to remove abnormal and duplicate records, reduce noise and eliminate ambiguity;
step 2.4, standardizing the data, among other operations, to improve the accuracy of subsequent modeling;
step 2.5, dividing the data into a training set, a test set and a validation set in a certain proportion.
Preferably, the specific method of step 3 is:
step 3.1, selecting an appropriate algorithm according to the learning objective and the data requirements;
step 3.2, configuring and tuning the hyperparameters, and determining the iterative method used to find the optimal hyperparameters;
step 3.3, determining whether model interpretability is required, and determining the operation and deployment requirements of the model;
step 3.4, training on the data with different machine learning algorithms to obtain the PM2.5 estimation model:
$$\mathrm{PM}_{2.5,ij} \sim \mathrm{NDVI}_{ij} + \mathrm{AOD}_{ij} + \mathrm{t2m}_{ij} + \mathrm{sp}_{ij} + \mathrm{tp}_{ij} + \mathrm{u10}_{ij} + \mathrm{v10}_{ij} + \mathrm{DEM}_{ij} + \mathrm{OSM}_{ij} \tag{1}$$
in the formula, $\mathrm{PM}_{2.5,ij}$ is the predicted PM2.5 estimate, NDVI is the normalized difference vegetation index, AOD is the aerosol optical thickness, t2m is the 2 m temperature, sp is the surface pressure, tp is the rainfall, u10 is the 10 m wind component in the longitudinal direction, v10 is the 10 m wind component in the latitudinal direction, DEM is the elevation data, and OSM is the road network data;
step 3.5, evaluating the resulting model to determine whether it meets the business and operational requirements.
Preferably, the specific method of step 4 is:
step 4.1, deploying the PM2.5 inversion model, continuously measuring and monitoring its performance, and obtaining the mean square error (MSE) and root mean square error (RMSE) of the final model, computed as follows:
$$\mathrm{MSE} = \frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2}$$
in the formula, N is the number of test samples, observed is the true observed value, and predicted is the predicted estimate; other evaluations include model metric evaluation, confusion matrix calculation, KPIs (key performance indicators), model performance metrics and model quality measurement, followed by a final determination of whether the model can meet the established business objectives;
step 4.2, establishing a benchmark against which future iterations of the model are measured;
step 4.3, continuously iterating on different aspects of the PM2.5 inversion model to improve overall performance;
step 4.4, selecting the most suitable machine learning algorithm according to the established benchmark, and obtaining the PM2.5 inversion model.
Preferably, the specific method of step 5 is: using the PM2.5 inversion model selected in step 4 and various observation data to achieve large-area, real-time estimation of PM2.5.
(III) Advantageous effects
The invention provides a PM2.5 inversion method based on remote sensing and machine learning technology, which uses machine learning to establish a relational model between the PM2.5 values measured at national weather monitoring stations and regional weather monitoring stations and multi-source data such as remote sensing data, meteorological data and terrain data, improving spatial coverage while maintaining the original measurement accuracy. The technical solution takes PM2.5 fully into account and provides an important and highly operable reference for evaluating PM2.5 data over a wide area.
Drawings
FIG. 1 is a flowchart of an embodiment of the PM2.5 inversion method based on remote sensing and machine learning technology according to the present invention;
FIGS. 2-3 show the normalized difference vegetation index (NDVI) data and the aerosol optical thickness (AOD) data provided in an embodiment of the present invention;
FIGS. 4-8 show meteorological data such as wind speed (u10, v10), rainfall (tp), pressure (sp) and temperature (t2m) provided in an embodiment of the present invention;
FIGS. 9-10 show the Digital Elevation Model (DEM) data and the road network data (OSM) provided in an embodiment of the present invention;
FIG. 11 shows the PM2.5 data obtained by inversion with the PM2.5 inversion model in an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The invention provides a PM2.5 inversion method based on remote sensing and machine learning technology, comprising:
step 1, preprocessing meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data;
step 2, measuring and analyzing surface parameters, obtaining PM2.5 station data, analyzing spectral characteristics, searching for modeling features, and converting the raw data into features so that they more accurately represent the actual problem handled by the prediction model;
step 3, building the model: selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model;
step 4, evaluating model performance and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5;
step 5, using the PM2.5 inversion model and various observation data to achieve large-area, real-time estimation of PM2.5.
The preprocessing of the meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data further comprises:
step 1.1, screening the meteorological station data to obtain the data of the meteorological stations in the Bay Area, and cleaning the data;
step 1.2, acquiring meteorological data such as wind speed, humidity, pressure and temperature from the fifth-generation ECMWF atmospheric reanalysis of the global climate (ERA5), and extracting and cleaning the data;
step 1.3, acquiring Digital Elevation Model (DEM) data, and clipping and cleaning the data;
step 1.4, acquiring road network data (OSM), and clipping and cleaning the data;
step 1.5, acquiring the MODIS 16-day composite normalized difference vegetation index (NDVI) product and the daily aerosol optical thickness (AOD) product, and mosaicking, clipping and cleaning the data.
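The screening and cleaning of station data in step 1.1 can be illustrated with a minimal sketch. It is not the applicant's implementation: the record layout (keys `station`, `time`, `lon`, `lat`, `pm25`), the bounding box `STUDY_BBOX` and the plausibility limit `max_pm25` are all hypothetical values chosen for the example.

```python
# Hypothetical study-area bounding box (lon_min, lat_min, lon_max, lat_max).
# These numbers are illustrative only and do not come from the patent.
STUDY_BBOX = (111.0, 21.0, 116.0, 25.0)


def screen_and_clean(records, bbox=STUDY_BBOX, max_pm25=1000.0):
    """Keep in-area station records with a plausible PM2.5 value.

    Drops records with a missing, NaN, negative or implausibly large
    PM2.5 value, records outside the bounding box, and repeated
    (station, time) pairs.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    seen = set()
    cleaned = []
    for rec in records:
        pm25 = rec.get("pm25")
        # A NaN fails the chained comparison, so it is dropped as well.
        if pm25 is None or not (0.0 <= pm25 <= max_pm25):
            continue
        inside = lon_min <= rec["lon"] <= lon_max and lat_min <= rec["lat"] <= lat_max
        if not inside:
            continue
        key = (rec["station"], rec["time"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

In practice the screening would more likely use the administrative boundary of the study area than a rectangle; the rectangle only keeps the sketch self-contained.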
Measuring and analyzing the surface parameters, obtaining PM2.5 station data, analyzing spectral characteristics, searching for modeling features, and converting the raw data into features so that they more accurately represent the actual problem handled by the prediction model further comprises:
step 2.1, performing a simple preliminary analysis of the samples and studying the differences between physical quantities: in addition to presenting the statistics in tables, placing the sampling points on a vector map, displaying the spatial variation of concentration on the vector data with bar charts or pie charts, and further dividing the study area from multiple perspectives to describe how the surface parameters vary between sampling points;
step 2.2, acquiring the meteorological element data, terrain data, road network data and satellite remote sensing data corresponding to the PM2.5 station data, and fusing the data;
step 2.3, cleaning the acquired multi-source data a second time to remove abnormal and duplicate records, reduce noise and eliminate ambiguity;
step 2.4, standardizing the data, among other operations, to improve the accuracy of subsequent modeling;
step 2.5, dividing the data into a training set, a test set and a validation set in a certain proportion.
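Steps 2.4 and 2.5 can be sketched as follows. The patent does not state which standardization is used or what the split proportion is, so the z-score transform, the 70/15/15 ratio and the function names `standardize` and `split_indices` are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each feature column; a constant column maps to zeros."""
    out = {}
    for name, values in columns.items():
        mu = mean(values)
        sigma = pstdev(values)
        out[name] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    return out


def split_indices(n, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle sample indices and split them into train, test and validation.

    The ratios are an assumed example; the third share takes whatever
    remains after the first two are rounded.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * ratios[0])
    n_test = round(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]
```

A common refinement is to compute the mean and standard deviation on the training set only and reuse them for the other two sets, so that no information from the test or validation data leaks into training.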
Building the model, selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model further comprises:
step 3.1, selecting an appropriate algorithm according to the learning objective and the data requirements;
step 3.2, configuring and tuning the hyperparameters, and determining the iterative method used to find the optimal hyperparameters;
step 3.3, determining whether model interpretability is required, and determining the operation and deployment requirements of the model;
step 3.4, training on the data with different machine learning algorithms to obtain the PM2.5 estimation model:
$$\mathrm{PM}_{2.5,ij} \sim \mathrm{NDVI}_{ij} + \mathrm{AOD}_{ij} + \mathrm{t2m}_{ij} + \mathrm{sp}_{ij} + \mathrm{tp}_{ij} + \mathrm{u10}_{ij} + \mathrm{v10}_{ij} + \mathrm{DEM}_{ij} + \mathrm{OSM}_{ij} \tag{1}$$
in the formula, $\mathrm{PM}_{2.5,ij}$ is the predicted PM2.5 estimate, NDVI is the normalized difference vegetation index, AOD is the aerosol optical thickness, t2m is the 2 m temperature, sp is the surface pressure, tp is the rainfall, u10 is the 10 m wind component in the longitudinal direction, v10 is the 10 m wind component in the latitudinal direction, DEM is the elevation data, and OSM is the road network data;
step 3.5, evaluating the resulting model to determine whether it meets the business and operational requirements.
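As an illustration of step 3.4, the sketch below fits the simplest of the model families named in the background, a linear regression, to a formula of the same shape as formula (1). It is a hedged example and not the patented model: it uses ordinary least squares solved through the normal equations, the helper names `fit_linear` and `predict` are hypothetical, and a real run would use all nine predictors and could equally use a support vector machine or a random forest.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination with partial pivoting and returns
    [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for k in range(p):
            if k != col:
                f = A[k][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]


def predict(coef, row):
    """Evaluate the fitted linear model on one feature row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))
```

With synthetic rows of the form (AOD, t2m) and targets generated from known coefficients, `fit_linear` should recover those coefficients, which is a quick sanity check before moving to real station samples.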
Evaluating model performance and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5 further comprises:
step 4.1, deploying the PM2.5 inversion model, continuously measuring and monitoring its performance, and obtaining the mean square error (MSE) and root mean square error (RMSE) of the final model, computed as follows:
$$\mathrm{MSE} = \frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2}$$
in the formula, N is the number of test samples, observed is the true observed value, and predicted is the predicted estimate; other evaluations include model metric evaluation, confusion matrix calculation, KPIs (key performance indicators), model performance metrics and model quality measurement, followed by a final determination of whether the model can meet the established business objectives;
step 4.2, establishing a benchmark against which future iterations of the model are measured;
step 4.3, continuously iterating on different aspects of the PM2.5 inversion model to improve overall performance;
step 4.4, selecting the most suitable machine learning algorithm according to the established benchmark, and obtaining the PM2.5 inversion model.
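The error metrics of step 4.1 and the selection in step 4.4 can be sketched in a few lines. The metric definitions follow the standard MSE and RMSE formulas with N test samples; the function `select_best` and the idea of ranking candidates by RMSE alone are assumptions for illustration, since the patent also lists other evaluation criteria.

```python
import math


def mse(observed, predicted):
    """Mean square error over N paired test samples."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(observed, predicted))


def select_best(observed, candidates):
    """Return the name of the candidate whose predictions have the lowest RMSE.

    `candidates` maps a model name to its predictions on the test set.
    """
    return min(candidates, key=lambda name: rmse(observed, candidates[name]))
```

For example, with observations `[10, 20, 30]`, a candidate predicting `[10, 21, 29]` is preferred over one predicting `[12, 18, 33]`, because its squared errors sum to 2 rather than 17.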
Using the above PM2.5 inversion model and various observation data to achieve large-area, real-time estimation of PM2.5 further comprises:
using the PM2.5 inversion model selected in step 4 and various observation data to achieve large-area, real-time estimation of PM2.5.
FIG. 1 is a flowchart of an embodiment of the PM2.5 inversion method based on remote sensing and machine learning technology according to the present invention. As shown in the flowchart, the method of this embodiment comprises the following steps:
S101, acquiring meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data;
In this embodiment, taking the inversion of PM2.5 data from meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data as an example, the original 500 m resolution normalized difference vegetation index (NDVI) product and the daily aerosol optical thickness (AOD) product from the MODIS sensor are acquired, together with meteorological data such as wind speed, humidity, pressure and temperature from the fifth-generation ECMWF atmospheric reanalysis of the global climate (ERA5), Digital Elevation Model (DEM) data, road network data (OSM) and the like, and a machine learning method is used for inversion to obtain PM2.5, that is, particulate matter in ambient air with an aerodynamic equivalent diameter less than or equal to 2.5 micrometers.
S102, preprocessing the meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data;
The data acquired in S101 are segmented, screened and cleaned: the meteorological station data are screened to obtain the data of the meteorological stations in the Bay Area and then cleaned; meteorological data such as wind speed, humidity, pressure and temperature are acquired from the fifth-generation ECMWF atmospheric reanalysis of the global climate (ERA5), extracted and cleaned; Digital Elevation Model (DEM) data are acquired, clipped and cleaned; road network data (OSM) are acquired, clipped and cleaned; and the MODIS 16-day composite normalized difference vegetation index (NDVI) product and the daily aerosol optical thickness (AOD) product are acquired, mosaicked, clipped and cleaned. FIGS. 2-10 show the preprocessed data.
S103, measuring and analyzing the surface parameters obtained in S102, obtaining PM2.5 station data, analyzing spectral characteristics, searching for modeling features, and converting the raw data into features so that they more accurately represent the actual problem handled by the prediction model;
A simple preliminary analysis of the samples is performed and the differences between physical quantities are studied: in addition to presenting the statistics in tables, the sampling points are placed on a vector map, the spatial variation of concentration is displayed on the vector data with bar charts or pie charts, and the study area is further divided from multiple perspectives to describe how the surface parameters vary between sampling points. The meteorological element data, terrain data, road network data and satellite remote sensing data corresponding to the PM2.5 station data are acquired and fused. The acquired multi-source data are cleaned a second time to remove abnormal and duplicate records, reduce noise and eliminate ambiguity. The data are standardized, among other operations, to improve the accuracy of subsequent modeling. The data are divided into a training set, a test set and a validation set in a certain proportion.
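The fusion of station data with the gridded layers can be pictured as sampling each raster at the station coordinates. The sketch below is one possible reading and not the procedure stated in the patent: it assumes all layers share one north-up grid whose upper-left corner is at (`origin_lon`, `origin_lat`) with square cells of size `cell_size` in degrees, and the names `sample_grid` and `fuse` are hypothetical.

```python
def sample_grid(grid, origin_lon, origin_lat, cell_size, lon, lat):
    """Return the value of the grid cell containing (lon, lat), or None.

    Row 0 is the northernmost row and column 0 the westernmost column.
    """
    col = int((lon - origin_lon) // cell_size)
    row = int((origin_lat - lat) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


def fuse(stations, layers, origin_lon, origin_lat, cell_size):
    """Attach the value of every layer to each station record.

    Stations that fall outside the grid, or on a cell with a missing
    value in any layer, are skipped.
    """
    samples = []
    for st in stations:
        feats = {
            name: sample_grid(grid, origin_lon, origin_lat, cell_size, st["lon"], st["lat"])
            for name, grid in layers.items()
        }
        if any(v is None for v in feats.values()):
            continue
        samples.append({"pm25": st["pm25"], **feats})
    return samples
```

Real layers such as the 500 m MODIS products and the coarser ERA5 fields have different resolutions, so they would first need to be resampled to a common grid before a lookup like this applies.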
S104, building the model with the data obtained in S103: selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model;
An appropriate algorithm is selected according to the learning objective and the data requirements; the hyperparameters are configured and tuned, and the iterative method used to find the optimal hyperparameters is determined; whether model interpretability is required is determined, along with the operation and deployment requirements of the model; and the data are used to train different machine learning algorithms to obtain the PM2.5 estimation model:
$$\mathrm{PM}_{2.5,ij} \sim \mathrm{NDVI}_{ij} + \mathrm{AOD}_{ij} + \mathrm{t2m}_{ij} + \mathrm{sp}_{ij} + \mathrm{tp}_{ij} + \mathrm{u10}_{ij} + \mathrm{v10}_{ij} + \mathrm{DEM}_{ij} + \mathrm{OSM}_{ij} \tag{1}$$
in the formula, $\mathrm{PM}_{2.5,ij}$ is the predicted PM2.5 estimate, NDVI is the normalized difference vegetation index, AOD is the aerosol optical thickness, t2m is the 2 m temperature, sp is the surface pressure, tp is the rainfall, u10 is the 10 m wind component in the longitudinal direction, v10 is the 10 m wind component in the latitudinal direction, DEM is the elevation data, and OSM is the road network data. The resulting model is evaluated to determine whether it meets the business and operational requirements.
S105, evaluating the performance of the model obtained in S104 and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5;
The PM2.5 inversion model is deployed, its performance is continuously measured and monitored, and the mean square error (MSE) and root mean square error (RMSE) of the final model are obtained as follows:
$$\mathrm{MSE} = \frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\mathrm{observed}_n - \mathrm{predicted}_n\right)^2}$$
in the formula, N is the number of test samples, observed is the true observed value, and predicted is the predicted estimate; other evaluations include model metric evaluation, confusion matrix calculation, KPIs (key performance indicators), model performance metrics and model quality measurement, followed by a final determination of whether the model can meet the established business objectives. A benchmark is established against which future iterations of the model are measured; different aspects of the PM2.5 inversion model are iterated on continuously to improve overall performance; and the most suitable machine learning algorithm is selected according to the established benchmark to obtain the PM2.5 inversion model.
S106, using the PM2.5 inversion model finally obtained in S105 and various observation data to achieve large-area, real-time estimation of PM2.5;
The PM2.5 inversion model selected in step 4 and various observation data are used to achieve large-area, real-time estimation of PM2.5. FIG. 11 shows the PM2.5 data obtained by inversion with the finally selected model.
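Large-area estimation amounts to applying the selected model to every grid cell for which all predictors are available. The following is a minimal sketch under that reading: `invert_grid` is a hypothetical helper, `predict_fn` stands in for whichever trained model was selected, missing inputs are represented by `None`, and clamping negative outputs to zero is a design choice of this example rather than something stated in the patent.

```python
def invert_grid(layers, predict_fn):
    """Apply predict_fn cell by cell over co-registered predictor grids.

    `layers` maps a predictor name to a 2-D list; all grids are assumed
    to have the same shape. A cell with any missing (None) input stays
    None in the output, for example where cloud cover removed the AOD
    retrieval.
    """
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    result = []
    for r in range(n_rows):
        row_out = []
        for c in range(n_cols):
            feats = {name: layers[name][r][c] for name in names}
            if any(v is None for v in feats.values()):
                row_out.append(None)
            else:
                # A concentration cannot be negative, so clamp at zero.
                row_out.append(max(0.0, predict_fn(feats)))
        result.append(row_out)
    return result
```

The output grid has the same shape as the inputs, so it can be written back with the georeferencing of the predictor layers to produce a map of the kind shown in FIG. 11.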
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and substitutions without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as falling within the scope of protection of the present invention.

Claims (6)

1. A PM2.5 inversion method based on remote sensing and machine learning technology, characterized by comprising the following steps:
step 1, preprocessing meteorological station data, meteorological element data, terrain data, road network data and satellite remote sensing data;
step 2, measuring and analyzing surface parameters, obtaining PM2.5 station data, analyzing spectral characteristics, searching for modeling features, and converting the raw data into features so that they more accurately represent the actual problem handled by the prediction model;
step 3, building the model: selecting and applying a modeling technique, training the model, setting and tuning its hyperparameters, validating the model, developing and testing an ensemble model, selecting the algorithm and optimizing the model, and combining PM2.5 station data to construct the PM2.5 inversion model;
step 4, evaluating model performance and establishing a benchmark, experimenting with and adjusting the running model, and determining the machine learning model that can be used to invert PM2.5;
step 5, using the PM2.5 inversion model and various observation data to achieve large-area, real-time estimation of PM2.5.
2. The PM2.5 inversion method based on remote sensing and machine learning technology according to claim 1, characterized in that the specific method of step 1 is as follows:
step 1.1, screening the meteorological site data to obtain the site data for the Bay Area, and cleaning the data;
step 1.2, acquiring meteorological data such as wind speed, humidity, pressure and temperature from the fifth-generation ECMWF atmospheric reanalysis of the global climate (ERA5), and extracting and cleaning the data;
step 1.3, acquiring digital elevation model (DEM) data, and clipping and cleaning the data;
step 1.4, acquiring road network data (OSM), and clipping and cleaning the data;
step 1.5, acquiring the MODIS 16-day composite normalized difference vegetation index (NDVI) product and the daily aerosol optical depth (AOD) product, and mosaicking, clipping and cleaning the data.
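The screening and cleaning in step 1.1 can be sketched as a filter over site records. This is a minimal illustration, assuming records are plain dictionaries: the field names (`site`, `time`, `lon`, `lat`, `pm25`), the bounding box values and the cleaning rules (drop missing or negative readings, drop repeated site and time pairs) are all hypothetical and not taken from the patent.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def in_bbox(lon: float, lat: float, bbox: BBox) -> bool:
    """Return True if the point lies inside the study-area bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def clean_site_records(records: List[Dict], bbox: BBox) -> List[Dict]:
    """Keep in-area records with a valid reading; drop repeated (site, time) pairs."""
    seen = set()
    cleaned = []
    for rec in records:
        pm = rec.get("pm25")
        if pm is None or pm < 0:          # missing or physically impossible value
            continue
        if not in_bbox(rec["lon"], rec["lat"], bbox):
            continue
        key = (rec["site"], rec["time"])
        if key in seen:                   # duplicate observation
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Illustrative bounding box only; the actual study-area extent is not given here.
study_area: BBox = (111.0, 21.0, 116.0, 25.0)
records = [
    {"site": "A", "time": "2020-01-01T00", "lon": 114.0, "lat": 22.5, "pm25": 35.0},
    {"site": "A", "time": "2020-01-01T00", "lon": 114.0, "lat": 22.5, "pm25": 35.0},
    {"site": "B", "time": "2020-01-01T00", "lon": 120.0, "lat": 30.0, "pm25": 20.0},
    {"site": "C", "time": "2020-01-01T00", "lon": 113.3, "lat": 23.1, "pm25": -1.0},
    {"site": "D", "time": "2020-01-01T00", "lon": 113.5, "lat": 22.2, "pm25": 18.0},
]
cleaned = clean_site_records(records, study_area)
```

The same bounding-box test could be reused when clipping gridded ERA5, DEM or MODIS data to the study area.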
3. The PM2.5 inversion method based on remote sensing and machine learning technology according to claim 1, characterized in that the specific method of step 2 is as follows:
step 2.1, performing a preliminary simple analysis of the samples and studying the differences in physical quantities; in addition to presenting the statistics in tables, placing the sampling points on a vector map and showing the spatial variation of concentration on the vector data with bar charts or pie charts; and further dividing the study area from multiple angles to describe the variation of surface parameters at different sampling points;
step 2.2, obtaining the meteorological element data, terrain data, road network data and satellite remote sensing data corresponding to the PM2.5 site data, and fusing the data;
step 2.3, performing a second cleaning of the resulting multi-source data to remove abnormal and duplicate records, reduce noise and eliminate ambiguity;
step 2.4, standardizing the data and applying similar operations, in order to improve the accuracy of subsequent modeling;
step 2.5, dividing the data into a training set, a validation set and a test set in a given proportion.
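Steps 2.4 and 2.5 can be sketched as a z-score standardization followed by a shuffled three-way split. This is a sketch under assumptions: the claim does not state which standardization or which proportions are used, so the z-score, the 70/15/15 ratios, the fixed seed and the function names are illustrative choices only.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def standardize(column: Sequence[float]) -> List[float]:
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(value - mu) / sigma for value in column]


def split_dataset(rows: Sequence, ratios: Tuple[float, float, float] = (0.7, 0.15, 0.15),
                  seed: int = 0) -> Tuple[list, list, list]:
    """Shuffle the rows and split them into training, validation and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


z = standardize([1.0, 2.0, 3.0])
train, val, test = split_dataset(range(20))
```

A common refinement is to compute the mean and standard deviation on the training set only and reuse them for the validation and test sets, so that no information leaks from held-out data.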
4. The PM2.5 inversion method based on remote sensing and machine learning technology according to claim 1, characterized in that the specific method of step 3 is as follows:
step 3.1, selecting an appropriate algorithm according to the learning objective and the data requirements;
step 3.2, configuring and tuning the hyper-parameters, and determining an iterative method for obtaining the optimal hyper-parameters;
step 3.3, determining whether model interpretability is needed, and determining the operation and deployment requirements of the model;
step 3.4, training on the data with different machine learning algorithms to obtain the PM2.5 estimation model:
$$\mathrm{PM}_{2.5,ij} \sim \mathrm{NDVI}_{ij} + \mathrm{AOD}_{ij} + \mathrm{t2m}_{ij} + \mathrm{sp}_{ij} + \mathrm{tp}_{ij} + \mathrm{u10}_{ij} + \mathrm{v10}_{ij} + \mathrm{DEM}_{ij} + \mathrm{OSM}_{ij} \tag{1}$$
in the formula, PM2.5 is the predicted PM2.5 estimate, NDVI is the normalized difference vegetation index, AOD is the aerosol optical depth, t2m is the 2-metre temperature, sp is the surface pressure, tp is the total precipitation, u10 is the 10-metre u (eastward) wind component, v10 is the 10-metre v (northward) wind component, DEM is the elevation data, and OSM is the road network data;
step 3.5, evaluating the resulting model to determine whether it meets the business and operational requirements.
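The hyper-parameter tuning in step 3.2 can be illustrated with a small grid search on a validation set. The claim does not name a specific algorithm here, so a k-nearest-neighbour regressor is used purely as a stand-in; the function names, the candidate values of k and the one-feature toy data are hypothetical. In practice each feature vector would hold the nine predictors of formula (1).

```python
import math
from typing import List, Sequence, Tuple


def knn_predict(train_X: Sequence[Sequence[float]], train_y: Sequence[float],
                x: Sequence[float], k: int) -> float:
    """Predict by averaging the targets of the k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def select_k(train_X, train_y, val_X, val_y, candidates: List[int]) -> Tuple[int, float]:
    """Grid search: return the k with the lowest validation RMSE, and that RMSE."""
    best = None
    for k in candidates:
        preds = [knn_predict(train_X, train_y, x, k) for x in val_X]
        score = rmse(val_y, preds)
        if best is None or score < best[1]:
            best = (k, score)
    return best


# Toy data with a single feature (for example, a scaled AOD value); values are invented.
train_X = [[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]]
train_y = [10.0, 12.0, 14.0, 40.0, 42.0, 44.0]
val_X = [[0.25], [0.85]]
val_y = [13.0, 41.0]
best_k, best_rmse = select_k(train_X, train_y, val_X, val_y, candidates=[1, 2, 3])
```

The same loop extends to several hyper-parameters or to several algorithms by iterating over their combinations and keeping the lowest validation error.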
5. The PM2.5 inversion method based on remote sensing and machine learning technology according to claim 1, characterized in that the specific method of step 4 is as follows:
step 4.1, deploying the PM2.5 inversion model, continuously measuring and monitoring its performance, and obtaining the mean square error (MSE) and root mean square error (RMSE) of the final model according to the following formulas:
Figure FDA0003698211990000031
Figure FDA0003698211990000032
in the formula, N is the number of test samples, observed is the true observed value, and predicted is the predicted estimate; other evaluations include model metric evaluation, confusion matrix calculation, key performance indicators (KPI), model performance metrics and model quality measurement, followed by a final determination of whether the model meets the established business objectives;
step 4.2, establishing a benchmark for measuring future iterations of the model;
step 4.3, continuously iterating on different aspects of the PM2.5 inversion model to improve overall performance;
step 4.4, according to the established benchmark, selecting the most suitable machine learning algorithm and obtaining the PM2.5 inversion model.
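The evaluation and selection in steps 4.1 to 4.4 can be sketched as follows. The code assumes the standard definitions, MSE = (1/N) Σ (observed_i − predicted_i)² and RMSE = √MSE, which match the symbols N, observed and predicted named in the claim. The benchmark value, the candidate model names, the data and the helper `pick_best` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error over N test samples."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(observed, predicted))


def pick_best(observed: Sequence[float], candidates: Dict[str, Sequence[float]],
              benchmark_rmse: float) -> Optional[Tuple[str, float]]:
    """Return (name, RMSE) of the lowest-RMSE candidate if it beats the benchmark."""
    scores = {name: rmse(observed, preds) for name, preds in candidates.items()}
    best_name = min(scores, key=scores.get)
    if scores[best_name] < benchmark_rmse:
        return best_name, scores[best_name]
    return None  # no candidate meets the benchmark; keep iterating on the model


# Invented test-set values, in the same units as the PM2.5 observations.
observed = [30.0, 40.0, 50.0, 60.0]
candidates = {
    "model_a": [32.0, 38.0, 53.0, 59.0],
    "model_b": [35.0, 35.0, 55.0, 55.0],
}
best = pick_best(observed, candidates, benchmark_rmse=3.0)
```

Returning `None` when nothing beats the benchmark mirrors the iterative loop in step 4.3: the benchmark stays fixed while the model is revised.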
6. The PM2.5 inversion method based on remote sensing and machine learning technology according to claim 1, characterized in that the specific method of step 5 is as follows: using the PM2.5 inversion model screened in step 4, together with various observation data, to achieve large-area real-time estimation of PM2.5.
CN202210686671.2A 2022-06-16 2022-06-16 PM2.5 inversion method based on remote sensing and machine learning technology Pending CN115598027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210686671.2A CN115598027A (en) 2022-06-16 2022-06-16 PM2.5 inversion method based on remote sensing and machine learning technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210686671.2A CN115598027A (en) 2022-06-16 2022-06-16 PM2.5 inversion method based on remote sensing and machine learning technology

Publications (1)

Publication Number Publication Date
CN115598027A (en) 2023-01-13

Family

ID=84841911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210686671.2A Pending CN115598027A (en) PM2.5 inversion method based on remote sensing and machine learning technology 2022-06-16 2022-06-16

Country Status (1)

Country Link
CN (1) CN115598027A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370772A (en) * 2023-12-08 2024-01-09 北京英视睿达科技股份有限公司 PM2.5 diffusion analysis method and system based on urban street topography classification

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370772A (en) * 2023-12-08 2024-01-09 北京英视睿达科技股份有限公司 PM2.5 diffusion analysis method and system based on urban street topography classification
CN117370772B (en) * 2023-12-08 2024-04-16 北京英视睿达科技股份有限公司 PM2.5 diffusion analysis method and system based on urban street topography classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination