CN110595968B - PM2.5 concentration estimation method based on geostationary orbit satellite - Google Patents

Info

Publication number
CN110595968B
Authority
CN
China
Prior art keywords
data
optical thickness
aerosol
concentration
satellite
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910714837.5A
Other languages
Chinese (zh)
Other versions
CN110595968A (en)
Inventor
左欣
顾行发
程天海
郭红
余涛
张晓川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Langfang Spatial Information Technology R&d Service Center
Zhongke Xingtong Langfang Information Technology Co ltd
Institute of Remote Sensing and Digital Earth of CAS
Original Assignee
Research Institute Of Space Information (langfang) Of China Science
Zhongke Xingtong Langfang Information Technology Co ltd
Institute of Remote Sensing and Digital Earth of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Research Institute Of Space Information (langfang) Of China Science, Zhongke Xingtong Langfang Information Technology Co ltd, Institute of Remote Sensing and Digital Earth of CAS filed Critical Research Institute Of Space Information (langfang) Of China Science
Priority to CN201910714837.5A priority Critical patent/CN110595968B/en
Publication of CN110595968A publication Critical patent/CN110595968A/en
Application granted granted Critical
Publication of CN110595968B publication Critical patent/CN110595968B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/06 Investigating concentration of particle suspensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00 Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/06 Investigating concentration of particle suspensions
    • G01N15/075 Investigating concentration of particle suspensions by optical means

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Dispersion Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

A PM2.5 concentration estimation method based on a geostationary orbit satellite adopts aerosol optical thickness data meeting the precision requirement and corresponding PM2.5 concentration data to establish a data set, completes sample learning and data testing based on a random forest machine learning method, carries out precision verification on a test result, adjusts parameters of a random forest machine learning model to enable the parameters to reach the precision requirement, and carries out multi-time-phase PM2.5 concentration estimation under different weather conditions through a finally obtained calculation model. The PM2.5 concentration remote sensing estimation method based on the geostationary orbit satellite can effectively carry out multi-temporal PM2.5 concentration remote sensing estimation, makes up the deficiency of the traditional method in time continuity, and provides more accurate data support for developing atmospheric pollution prevention and control.

Description

PM2.5 concentration estimation method based on geostationary orbit satellite
Technical Field
The invention relates to the technical field of remote sensing, in particular to a PM2.5 concentration estimation method based on a geostationary orbit satellite.
Background
With the sustained, rapid industrialization and urbanization of China, living standards have risen quickly, but environmental problems have multiplied, and in recent years large-scale, persistent haze episodes have occurred repeatedly. Haze consists mainly of respirable fine particulate matter, PM2.5, meaning particles with an aerodynamic diameter smaller than 2.5 microns. Compared with PM10, PM2.5 particles are smaller, remain in the atmosphere longer, and can be transported over long distances, so they have a greater effect on air quality. PM2.5 also readily carries toxic and harmful substances (such as persistent organic pollutants, heavy metals and pathogenic bacteria) and can reach the lungs directly, which makes it a greater health hazard. Numerous epidemiological studies abroad have shown that PM2.5 is associated with adverse health effects, including excess morbidity and mortality, and can contribute to cardiovascular, cerebrovascular and respiratory diseases. Real-time monitoring and control of PM2.5 concentration is therefore particularly important. At present, PM2.5 concentration is studied mainly by ground observation and by remote sensing inversion. Ground observation comprises offline sample collection and real-time online detection, but ground monitoring stations have limited spatial coverage and are discontinuous in time, which makes long-term, large-area studies difficult.
Satellite remote sensing can effectively address these problems. Because aerosol optical thickness is strongly correlated with PM2.5 concentration, estimating PM2.5 from aerosol optical thickness is widely used. Most satellite remote sensing data come from polar-orbiting satellites, that is, sun-synchronous satellites. Because their orbits pass over the north and south poles in synchrony with the sun, the revisit period for a given location is generally long. For example, data from the Moderate Resolution Imaging Spectroradiometer (MODIS) are acquired twice a day (10:30 and 13:30 Beijing time). PM2.5 estimates based on such data cannot capture pollutant migration well and are poorly suited to high-temporal-resolution estimation. Geostationary orbit satellites have a large advantage in temporal resolution. For example, the Japanese Himawari-8 meteorological satellite observes once every ten minutes, which further improves continuous observation of cloud dynamics and similar phenomena and opens a new opportunity for dynamic remote sensing estimation of PM2.5 concentration.
Disclosure of Invention
Aiming at the current situation that the remote sensing data of the current satellite remote sensing estimation PM2.5 concentration algorithm mainly comes from polar orbit satellites, the PM2.5 concentration estimation method based on the geostationary orbit satellite is provided for improving the time continuity of the PM2.5 concentration remote sensing estimation method and expanding the application on environmental monitoring.
The invention is realized by the following technical scheme:
acquiring optical thickness data of the aerosol observed by the geostationary orbit satellite;
calculating the optical thickness of the aerosol at the corresponding wave band of the corresponding ground station, and performing precision verification on the optical thickness data of the aerosol observed by the satellite;
establishing a data set of the concentration of PM2.5 in a preset waveband under different weather conditions and the optical thickness of the corresponding satellite observation aerosol;
completing sample learning and data testing based on a random forest machine learning model to obtain remote sensing estimation of PM2.5 concentration;
performing precision verification on the PM2.5 concentration obtained by the data test to obtain a precision verification result;
adjusting parameters of a random forest machine learning model according to the precision verification result, and repeating the steps of sample learning, data testing and precision verification until the concentration of PM2.5 obtained by data testing reaches the preset precision requirement;
and estimating the PM2.5 concentration according to the adjusted random forest machine learning model.
Further, the step of acquiring the optical thickness data of the aerosol of the geostationary orbit satellite comprises the steps of extracting the optical thickness data of the aerosol observed by the geostationary orbit satellite with a preset wave band, projecting an original image which is not subjected to projection transformation to a WGS-84 coordinate system, and acquiring the optical thickness data of the aerosol with the preset wave band according to a preset time interval.
Further, the step of calculating the optical thickness of the aerosol at the corresponding waveband of the corresponding ground station and performing precision verification on the optical thickness data of the aerosol observed by the satellite comprises the following steps:
acquiring ground observation data with one hour interval;
performing quadratic polynomial interpolation calculation on the optical thickness data of the aerosol observed on the ground according to different wave bands, and then calculating the optical thickness data of the aerosol of a preset wave band corresponding to the ground according to the obtained quadratic polynomial interpolation formula; the quadratic polynomial interpolation formula is as follows:
$$\ln\tau_\alpha = a_0 + a_1\ln\lambda + a_2(\ln\lambda)^2 \tag{1}$$
where $\lambda$ is the band value; $\tau_\alpha$ is the aerosol optical thickness at the channel of band $\lambda$; and $a_0$, $a_1$, $a_2$ are unknown coefficients, obtained by substituting the ground-observed aerosol optical thickness at the different band values into formula (1);
and selecting an accuracy evaluation coefficient, and performing accuracy verification on the satellite observation aerosol optical thickness data by taking the calculated aerosol optical thickness data of the predetermined wave band corresponding to the ground, namely the ground observation aerosol optical thickness, as a true value.
Further, the different band values are selected as 440 nm, 500 nm and 675 nm: the aerosol optical thickness at 440 nm, 500 nm and 675 nm is taken from the ground observation data and substituted into formula (1) to calculate $a_0$, $a_1$ and $a_2$; the predetermined band is 550 nm, and the aerosol optical thickness at 550 nm is then calculated from formula (1).
Further, the precision evaluation coefficient comprises a correlation coefficient R, a root mean square error RMSE and a slope B; selecting the aerosol optical thickness data with the precision evaluation coefficient reaching a preset value as the aerosol optical thickness data meeting the precision requirement;
wherein the correlation coefficient R, the root mean square error RMSE and the slope B are respectively calculated by the following formula:
[Equation images for R, RMSE and B are not reproduced in this text.]
In the formulas, $X_i$ and $Y_i$ are the i-th ground-observed and satellite-observed aerosol optical thickness values in the data set, respectively; $\bar{X}$ and $\bar{Y}$ are the mean ground-observed and mean satellite-observed aerosol optical thickness, respectively; n is the number of data points in the data set; and A is the intercept of the fitted line.
Further, when the accuracy evaluation coefficient reaches the following preset value, the aerosol optical thickness data meets the accuracy requirement:
R > 0.5; RMSE < 0.3; B > 0.5.
Further, the step of establishing a data set of the multi-temporal PM2.5 concentration corresponding to the optical thickness data of the aerosol observed by the geostationary orbit satellite under different weather conditions comprises:
according to PM2.5 concentration data of ground atmospheric monitoring stations, selecting concentration values x for which the PM2.5 concentration measured by the station falls in the excellent, good, pollution and heavy pollution grades, respectively, together with the aerosol optical thickness values y at the corresponding time and place, as a data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where n is a natural number greater than 1; the PM2.5 concentration of the excellent grade is less than 35 μg/m³, that of the good grade is 35-75 μg/m³, that of the pollution grade is 75-150 μg/m³, and that of the heavy pollution grade is greater than 150 μg/m³;
and dividing the data set into a training sample data set and a test sample data set according to a preset proportion.
Further, the step of completing sample learning and data testing based on the random forest machine learning model to obtain remote sensing estimation of the PM2.5 concentration comprises:
selecting a training sample data set $S_i$, with the training sample data set and the test sample data set in a 9:1 ratio;
using $S_i$ to generate an unpruned tree $h_i$: randomly selecting $M_{try}$ features from the $d$ features and, at each node, selecting the optimal feature from the $M_{try}$ features according to the Gini index and splitting on it, until the tree grows to its maximum size;
obtaining the set of trees $\{h_i,\ i = 1, 2, \ldots, N_{tree}\}$, where $N_{tree}$ is the number of trees;
for a sample $x_t$ to be measured, outputting the tree result $h_i(x_t)$, where $x_t$ represents the t-th corresponding PM2.5 concentration value;
outputting the strong learner f(x):
[Equation image for f(x) is not reproduced in this text.]
and carrying out preliminary parameter setting based on this algorithm to realize the remote sensing estimation of PM2.5 concentration.
Further, performing precision verification on the PM2.5 concentration obtained by the data test, wherein the step of obtaining a precision verification result comprises performing precision verification by selecting a ten-fold cross verification method.
Furthermore, parameters of the random forest machine learning model are adjusted according to the precision verification result, wherein the adjusted parameters comprise the number of subtrees n_estimators, the maximum number of features considered when splitting a node max_features, the number of parallel jobs n_jobs, and/or the minimum leaf sample size min_sample_leaf.
In conclusion, the invention provides a PM2.5 concentration estimation method based on a geostationary orbit satellite, which adopts aerosol optical thickness data meeting the precision requirement and corresponding PM2.5 concentration data to form a data set, completes sample learning and data testing based on a random forest machine learning method, performs precision verification on a test result, adjusts parameters of a random forest machine learning model to enable the parameters to reach the precision requirement, and performs multi-time phase PM2.5 concentration estimation under different weather conditions through a finally obtained calculation model. The PM2.5 concentration remote sensing estimation method based on the geostationary orbit satellite can effectively carry out multi-temporal PM2.5 concentration remote sensing estimation, makes up the deficiency of the traditional method in time continuity, and provides more accurate data support for developing atmospheric pollution prevention and control.
Drawings
FIG. 1 is a flow chart of a PM2.5 concentration remote sensing estimation method based on a geostationary orbit satellite according to the invention;
FIG. 2 is a flow chart of a method for completing sample learning and data testing based on a random forest machine learning model according to the present invention;
FIG. 3 is a flow chart of a method for remote sensing estimation of PM2.5 concentration in an exemplary embodiment;
FIG. 4 is a distribution diagram of atmospheric monitoring sites for AERONET and PM2.5 concentration;
FIG. 5 is a graph of the accuracy evaluation of the Himawari-8 satellite aerosol optical thickness product in a specific example;
FIG. 6 is a diagram of accuracy evaluation of remote sensing estimation of PM2.5 concentration under different weather conditions in the specific embodiment;
FIG. 7 is a distribution diagram of a Himawari-8 satellite true color image and PM2.5 concentration remote sensing estimation in an embodiment;
FIG. 8 is a diagram illustrating accuracy evaluation of remote sensing estimation results of PM2.5 concentration in the specific embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The invention provides a PM2.5 concentration estimation method based on a geostationary orbit satellite, which is used for carrying out remote sensing estimation on the PM2.5 concentration under different weather conditions according to the correlation between aerosol optical thickness data and the PM2.5 concentration, and can quickly and accurately obtain the estimation result of the PM2.5 concentration.
As shown in fig. 1, the estimation method of the present invention includes the steps of:
Step S100: acquiring the aerosol optical thickness data observed by the geostationary orbit satellite.
Further, the step of acquiring the aerosol optical thickness data observed by the geostationary orbit satellite comprises the following steps:
extracting the aerosol optical thickness data observed by the geostationary orbit satellite in a preset band, and batch-preprocessing the Himawari-8 L2 aerosol optical thickness products using the remote sensing visualization language IDL; and projecting the original images, which have not undergone projection transformation, to the WGS-84 coordinate system, and acquiring the aerosol optical thickness data of the different bands at preset time intervals. Specifically, the preset time interval is one hour: 550 nm aerosol optical thickness data with hourly temporal resolution and 5 km spatial resolution are obtained, and the data from 11:00 to 16:00 each day are counted as valid data to ensure data validity.
Step S200: calculating the aerosol optical thickness at the corresponding band of the ground station for the corresponding time and place, and performing precision verification on the aerosol optical thickness data observed by the satellite. In a specific embodiment, the satellite aerosol optical thickness is verified against AERONET data. AERONET is a ground-based aerosol remote sensing observation network jointly established by NASA and LOA-PHOTONS (CNRS); the aerosol optical thickness it measures can be used as the true value for evaluating the precision of satellite-measured results. Specifically, cloud-screened and quality-assured AERONET Level-2.0 data are selected for precision verification of the aerosol optical thickness in a typical region.
Furthermore, according to the aerosol optical thickness data sets at different wave bands provided by the ground observation station, the interpolation of the aerosol optical thickness at 550nm is completed by a quadratic polynomial method.
$$\ln\tau_\alpha = a_0 + a_1\ln\lambda + a_2(\ln\lambda)^2 \tag{1}$$
where $\lambda$ is the band value; $\tau_\alpha$ is the aerosol optical thickness at the channel of band $\lambda$; and $a_0$, $a_1$, $a_2$ are unknown coefficients, obtained by substituting the ground-observed aerosol optical thickness at the different band values into formula (1).
Further, the different band values are selected as 440 nm, 500 nm and 675 nm: the aerosol optical thickness at 440 nm, 500 nm and 675 nm is taken from the ground observation data and substituted into formula (1) to calculate $a_0$, $a_1$ and $a_2$; the predetermined band is 550 nm, and the aerosol optical thickness at 550 nm is then calculated from formula (1).
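The interpolation in formula (1) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names `fit_log_quadratic` and `aod_at` and the three sample optical thickness values are hypothetical. With exactly three bands, the three coefficients are determined exactly, so the sketch solves for them with divided differences in log-log space.

```python
import math

def fit_log_quadratic(wavelengths_nm, aod_values):
    """Return (a0, a1, a2) so that ln(tau) = a0 + a1*ln(lam) + a2*ln(lam)**2
    passes exactly through three (wavelength, optical thickness) points."""
    x0, x1, x2 = (math.log(w) for w in wavelengths_nm)
    y0, y1, y2 = (math.log(t) for t in aod_values)
    # Divided differences give the Newton form of the interpolating quadratic.
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    a2 = (d12 - d01) / (x2 - x0)
    a1 = d01 - a2 * (x0 + x1)
    a0 = y0 - a1 * x0 - a2 * x0 ** 2
    return a0, a1, a2

def aod_at(wavelength_nm, coeffs):
    """Evaluate formula (1) at one wavelength and undo the logarithm."""
    a0, a1, a2 = coeffs
    ln_lam = math.log(wavelength_nm)
    return math.exp(a0 + a1 * ln_lam + a2 * ln_lam ** 2)

# Hypothetical ground-observed optical thickness at 440, 500 and 675 nm.
bands = [440.0, 500.0, 675.0]
observed = [0.60, 0.52, 0.36]
coeffs = fit_log_quadratic(bands, observed)
aod_550 = aod_at(550.0, coeffs)
print(f"interpolated optical thickness at 550 nm: {aod_550:.4f}")
```

Because the fit is done on logarithms, the unit chosen for the wavelength changes the coefficients but not the interpolated value.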
Further, precision evaluation coefficients are selected for the precision verification, wherein the precision evaluation coefficients comprise a correlation coefficient R (measuring the linear relation between two variables), a root mean square error RMSE (measuring the deviation between the observed value and the true value) and a slope B (reflecting the correlation of the mean values of the variables); the aerosol optical thickness data whose precision evaluation coefficients reach the preset values are selected as the aerosol optical thickness data meeting the precision requirement. The preset values may be selected as R > 0.5; RMSE < 0.3; B > 0.5.
Specifically, the correlation coefficient R, the root mean square error RMSE, and the slope B are calculated by the following equations:
[Equation images for R, RMSE and B are not reproduced in this text.]
In the formulas, $X_i$ and $Y_i$ are the i-th ground-observed and satellite-observed aerosol optical thickness values in the data set, respectively; $\bar{X}$ and $\bar{Y}$ are the mean ground-observed and mean satellite-observed aerosol optical thickness, respectively; n is the number of data points in the data set; and A is the intercept of the fitted line.
Himawari-8 meteorological satellite aerosol optical thickness data meeting the precision verification requirement are obtained according to the above steps.
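The precision check can be illustrated with a short Python sketch. The original formulas survive only as image placeholders, so the definitions below are assumptions: R is taken as the Pearson correlation coefficient, RMSE as the root of the mean squared difference between satellite and ground values, and B and A as the slope and intercept of an ordinary least-squares line fitted to satellite values against ground values. The function names and sample values are hypothetical; the thresholds are the preset values stated in the text.

```python
import math

def accuracy_metrics(ground, satellite):
    """Compute R, RMSE, slope B and intercept A for paired observations.
    Assumed definitions: Pearson R, standard RMSE, least-squares line."""
    n = len(ground)
    mean_x = sum(ground) / n
    mean_y = sum(satellite) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ground, satellite))
    sxx = sum((x - mean_x) ** 2 for x in ground)
    syy = sum((y - mean_y) ** 2 for y in satellite)
    r = sxy / math.sqrt(sxx * syy)
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(ground, satellite)) / n)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"R": r, "RMSE": rmse, "B": slope, "A": intercept}

def meets_requirement(metrics):
    """Apply the preset values from the text: R > 0.5, RMSE < 0.3, B > 0.5."""
    return metrics["R"] > 0.5 and metrics["RMSE"] < 0.3 and metrics["B"] > 0.5

# Hypothetical paired optical thickness values (ground truth vs. satellite).
ground_aod = [0.10, 0.30, 0.50, 0.80, 1.20]
satellite_aod = [0.15, 0.28, 0.45, 0.70, 1.00]
metrics = accuracy_metrics(ground_aod, satellite_aod)
print(metrics, meets_requirement(metrics))
```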
Step S300, establishing a data set corresponding to the multi-temporal PM2.5 concentration and the aerosol optical thickness data under different weather conditions, and dividing the data set into a training sample data set and a test sample data set.
Further, according to the PM2.5 concentration data of ground atmospheric monitoring stations and the longitude and latitude of each station, concentration values x for which the PM2.5 concentration measured by the station falls in the excellent, good, pollution and heavy pollution grades, together with the aerosol optical thickness values y at the corresponding time and place, are selected as a data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where n is a natural number greater than 1. The PM2.5 concentration of the excellent grade is less than 35 μg/m³, that of the good grade is 35-75 μg/m³, that of the pollution grade is 75-150 μg/m³, and that of the heavy pollution grade is greater than 150 μg/m³. The data set is divided into a training sample data set and a test sample data set according to a predetermined proportion; specifically, the training sample data set and the test sample data set of the invention are established at a ratio of 9:1.
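The grading and the 9:1 split can be sketched as follows. This is an illustrative sketch only: the function names, the random shuffle before splitting, and the sample values are assumptions, and the text does not say which grade a value of exactly 75 or 150 μg/m³ belongs to, so the boundary handling below is one possible choice.

```python
import random

def pm25_grade(concentration):
    """Map a PM2.5 concentration in ug/m^3 to the four grades in the text.
    Assumption: values of exactly 75 and 150 fall in the lower grade."""
    if concentration < 35:
        return "excellent"
    if concentration <= 75:
        return "good"
    if concentration <= 150:
        return "pollution"
    return "heavy pollution"

def split_9_to_1(samples, seed=0):
    """Shuffle (PM2.5, optical thickness) pairs and split them 9:1."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * 0.1))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set T of (PM2.5 concentration x, optical thickness y) pairs.
data_set = [(10.0 * i, 0.05 * i) for i in range(1, 21)]
train_set, test_set = split_9_to_1(data_set)
grades = [pm25_grade(x) for x, _ in data_set]
print(len(train_set), len(test_set), sorted(set(grades)))
```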
Step S400: completing sample learning and data testing based on the random forest machine learning model to obtain the remote sensing estimation of PM2.5 concentration.
The random forest machine learning algorithm is first implemented and parameterized in Python. A decision tree is built for each training set. When a node looks for a feature to split on, it does not search all features for the one that maximizes the criterion (such as information gain); instead, a subset of features is drawn at random, the best split is found among that subset, and it is applied at the node. In effect, both the samples and the features are sampled (if the training data are viewed as a matrix, as is common in practice, this is row and column sampling), which helps avoid overfitting; the final result is obtained by voting for classification or by averaging for regression, giving a good estimate.
The input is a training sample set $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$;
The output is a strong learner f(x).
Specifically, the method comprises the following steps as shown in fig. 2:
Step S410: selecting a training sample data set $S_i$, with the training sample data set and the test sample data set in a 9:1 ratio.
Step S420: obtaining a set of trees from the training sample data set. Specifically, $S_i$ is used to generate an unpruned tree $h_i$: $M_{try}$ features are randomly selected from the $d$ features and, at each node, the optimal feature is selected from the $M_{try}$ features according to the Gini index and the node is split on it, until the tree grows to its maximum size. The Gini index is a criterion for choosing the splitting feature; like information entropy, a larger value indicates more disordered categories, so it can be used to assess whether the fitted value computed from the samples becomes more uncertain after splitting on a feature. The set of trees $\{h_i,\ i = 1, 2, \ldots, N_{tree}\}$ is obtained, where $N_{tree}$ is the number of trees;
for a sample $x_t$ to be measured, the tree result $h_i(x_t)$ is output, where $x_t$ represents the t-th corresponding PM2.5 concentration value;
Step S430: outputting the strong learner f(x):
[Equation image for f(x) is not reproduced in this text.]
and carrying out preliminary parameter setting based on this algorithm to realize the remote sensing estimation of PM2.5 concentration.
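A random forest regression of this kind might look as follows in Python. The text names Python but no library, so the use of scikit-learn's `RandomForestRegressor` is an assumption, as are the synthetic data and the choice of aerosol optical thickness as the single input feature. Note that scikit-learn splits regression trees on a squared-error criterion by default rather than on the Gini index described above. The sketch also checks that the forest's output equals the average of the individual trees, which is the regression case of the "averaging" mentioned in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: optical thickness as feature, PM2.5 as target.
rng = np.random.default_rng(0)
aod = rng.uniform(0.05, 2.0, size=400)
pm25 = 60.0 * aod + rng.normal(0.0, 10.0, size=400)

# 9:1 split into training and test samples (data are already random).
X = aod.reshape(-1, 1)
n_train = int(0.9 * len(X))
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = pm25[:n_train], pm25[n_train:]

# Unpruned trees grown on bootstrap samples; preliminary parameter values.
forest = RandomForestRegressor(n_estimators=100, max_features=1.0, random_state=0)
forest.fit(X_train, y_train)

# The forest estimate is the mean of the per-tree outputs h_i(x_t).
tree_predictions = np.stack([tree.predict(X_test) for tree in forest.estimators_])
manual_average = tree_predictions.mean(axis=0)
estimate = forest.predict(X_test)
print("test RMSE:", float(np.sqrt(np.mean((estimate - y_test) ** 2))))
```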
Step S500: performing precision verification on the PM2.5 concentration obtained by the data test, evaluating the estimation precision, and obtaining a precision verification result.
Specifically, ten-fold cross-validation can be used for precision verification. The data set formed by the aerosol optical thickness data and the corresponding PM2.5 concentration data is divided into 10 sub-data sets, giving a 9:1 ratio of training data to validation data; in turn, a different group of 9 sub-data sets is selected and input to the strong learner for training, the aerosol optical thickness data in the remaining sub-data set are input to the trained strong learner to obtain the corresponding PM2.5 concentration data, and these are compared with the measured PM2.5 concentration data to obtain the precision verification result.
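The ten-fold procedure can be sketched independently of the learner. In the sketch below the fold construction and the helper names are hypothetical, and a simple least-squares line stands in for the random forest so that the example stays dependency-free; any model exposing a fit step and a predict step could be passed in instead.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(features, targets, fit, predict, k=10):
    """Train on k-1 folds, predict the held-out fold, repeat for every fold."""
    predictions = [None] * len(features)
    for held_out in k_fold_indices(len(features), k):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        model = fit([features[i] for i in train_idx],
                    [targets[i] for i in train_idx])
        for i in held_out:
            predictions[i] = predict(model, features[i])
    return predictions

def fit_line(xs, ys):
    """Stand-in learner: ordinary least-squares line (not a random forest)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

# Hypothetical optical thickness values and noise-free PM2.5 concentrations.
aod = [0.1 * i for i in range(1, 51)]
pm25 = [50.0 * a + 5.0 for a in aod]
cv_predictions = cross_validate(aod, pm25, fit_line, predict_line)
```

Comparing `cv_predictions` with the measured values, for example through R and RMSE, then gives the precision verification result for data the model never saw during training.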
Step S600: adjusting the parameters of the random forest machine learning model according to the precision verification result, and repeating the steps of sample learning, data testing and precision verification until the PM2.5 concentration obtained by data testing reaches the preset precision requirement, giving the final strong learner. The parameters comprise the number of subtrees n_estimators, the maximum number of features considered when splitting a node max_features, the number of parallel jobs n_jobs, and/or the minimum leaf sample size min_sample_leaf. These parameters interact with one another and are adjusted sensibly within the available time and memory budget, so that overfitting is prevented and PM2.5 concentration estimation is completed quickly and efficiently.
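One way to automate this adjust-and-verify loop is a grid search with ten-fold cross-validation. The sketch below assumes scikit-learn, which the text does not name; in that library the leaf-size parameter is spelled `min_samples_leaf`. The candidate values, the synthetic data and the second input feature are hypothetical and serve only to make `max_features` meaningful.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: column 0 plays the role of optical thickness,
# column 1 is a hypothetical extra feature with no influence on the target.
rng = np.random.default_rng(1)
X = rng.uniform(0.05, 2.0, size=(200, 2))
y = 60.0 * X[:, 0] + rng.normal(0.0, 10.0, size=200)

# Candidate values for the parameters named in the text (illustrative only).
param_grid = {
    "n_estimators": [20, 50],
    "max_features": [1, 2],
    "min_samples_leaf": [1, 5],
}

# Every combination is scored by ten-fold cross-validated RMSE.
search = GridSearchCV(
    RandomForestRegressor(random_state=0, n_jobs=1),
    param_grid,
    cv=10,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
best_forest = search.best_estimator_
print(search.best_params_, -search.best_score_)
```

A grid search is exhaustive, so the number of candidate values per parameter should be kept small when time or memory is limited; `n_jobs` changes only how many processor cores are used, not the fitted model.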
Step S700: estimating the PM2.5 concentration according to the adjusted random forest machine learning model.
The invention is further illustrated below in a specific example, following the above procedure.
Taking the Beijing-Tianjin-Hebei (Jing-Jin-Ji) region as an example, the specific process is shown in FIG. 3. A training sample data set and a test sample data set are constructed from the PM2.5 concentrations of 81 atmospheric monitoring stations in the study area (shown in FIG. 4), from 11:00 to 16:00 on each day from July 15, 2015 to December 31, 2017. In the step of verifying the precision of the Himawari-8 aerosol optical thickness, the Beijing station and the Xianghe station (shown in FIG. 4) are selected as typical stations representing urban and rural areas, and precision verification is performed against AERONET Level-2.0 data. The verification results are shown in FIG. 5: both stations give high correlation coefficients R (0.878, 0.860) and low root mean square errors RMSE (0.185, 0.175), with slopes of 0.667 and 0.742, showing that the aerosol optical thickness data obtained from Himawari-8 are reliable and meet the requirements of the subsequent modeling. Regression estimation and verification of PM2.5 concentration under different weather conditions in the Beijing-Tianjin-Hebei region are then performed with the random forest machine learning algorithm. The ten-fold cross-validation results are shown in FIG. 6: the correlation coefficients R are all greater than 0.6, R reaches 0.863 when the PM2.5 concentration is greater than 150 μg/m³, and the root mean square error under each weather condition is within the allowable error range, which demonstrates the feasibility of the invention.
Further, a case study for a specific day (November 2, 2017) is selected. Following the method, Himawari-8 satellite true color images at six consecutive times in the study area (11:00 to 16:00 that day) and the corresponding remote sensing distribution maps of estimated PM2.5 concentration (FIG. 7) are obtained, reflecting the change in PM2.5 concentration over consecutive times. The precision of the estimation results is further verified (FIG. 8): the precision of all parts is fairly consistent and the correlation reaches 0.86, which demonstrates the feasibility of the method.
In conclusion, the invention provides a PM2.5 concentration estimation method based on a geostationary orbit satellite, which adopts aerosol optical thickness data meeting the precision requirement and corresponding PM2.5 concentration data to form a data set, completes sample learning and data testing based on a random forest machine learning method, performs precision verification on a test result, adjusts the parameters of a random forest machine learning model to enable the parameters to reach the precision requirement, and performs PM2.5 concentration estimation under different weather conditions through a finally obtained calculation model. The PM2.5 concentration remote sensing estimation method based on the geostationary orbit satellite can effectively carry out multi-temporal PM2.5 concentration remote sensing estimation, makes up the deficiency of the traditional method in time continuity, and provides more accurate data support for developing atmospheric pollution prevention and control.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (7)

1. A PM2.5 concentration estimation method based on a geostationary orbit satellite, characterized by comprising the following steps:
acquiring optical thickness data of the aerosol observed by the geostationary orbit satellite;
calculating the optical thickness of the aerosol at the corresponding wave band of the corresponding ground station, and performing precision verification on the optical thickness data of the aerosol observed by the satellite;
establishing a data set of PM2.5 concentrations under different weather conditions and the corresponding satellite-observed aerosol optical thickness of the preset waveband;
completing sample learning and data testing based on a random forest machine learning model to obtain remote sensing estimation of PM2.5 concentration;
performing precision verification on the PM2.5 concentration obtained by the data test to obtain a precision verification result;
adjusting parameters of a random forest machine learning model according to the precision verification result, and repeating the steps of sample learning, data testing and precision verification until the concentration of PM2.5 obtained by data testing reaches the preset precision requirement;
estimating the PM2.5 concentration according to the adjusted random forest machine learning model;
the step of calculating the optical thickness of the aerosol corresponding to the ground station in the corresponding waveband and carrying out precision verification on the optical thickness data of the aerosol observed by the satellite comprises the following steps:
acquiring ground observation data with one hour interval;
performing quadratic polynomial interpolation calculation on the optical thickness data of the aerosol observed on the ground according to different wave bands, and then calculating the optical thickness data of the aerosol of a preset wave band corresponding to the ground according to the obtained quadratic polynomial interpolation formula; the quadratic polynomial interpolation formula is as follows:
$$\ln \tau_\alpha = a_0 + a_1 \ln \lambda + a_2 (\ln \lambda)^2 \tag{1}$$
wherein $\lambda$ is a band value and $\tau_\alpha$ is the aerosol optical thickness value at the $\lambda$ band channel; $a_0$, $a_1$, $a_2$ are unknown coefficients, obtained by substituting the ground-observed aerosol optical thickness at different band values into formula (1);
selecting an accuracy evaluation coefficient, and performing accuracy verification on the satellite observation aerosol optical thickness data by taking the calculated aerosol optical thickness data of the predetermined wave band corresponding to the ground, namely the ground observation aerosol optical thickness, as a true value;
the accuracy evaluation coefficient comprises a correlation coefficient R, a root mean square error RMSE and a slope B, and the aerosol optical thickness data with the accuracy evaluation coefficient reaching a preset value is selected as the aerosol optical thickness data meeting the accuracy requirement;
wherein the correlation coefficient R, the root mean square error RMSE and the slope B are respectively calculated by the following formula:
(2) [formula image: correlation coefficient R]
(3) [formula image: root mean square error RMSE]
(4) [formula image: slope B]
in the formulas, $X_i$ and $Y_i$ are respectively the i-th ground-observed and satellite-observed aerosol optical thickness values in the data set; $\bar{X}$ and $\bar{Y}$ are respectively the mean ground-observed aerosol optical thickness and the mean satellite-observed aerosol optical thickness; n is the number of data points in the data set; a is the intercept of the fitted line;
and when the accuracy evaluation coefficient reaches the following preset value, the aerosol optical thickness data meets the accuracy requirement:
wherein R > 0.5; RMSE < 0.3; B > 0.5.
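The accuracy check at the end of claim 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: it uses the textbook Pearson correlation coefficient, the RMSE of satellite minus ground values, and an ordinary least-squares fit Y = B·X + a, which may differ in detail from the patent's own formulas (2) to (4). The function names `accuracy_metrics` and `meets_requirement` are hypothetical.

```python
import math

def accuracy_metrics(ground, satellite):
    """Return (R, RMSE, B, a) for paired ground (X) and satellite (Y) AOD values.

    Assumed definitions: Pearson R, RMSE of (Y - X), least-squares line Y = B*X + a.
    """
    n = len(ground)
    if n != len(satellite) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    mean_x = sum(ground) / n
    mean_y = sum(satellite) / n
    sxx = sum((x - mean_x) ** 2 for x in ground)
    syy = sum((y - mean_y) ** 2 for y in satellite)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ground, satellite))
    r = sxy / math.sqrt(sxx * syy)               # correlation coefficient R
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(ground, satellite)) / n)
    slope = sxy / sxx                            # slope B of the fitted line
    intercept = mean_y - slope * mean_x          # intercept a of the fitted line
    return r, rmse, slope, intercept

def meets_requirement(r, rmse, slope):
    """Preset values from claim 1: R > 0.5, RMSE < 0.3, B > 0.5."""
    return r > 0.5 and rmse < 0.3 and slope > 0.5

# Made-up example values, not data from the patent.
ground_aod = [0.1, 0.2, 0.3, 0.4]
satellite_aod = [0.15, 0.25, 0.35, 0.45]
r, rmse, slope, intercept = accuracy_metrics(ground_aod, satellite_aod)
print(meets_requirement(r, rmse, slope))
```

With these definitions, the station results reported in the description (R of 0.878 and 0.860, RMSE of 0.185 and 0.175, slopes of 0.667 and 0.742) would all pass `meets_requirement`.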
2. The method of claim 1, wherein the step of acquiring geostationary orbit satellite observed aerosol optical thickness data comprises:
extracting the optical thickness data of the aerosol observed by the geostationary orbit satellite with the preset wave band, projecting the original image which is not subjected to projection transformation to a WGS-84 coordinate system, and acquiring the optical thickness data of the aerosol with the preset wave band according to a preset time interval.
3. The method of claim 2, wherein the different wavelength band values are selected as 440 nm, 500 nm and 675 nm, and the optical thickness of the aerosol at 440 nm, 500 nm and 675 nm is measured from ground observation data and substituted into formula (1) to calculate $a_0$, $a_1$, $a_2$; the predetermined wavelength band is 550 nm, and the optical thickness of the aerosol at 550 nm is then calculated according to formula (1).
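The interpolation of claim 3 can be sketched as follows. With three bands and three unknown coefficients, formula (1) reduces to a 3x3 linear system in ln λ. The helper names (`solve3`, `fit_coefficients`, `aod_at`) and the example optical thickness values are hypothetical, and wavelengths are assumed to be given in nanometres throughout.

```python
import math

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

def fit_coefficients(wavelengths_nm, aod):
    """Return (a0, a1, a2) of ln(tau) = a0 + a1*ln(lambda) + a2*ln(lambda)**2."""
    rows = [[1.0, math.log(w), math.log(w) ** 2] for w in wavelengths_nm]
    return solve3(rows, [math.log(t) for t in aod])

def aod_at(wavelength_nm, coeffs):
    """Evaluate formula (1) at the requested wavelength."""
    a0, a1, a2 = coeffs
    x = math.log(wavelength_nm)
    return math.exp(a0 + a1 * x + a2 * x * x)

# Made-up ground observations at 440, 500 and 675 nm (not data from the patent).
coeffs = fit_coefficients([440.0, 500.0, 675.0], [0.52, 0.45, 0.31])
print(aod_at(550.0, coeffs))
```

Because the fit passes exactly through the three observed bands, the 550 nm value is a pure interpolation between the 500 nm and 675 nm observations.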
4. The method of claim 1, wherein the step of establishing a data set of PM2.5 concentrations at different weather conditions corresponding to the geostationary orbit satellite observed aerosol optical thickness data comprises:
according to PM2.5 concentration data of ground atmosphere monitoring stations, selecting concentration values x for which the PM2.5 concentration measured at the station falls into the excellent, good, pollution and heavy pollution grades respectively, together with the aerosol optical thickness values y at the corresponding time and place, as a data set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where n is a natural number greater than 1; wherein the excellent grade corresponds to a PM2.5 concentration of less than 35 μg/m³, the good grade to 35-75 μg/m³, the pollution grade to 75-150 μg/m³, and the heavy pollution grade to more than 150 μg/m³;
and dividing the data set into a training sample data set and a test sample data set according to a preset proportion.
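The grading and splitting of claim 4 can be sketched as follows. The claim does not say which grade the boundary values 75 and 150 μg/m³ belong to, so this sketch assigns them to the lower grade as an assumption. The function names, the random shuffle and the default 9:1 proportion (taken from claim 5) are illustrative choices.

```python
import random

def pm25_grade(concentration):
    """Map a PM2.5 concentration in ug/m3 to the four grades of claim 4.

    Assumption: the boundary values 75 and 150 fall into the lower grade.
    """
    if concentration < 35:
        return "excellent"
    if concentration <= 75:
        return "good"
    if concentration <= 150:
        return "pollution"
    return "heavy pollution"

def split_dataset(pairs, train_fraction=0.9, seed=0):
    """Shuffle (x, y) pairs and split them into training and test sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Made-up (PM2.5 concentration, AOD) pairs, not data from the patent.
dataset = [(20.0, 0.15), (60.0, 0.35), (110.0, 0.70), (180.0, 1.10)] * 5
by_grade = {}
for x, y in dataset:
    by_grade.setdefault(pm25_grade(x), []).append((x, y))
train, test = split_dataset(dataset)
print({grade: len(items) for grade, items in by_grade.items()}, len(train), len(test))
```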
5. The method as claimed in claim 4, wherein the step of completing sample learning and data testing based on the random forest machine learning model to obtain the remote sensing estimation of PM2.5 concentration comprises:
selecting a training sample data set $S_i$ such that the ratio of the training sample data set to the test sample set is 9:1;
using $S_i$ to generate a tree $h_i$ without pruning: randomly selecting $M_{try}$ features from the d features, selecting the optimal feature from the $M_{try}$ features at each node according to the Gini index, and splitting until the tree grows to its maximum, thereby obtaining the set of trees $\{h_i, i = 1, 2, \ldots, N_{tree}\}$, where $N_{tree}$ is the number of trees;
for a sample to be measured $x_t$, outputting the tree result $h_i(x_t)$, where $x_t$ represents the concentration value corresponding to the t-th PM2.5;
outputting the strong learner f(x):
[formula image: strong learner f(x)]
and carrying out preliminary parameter setting based on the algorithm, and realizing the remote sensing estimation process of the PM2.5 concentration.
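The tree-growing and ensemble steps of claim 5 can be illustrated with a small pure-Python random forest. This is a simplified sketch, not the patented implementation. It swaps the Gini index named in the claim for a sum-of-squared-errors split criterion, which is the usual choice for regression targets, and it assumes the strong learner is the average of the tree outputs. The feature layout (a single aerosol optical thickness value predicting PM2.5) and all function names are hypothetical.

```python
import random

def sse(rows):
    """Sum of squared errors of the targets around their mean."""
    targets = [t for _, t in rows]
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets)

def build_tree(rows, m_try, rng):
    """Grow an unpruned regression tree from (features, target) rows."""
    targets = [t for _, t in rows]
    mean = sum(targets) / len(targets)
    if len(rows) <= 1 or len(set(targets)) == 1:
        return mean                                    # leaf: mean target
    best = None
    for f in rng.sample(range(len(rows[0][0])), m_try):  # M_try random features
        for threshold in sorted({x[f] for x, _ in rows})[1:]:
            left = [r for r in rows if r[0][f] < threshold]
            right = [r for r in rows if r[0][f] >= threshold]
            score = sse(left) + sse(right)             # assumed split criterion
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:                                   # chosen features are constant
        return mean
    _, f, threshold, left, right = best
    return (f, threshold, build_tree(left, m_try, rng), build_tree(right, m_try, rng))

def predict_tree(node, features):
    """Walk from the root to a leaf and return the leaf value h_i(x_t)."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if features[f] < threshold else right
    return node

def fit_forest(rows, n_tree=25, m_try=1, seed=0):
    """Train n_tree trees, each on a bootstrap sample S_i of the rows."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(rows) for _ in rows], m_try, rng)
            for _ in range(n_tree)]

def predict_forest(forest, features):
    """Assumed strong learner f(x): the average of all tree outputs."""
    return sum(predict_tree(tree, features) for tree in forest) / len(forest)

# Made-up training rows: ([AOD], PM2.5 in ug/m3), not data from the patent.
rows = [([i / 10], 50 + 100 * (i / 10)) for i in range(20)]
forest = fit_forest(rows)
print(predict_forest(forest, [1.0]))
```

Each tree sees a different bootstrap sample, so single trees disagree on a given input, and averaging their outputs smooths the estimate.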
6. The method according to claim 5, wherein the step of performing precision verification on the PM2.5 concentration obtained by the data test comprises performing precision verification by using a ten-fold cross validation method.
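Ten-fold cross-validation, as named in claim 6, partitions the samples into ten folds and tests on each fold once while training on the other nine. A minimal index-based sketch is shown below. The function name `k_fold_indices` is hypothetical, and the sketch assumes the samples have already been shuffled.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Example: 25 samples give five folds of 3 test samples and five folds of 2.
for train_idx, test_idx in k_fold_indices(25):
    print(len(train_idx), len(test_idx))
```

The accuracy coefficients would then be computed on the predictions collected from all ten test folds.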
7. The method according to claim 6, wherein parameters of the random forest machine learning model are adjusted according to the precision verification result, and the adjusted parameters comprise the number of learner subtrees n_estimators, the maximum number of features considered when splitting a node max_features, the number of parallel jobs n_jobs and/or the minimum leaf sample size min_sample_leaf.
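The adjust-and-repeat loop of claims 1 and 7 can be sketched as a simple grid search that stops once the verification score reaches a preset requirement. Everything here is an assumption for illustration: the grid values, the `tune` function, and the `evaluate` callback, which stands in for training the model and running the cross-validation. The parameter names resemble those of scikit-learn's random forest estimators, where the leaf-size parameter is spelled `min_samples_leaf`; `n_jobs` is omitted from the grid because it controls parallelism rather than accuracy.

```python
from itertools import product

# Hypothetical candidate values; the patent does not list any.
PARAM_GRID = {
    "n_estimators": [100, 300, 500],
    "max_features": [1, 2, 3],
    "min_sample_leaf": [1, 5, 10],
}

def tune(evaluate, grid, required_score):
    """Try parameter combinations until evaluate(params) >= required_score.

    Returns the best (params, score) seen; evaluate is a caller-supplied
    function that trains and verifies a model for one parameter set.
    """
    best_params, best_score = None, float("-inf")
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
        if score >= required_score:
            break                      # preset precision requirement reached
    return best_params, best_score

# Toy stand-in for "train + ten-fold cross-validation"; not a real model.
def toy_evaluate(params):
    return 0.5 + params["n_estimators"] / 1000 - params["min_sample_leaf"] / 100

print(tune(toy_evaluate, PARAM_GRID, required_score=0.9))
```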
CN201910714837.5A 2019-08-02 2019-08-02 PM2.5 concentration estimation method based on geostationary orbit satellite Active CN110595968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910714837.5A CN110595968B (en) 2019-08-02 2019-08-02 PM2.5 concentration estimation method based on geostationary orbit satellite

Publications (2)

Publication Number Publication Date
CN110595968A CN110595968A (en) 2019-12-20
CN110595968B CN110595968B (en) 2021-05-18

Family

ID=68853401

Country Status (1)

Country Link
CN (1) CN110595968B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No. a 20, Datun Road, Chaoyang District, Beijing 100101

Patentee after: Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences

Patentee after: Langfang Spatial Information Technology R&D Service Center

Patentee after: Zhongke Xingtong (Langfang) Information Technology Co.,Ltd.

Address before: No. a 20, Datun Road, Chaoyang District, Beijing 100101

Patentee before: Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences

Patentee before: Research Institute of Space Information (Langfang) of China Science

Patentee before: Zhongke Xingtong (Langfang) Information Technology Co.,Ltd.