CN110046771A - A kind of PM2.5 concentration prediction method and apparatus - Google Patents
- Publication number
- CN110046771A, CN201910340382.5A
- Authority
- CN
- China
- Prior art keywords
- observation point
- gtwr
- model
- concentration
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The present invention relates to a PM2.5 concentration prediction method and apparatus, and belongs to the technical field of PM2.5 concentration value prediction. The method comprises the following steps: obtaining PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data; obtaining two-dimensional position information, elevation information and sampling time information of the observation points; determining the parameters of the 4D-GTWR model according to the above data, thereby determining the 4D-GTWR model, $y_i = \beta_{i0}(u_i,v_i,z_i,t_i) + \sum_{k=1}^{p} \beta_{ik}(u_i,v_i,z_i,t_i)\,x_{ik} + \xi_i$ ($i = 1, 2, \dots, n$); and predicting the PM2.5 concentration according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model. The 4D-GTWR model provides an effective method for predicting PM2.5 concentration and makes the prediction more accurate; it also addresses the non-stationarity of four-dimensional space-time and verifies that elevation information has an important influence on changes in PM2.5 concentration.
Description
Technical Field
The invention relates to a PM2.5 concentration prediction method and device, and belongs to the technical field of PM2.5 concentration value prediction.
Background
Fine particulate matter (PM2.5) poses a serious threat to public health and also seriously affects urban traffic and the daily life of citizens. As an atmospheric pollutant that directly affects human life and health, PM2.5 has attracted wide attention from scholars, who have carried out PM2.5-related inversion research in order to predict PM2.5 concentration.
There are many methods for predicting PM2.5 in the prior art. For example, Chinese patent application publication No. CN106056210A discloses a PM2.5 concentration value prediction method based on a hybrid neural network, which uses historical PM2.5 concentration values, historical data of related indexes, meteorological historical data and PM2.5 component analysis data, and simulates the variation pattern of local PM2.5 concentration values through neural network segmentation to predict the PM2.5 concentration value.
As another example, Zhaoyangyang et al. combined collaborative training with the space-time geographically weighted regression model (GTWR) and proposed a collaborative space-time geographically weighted regression method for estimating PM2.5 concentration, published in "Mapping Science", 2016, 41(12): 172-. However, the prediction accuracy of the PM2.5 concentration prediction methods in the prior art still needs to be improved.
Disclosure of Invention
The invention aims to provide a PM2.5 concentration prediction method that solves the low accuracy of existing prediction methods, and also to provide a PM2.5 concentration prediction device that solves the low accuracy of existing prediction devices.
In order to achieve the above object, the present invention provides a PM2.5 concentration prediction method, including the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining the parameters of the 4D-GTWR model according to the data so as to determine the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
$$y_i = \beta_{i0}(u_i,v_i,z_i,t_i) + \sum_{k=1}^{p} \beta_{ik}(u_i,v_i,z_i,t_i)\,x_{ik} + \xi_i, \quad i = 1, 2, \dots, n$$
wherein $x_{ik}$ is the k-th of the p independent variables of observation point i, the independent variables being meteorological data; $y_i$ is the dependent variable of observation point i, namely its PM2.5 concentration data; the parameters of the 4D-GTWR model are $\beta_{ik}(u_i,v_i,z_i,t_i)$, $\beta_{i0}(u_i,v_i,z_i,t_i)$ and $\xi_i$; $\beta_{ik}(u_i,v_i,z_i,t_i)$ is the regression coefficient of the k-th independent variable of observation point i and is related to the spatial position of observation point i; $\beta_{i0}(u_i,v_i,z_i,t_i)$ is the constant term of the independent variable regression coefficients of observation point i; $\xi_i$ is the random error of observation point i; $(u_i,v_i,z_i,t_i)$ is the four-dimensional space-time coordinate of observation point i, where $(u_i,v_i)$ represents the two-dimensional spatial coordinates, $z_i$ the elevation coordinate and $t_i$ the time coordinate; n is the number of observation points;
and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model.
In addition, the invention also provides a PM2.5 concentration prediction device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the PM2.5 concentration prediction method when executing the computer program.
The beneficial effects are that: elevation information is introduced into the space-time geographically weighted regression model to give a four-dimensional space-time geographically weighted regression model (4D-GTWR). Measuring distance with a four-dimensional Euclidean distance reflects the four-dimensional space-time distribution characteristics better and gives a more realistic spatial distance, so that actual conditions can be analyzed effectively. The 4D-GTWR model thus provides an effective method for predicting PM2.5 concentration and makes the prediction more accurate; it also addresses the non-stationarity of four-dimensional space-time and verifies that elevation information has an important influence on PM2.5 concentration change.
Further, in the PM2.5 concentration prediction method and apparatus, the independent variable regression coefficients are calculated as follows: a least squares estimate of the independent variable regression coefficients is solved by establishing an objective function for observation point i, the objective function being:
$$\min \sum_{j=1}^{n} w_{ij}^{ST} \left[ y_j - \beta_{i0}(u_i,v_i,z_i,t_i) - \sum_{k=1}^{p} \beta_{ik}(u_i,v_i,z_i,t_i)\,x_{jk} \right]^2$$
wherein $w_{ij}^{ST}$ is the kernel function between observation point i and observation point j;
the least squares estimate $\hat{\beta}(u_i,v_i,z_i,t_i)$ of the independent variable regression coefficients is:
$$\hat{\beta}(u_i,v_i,z_i,t_i) = \left[ X^T W(u_i,v_i,z_i,t_i)\, X \right]^{-1} X^T W(u_i,v_i,z_i,t_i)\, Y$$
wherein $W(u_i,v_i,z_i,t_i)$ is the four-dimensional space-time weight matrix composed of the $w_{ij}^{ST}$, X is the independent variable matrix, and Y is the observed value matrix of the dependent variable.
The beneficial effects are that: solving the independent variable regression coefficients by establishing an objective function is simple and accurate.
Further, in the method and the device for predicting the PM2.5 concentration, the kernel function is a Gaussian kernel function, and the formula is as follows:
wherein $h_{4D\text{-}ST}$ is the four-dimensional space-time bandwidth and $d_{ij}^{ST}$ is the four-dimensional space-time distance between observation point i and observation point j.
The beneficial effects are that: the four-dimensional space-time weight matrix can be more accurately obtained through the Gaussian kernel function.
Further, in the method and the device for predicting the PM2.5 concentration, the elevation information is digital elevation information, and a calculation formula of a four-dimensional space-time distance between the observation point i and the observation point j is as follows:
wherein the distance terms are, respectively, the two-dimensional spatial distance, the digital elevation spatial distance and the time distance between observation point i and observation point j; and the scale adjustment factors, which include λ, δ and μ, are used to balance the scale differences among the components of the four-dimensional space-time distance.
The beneficial effects are that: the method for calculating the four-dimensional space-time distance is simple and accurate.
Further, in the PM2.5 concentration prediction method and apparatus, the four-dimensional space-time bandwidth is obtained from the two-dimensional spatial bandwidth $h_{S2d}$, the digital elevation spatial bandwidth $h_{SDEM}$ and the time bandwidth $h_T$.
The beneficial effects are that: the two-dimensional spatial bandwidth, the digital elevation spatial bandwidth and the time bandwidth are solved separately, and the four-dimensional space-time bandwidth is then obtained from them, so that the obtained four-dimensional space-time bandwidth is more accurate.
Further, in the PM2.5 concentration prediction method and apparatus, the two-dimensional spatial bandwidth, the digital elevation spatial bandwidth and the time bandwidth are calculated according to the Akaike information criterion, where the Akaike information criterion AICc is:
wherein $\sigma^2$ is the unbiased estimate of the random error variance in the 4D-GTWR model, S is the hat matrix of the 4D-GTWR model, and $X_1, X_2, \dots, X_n$ are the row vectors consisting of the independent variables of observation points 1, 2, …, n, respectively.
The beneficial effects are that: more accurate bandwidth data can be obtained through the Akaike information criterion, which further improves the accuracy of PM2.5 prediction.
Furthermore, in the PM2.5 concentration prediction method and apparatus, the unbiased estimate $\sigma^2$ of the random error variance in the 4D-GTWR model is calculated as:
$$\sigma^2 = \frac{RSS_{4D\text{-}GTWR}}{n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S)}$$
wherein $RSS_{4D\text{-}GTWR}$ is the residual sum of squares of the 4D-GTWR model.
Further, in the PM2.5 concentration prediction method and apparatus, the residual sum of squares $RSS_{4D\text{-}GTWR}$ of the 4D-GTWR model is calculated as:
$$RSS_{4D\text{-}GTWR} = Y^T (I - S)^T (I - S)\, Y$$
wherein I is an identity matrix.
Further, in the PM2.5 concentration prediction method and apparatus, the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
The beneficial effects are that: the meteorological data are all related to the concentration of PM2.5, and the concentration of PM2.5 can be more comprehensively predicted through the meteorological data.
Drawings
FIG. 1 is a schematic diagram of a four-dimensional spatiotemporal distance between an observation point i and an observation point j according to the present invention;
FIG. 2 is a diagram of a PM2.5 estimated value and an observed value using a GTWR model in the prior art;
FIG. 3 is a diagram of the relationship between the PM2.5 estimated value and the observed value by using the 4D-GTWR model.
Detailed Description
Embodiment of PM2.5 concentration prediction method:
the method for predicting the concentration of PM2.5 provided by the embodiment includes the following steps:
1) Acquire PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data.
Different meteorological conditions and different atmospheric aerosol optical thicknesses (AOD data) have different effects on PM2.5 concentration.
In the present embodiment, the meteorological historical data include meteorological factors such as air pressure (hPa), air temperature (°C), relative humidity (%), rainfall (mm), wind speed (km/h) and wind direction (°). However, since wind speed, wind direction and relative humidity greatly affect the PM2.5 concentration, the meteorological historical data may consist of only wind speed, wind direction and relative humidity.
The PM2.5 concentration historical data are hourly mean PM2.5 concentrations measured by automatic air quality monitoring stations (hereinafter referred to as monitoring stations), and the meteorological historical data are hourly mean values measured by meteorological stations. The data can be downloaded from the website of China's National Meteorological Information Center. Because the meteorological stations and the PM2.5 monitoring stations are at different geographical positions, the values of the meteorological factors at each PM2.5 monitoring station are interpolated from the data of the surrounding meteorological stations by kriging.
The AOD data are obtained by inversion from the Moderate Resolution Imaging Spectroradiometer (MODIS): the MYD04_3K data are geometrically corrected with an IDL program and converted to the WGS-84 geographic coordinate system, and ArcPy is used to extract the AOD data that match the PM2.5 monitoring stations in space and time. MODIS is a commonly used sensor for inverting atmospheric aerosol optical thickness; it has a swath width of 2330 km and obtains global observation data at least once a day. MODIS has 36 spectral channels covering 0.4-14 μm, with spatial resolutions of 250 m, 500 m and 1000 m, and can be used to acquire data such as AOD, water vapor, surface temperature and ocean data. The AOD data in the MODIS data can be downloaded from NASA LAADS.
2) Two-dimensional position information, digital elevation information (i.e., DEM information), and sampling time information of the observation point are acquired.
The two-dimensional position information is the position of the observation point, and the sampling time is December 2017 to February 2018. In this embodiment, digital elevation information (DEM information) is used as the height information of three-dimensional space. The DEM data come from the Geospatial Data Cloud platform of the Computer Network Information Center of the Chinese Academy of Sciences, have a spatial resolution of 30 m, and date from 2009. In other embodiments, other elevation information may be used; the invention is not limited in this respect.
3) Determine the parameters of the 4D-GTWR model according to the above data, thereby determining the 4D-GTWR model.
The 4D-GTWR model is as follows:
$$y_i = \beta_{i0}(u_i,v_i,z_i,t_i) + \sum_{k=1}^{p} \beta_{ik}(u_i,v_i,z_i,t_i)\,x_{ik} + \xi_i, \quad i = 1, 2, \dots, n$$
wherein $x_{ik}$ is the k-th of the p independent variables of observation point i, the independent variables being meteorological data; $y_i$ is the dependent variable of observation point i, namely its PM2.5 concentration data; the parameters of the 4D-GTWR model are $\beta_{ik}(u_i,v_i,z_i,t_i)$, $\beta_{i0}(u_i,v_i,z_i,t_i)$ and $\xi_i$; $\beta_{ik}(u_i,v_i,z_i,t_i)$ is the regression coefficient of the k-th independent variable of observation point i and is related to the spatial position of observation point i; $\beta_{i0}(u_i,v_i,z_i,t_i)$ is the constant term of the independent variable regression coefficients of observation point i; $\xi_i$ is the random error of observation point i; $(u_i,v_i,z_i,t_i)$ is the four-dimensional space-time coordinate of observation point i, where $(u_i,v_i)$ represents the two-dimensional spatial coordinates, $z_i$ the digital elevation coordinate and $t_i$ the time coordinate; n is the number of observation points.
Solving the model mainly means solving for the independent variable regression coefficients $\beta_{ik}(u_i,v_i,z_i,t_i)$. $\xi_i$ is the random error at observation point i; the errors are independent and identically distributed, with $\xi_i \sim N(0, \omega^2)$ and $\mathrm{Cov}(\xi_i, \xi_j) = 0$ for $i \neq j$ (that is, the random variables $\xi_i$ and $\xi_j$ have the same probability distribution and are independent of each other).
Let $\hat{\beta}(u_i,v_i,z_i,t_i)$ denote the least squares estimate of the independent variable regression coefficients of observation point i; it is the matrix composed of the $\beta_{ik}(u_i,v_i,z_i,t_i)$, and $\beta_{i0}(u_i,v_i,z_i,t_i)$ is the value in its first column. The 4D-GTWR model is estimated according to the weighted least squares criterion, and an objective function is established for each observation point. The objective function for observation point i is as follows:
$$\min \sum_{j=1}^{n} w_{ij}^{ST} \left[ y_j - \beta_{i0}(u_i,v_i,z_i,t_i) - \sum_{k=1}^{p} \beta_{ik}(u_i,v_i,z_i,t_i)\,x_{jk} \right]^2$$
wherein $w_{ij}^{ST}$ is the kernel function (i.e., the weight) between observation point i and observation point j, and is related to the four-dimensional space-time distance.
The least squares estimate $\hat{\beta}(u_i,v_i,z_i,t_i)$ of the independent variable regression coefficients can be expressed as:
$$\hat{\beta}(u_i,v_i,z_i,t_i) = \left[ X^T W(u_i,v_i,z_i,t_i)\, X \right]^{-1} X^T W(u_i,v_i,z_i,t_i)\, Y$$
wherein $W(u_i,v_i,z_i,t_i)$ is the four-dimensional space-time weight matrix composed of the $w_{ij}^{ST}$; X is the independent variable matrix, in which the entries equal to 1 are the independent variable $x_{i0}$ corresponding to $\beta_{i0}$ (generally taken as 1); and Y is the observed value matrix of the dependent variable (here, the actually observed PM2.5 values). From the above calculation, the estimated value $\hat{y}_i$ of the dependent variable at observation point i is:
$$\hat{y}_i = X_i\, \hat{\beta}(u_i,v_i,z_i,t_i) = X_i \left[ X^T W(u_i,v_i,z_i,t_i)\, X \right]^{-1} X^T W(u_i,v_i,z_i,t_i)\, Y$$
wherein $X_i$ is the i-th row of the matrix X (i.e., the row vector consisting of the independent variables of observation point i). Thus, the dependent variable regression vector (i.e., the predicted result) $\hat{Y}$ over the observation points is:
$$\hat{Y} = (\hat{y}_1, \hat{y}_2, \dots, \hat{y}_n)^T = S\,Y$$
where S is the hat matrix of the 4D-GTWR model, whose i-th row is $X_i \left[ X^T W(u_i,v_i,z_i,t_i)\, X \right]^{-1} X^T W(u_i,v_i,z_i,t_i)$.
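The weighted least squares estimate for a single regression point can be sketched in plain Python. This is an illustrative sketch, not the patented implementation: the weights `w` are taken as given, the data are hypothetical, and the small Gaussian-elimination helper `solve_linear` stands in for whatever linear solver an implementation would actually use.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def local_coefficients(X, y, w):
    """Solve (X^T W X) beta = X^T W y with W = diag(w)."""
    rows, p = len(X), len(X[0])
    xtwx = [[sum(w[j] * X[j][a] * X[j][b] for j in range(rows))
             for b in range(p)] for a in range(p)]
    xtwy = [sum(w[j] * X[j][a] * y[j] for j in range(rows)) for a in range(p)]
    return solve_linear(xtwx, xtwy)


# Hypothetical data: a column of 1s for the constant term plus one variable.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]   # generated exactly from y = 1 + 2 x
w = [1.0, 0.8, 0.5, 0.2]   # weights of the neighbours of the regression point
beta = local_coefficients(X, y, w)
```

Because the sample data lie exactly on a line, any positive weights recover the same constant term and slope; with noisy data the weights decide which neighbours dominate the local fit.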
In this embodiment, the kernel function $w_{ij}^{ST}$ is a Gaussian kernel function. In other embodiments, $w_{ij}^{ST}$ may also be a bi-square kernel function; the invention does not limit the type of kernel function.
The Gaussian kernel function formula is:
wherein $h_{4D\text{-}ST}$ is the four-dimensional space-time bandwidth and $d_{ij}^{ST}$ is the four-dimensional space-time distance between observation point i and observation point j.
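A minimal sketch of such a Gaussian weight is given below. The kernel formula itself is not preserved in this text, so the form exp(-(d/h)^2) used here is an assumption; the function name `gaussian_weight` and the sample distances are hypothetical.

```python
import math


def gaussian_weight(d_st, h_st):
    """Gaussian kernel weight; the exp(-(d/h)^2) form is an assumption."""
    if h_st <= 0:
        raise ValueError("bandwidth must be positive")
    return math.exp(-(d_st / h_st) ** 2)


# Weights for increasing space-time distances at a fixed bandwidth of 2.0.
weights = [gaussian_weight(d, 2.0) for d in (0.0, 1.0, 2.0, 4.0)]
```

Whatever the exact constant in the exponent, the weight is 1 at zero distance and decays smoothly as the distance grows relative to the bandwidth.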
Considering the heterogeneity of three-dimensional space and one-dimensional time, the invention provides a method for analyzing and calculating the four-dimensional space-time distance. The construction of the time distance and the three-dimensional spatial distance of the 4D-GTWR model is explained below.
Considering that the DEM mainly affects the spatial dimension, and based on existing research on PM2.5 space-time variation, the pollution process of pollutants such as PM2.5 varies regionally in four-dimensional space-time, that is, it shows four-dimensional space-time non-stationarity. Therefore, the three-dimensional spatial coordinates of any monitoring points differ from one another. Considering that two-dimensional space and the third spatial dimension may have different scale effects, an operator symbol is introduced into the 4D-GTWR model to represent the difference in scale between them. Therefore, following the Euclidean distance, the three-dimensional distance $d_{ij}^{S}$ between observation point i and observation point j can be expressed as follows:
wherein the operator symbol may represent various operators. On this basis, by incorporating the space-time distance construction method, the four-dimensional space-time distance is expressed as follows:
In general, the scale effect symbols are taken to be a linear combination, so the four-dimensional space-time distance $d_{ij}^{ST}$ between observation point i (the regression point in FIG. 1) and observation point j (the neighboring point in FIG. 1) in the 4D-GTWR model can be obtained by linearly combining its component distances, as shown in FIG. 1, and is expressed as follows:
wherein the distance terms are, respectively, the two-dimensional spatial distance, the digital elevation spatial distance and the time distance between observation point i and observation point j; and the scale adjustment factors, which include λ, δ and μ, are used to balance the scale differences among the components of the four-dimensional space-time distance. In FIG. 1, u corresponds to the X-axis, v to the Y-axis, z to the Z-axis and t to the T-axis.
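The following sketch illustrates one way the component distances could be combined. It assumes a weighted Euclidean form in which the squared planar, elevation and time differences are scaled by `lam`, `delta` and `mu` and then summed. The distance formula is not preserved in this text, the single planar factor `lam` is a simplification, and every name in the block is hypothetical.

```python
import math


def st_distance(p, q, lam=1.0, delta=1.0, mu=1.0):
    """Scaled four-dimensional distance between p and q, each (u, v, z, t).

    Assumed form: sqrt(lam*(du^2 + dv^2) + delta*dz^2 + mu*dt^2).
    """
    du, dv, dz, dt = (a - b for a, b in zip(p, q))
    d2 = lam * (du * du + dv * dv) + delta * dz * dz + mu * dt * dt
    return math.sqrt(d2)


# With all factors equal to 1 this reduces to the plain Euclidean distance.
plain = st_distance((0, 0, 0, 0), (3, 4, 0, 0))
# Larger factors make elevation and time differences count for more.
scaled = st_distance((0, 0, 0, 0), (3, 4, 2, 1), delta=4.0, mu=9.0)
```

The scale factors matter because the planar coordinates, the elevation and the time index are measured in different units and on very different numeric ranges.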
In the kernel function, the choice of the bandwidth parameter has a great influence on the fitting result of the model: too small a bandwidth leads to over-fitting of the estimation result, and too large a bandwidth leads to an inaccurate estimation result. Selecting an appropriate bandwidth parameter is therefore crucial to the accuracy of model estimation. In the present embodiment, this means that the four-dimensional space-time bandwidth $h_{4D\text{-}ST}$ is crucial to the accuracy of the 4D-GTWR model estimation. The four-dimensional space-time bandwidth $h_{4D\text{-}ST}$ is obtained from the two-dimensional spatial bandwidth $h_{S2d}$, the digital elevation spatial bandwidth $h_{SDEM}$ and the time bandwidth $h_T$.
Substituting the above $d_{ij}^{ST}$ into the Gaussian kernel function gives:
wherein $w_{ij}^{S2d}$, $w_{ij}^{SDEM}$ and $w_{ij}^{T}$ denote the two-dimensional spatial weight, the DEM spatial weight and the time weight, respectively, so that the four-dimensional space-time weight matrix of the 4D-GTWR model is expressed as follows:
From the above expression for $W(u_i,v_i,z_i,t_i)$, the four-dimensional space-time weight matrix is related to the two-dimensional spatial weight, the DEM spatial weight and the time weight, and solving for it requires solving for $h_{S2d}$, $h_{SDEM}$, $h_T$, λ, δ and μ.
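As a sketch of why the weight separates into three factors, assume that the kernel has the form $\exp(-(d/h)^2)$ and that the squared four-dimensional distance is a weighted sum with one scale factor per component. Neither formula is preserved in this text, so both are assumptions, as are the component distance symbols $d_{ij}^{S2d}$, $d_{ij}^{SDEM}$, $d_{ij}^{T}$ and the bandwidth relations in the last line.

```latex
\begin{aligned}
w_{ij}^{ST}
 &= \exp\!\left(-\frac{(d_{ij}^{ST})^2}{h_{4D\text{-}ST}^2}\right)
  = \exp\!\left(-\frac{\lambda\,(d_{ij}^{S2d})^2 + \delta\,(d_{ij}^{SDEM})^2
      + \mu\,(d_{ij}^{T})^2}{h_{4D\text{-}ST}^2}\right) \\
 &= \exp\!\left(-\frac{(d_{ij}^{S2d})^2}{h_{S2d}^2}\right)
    \exp\!\left(-\frac{(d_{ij}^{SDEM})^2}{h_{SDEM}^2}\right)
    \exp\!\left(-\frac{(d_{ij}^{T})^2}{h_{T}^2}\right)
  = w_{ij}^{S2d}\, w_{ij}^{SDEM}\, w_{ij}^{T}, \\
 &\text{with}\quad
  h_{S2d}^2 = \frac{h_{4D\text{-}ST}^2}{\lambda},\qquad
  h_{SDEM}^2 = \frac{h_{4D\text{-}ST}^2}{\delta},\qquad
  h_{T}^2 = \frac{h_{4D\text{-}ST}^2}{\mu}.
\end{aligned}
```

Under these assumptions, choosing the three component bandwidths is equivalent to choosing one overall bandwidth together with the scale adjustment factors, which is consistent with the statement that the four-dimensional bandwidth is obtained from the three component bandwidths.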
In this embodiment, $h_{S2d}$, $h_{SDEM}$, $h_T$, λ, δ and μ are calculated based on the Akaike information criterion (AICc). In other embodiments, they may be calculated according to the CV criterion (cross-validation); the invention does not limit the calculation process of the bandwidths and scale adjustment factors.
The calculation process of AICc is:
$$\sigma^2 = \frac{RSS_{4D\text{-}GTWR}}{n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S)}$$
$$RSS_{4D\text{-}GTWR} = Y^T (I - S)^T (I - S)\, Y$$
wherein $\sigma^2$ is the unbiased estimate of the random error variance in the 4D-GTWR model, $RSS_{4D\text{-}GTWR}$ is the residual sum of squares of the 4D-GTWR model, and I is the identity matrix.
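The specific calculation process of $RSS_{4D\text{-}GTWR}$ is as follows:
$$RSS_{4D\text{-}GTWR} = \hat{\varepsilon}^T \hat{\varepsilon} = Y^T (I - S)^T (I - S)\, Y$$
wherein $\hat{\varepsilon} = Y - \hat{Y} = (I - S)\,Y$ is the residual vector.
Assume that the fitted value $\hat{y}_i$ is an unbiased estimate of $E(y_i)$, i.e. $E(\hat{y}_i) = E(y_i)$. Then:
$$\hat{\varepsilon} = (I - S)\,(Y - E(Y)) = (I - S)\,\varepsilon$$
and $E(\varepsilon \varepsilon^T) = \sigma^2 I$. The expectation of $RSS_{4D\text{-}GTWR}$ can then also be expressed as:
$$E(RSS_{4D\text{-}GTWR}) = E\!\left(\varepsilon^T (I - S)^T (I - S)\, \varepsilon\right) = \sigma^2 \left(n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S)\right)$$
wherein the degrees of freedom are $fd = n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S)$, so that the unbiased variance estimate $\sigma^2$ of the random error term in the 4D-GTWR model is:
$$\sigma^2 = \frac{RSS_{4D\text{-}GTWR}}{n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S)}$$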
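The variance estimate can be computed directly from a hat matrix and the observations, as in the sketch below. The function name `sigma2` and the sample data are hypothetical. The example uses the simplest possible hat matrix, the averaging matrix of a constant-only model, rather than one produced by a 4D-GTWR fit.

```python
def sigma2(S, y):
    """Error-variance estimate RSS / (n - 2 tr(S) + tr(S^T S))."""
    n = len(y)
    fitted = [sum(S[i][j] * y[j] for j in range(n)) for i in range(n)]
    rss = sum((y[i] - fitted[i]) ** 2 for i in range(n))   # |(I - S) y|^2
    tr_s = sum(S[i][i] for i in range(n))
    # tr(S^T S) equals the sum of the squared entries of S.
    tr_sts = sum(S[i][j] ** 2 for i in range(n) for j in range(n))
    return rss / (n - 2 * tr_s + tr_sts)


# Averaging hat matrix: every fitted value is the mean of the observations.
n = 3
S = [[1.0 / n] * n for _ in range(n)]
estimate = sigma2(S, [1.0, 2.0, 3.0])
```

For this averaging matrix the denominator reduces to n - 1, so the estimate coincides with the ordinary sample variance, which is a convenient sanity check.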
As can be seen from the above formulas, the hat matrix S of the 4D-GTWR model is a matrix that depends on the unknown parameter $W(u_i,v_i,z_i,t_i)$, so AICc is a function of $W(u_i,v_i,z_i,t_i)$ only. $W(u_i,v_i,z_i,t_i)$ is in turn a function of the bandwidths, so AICc is ultimately a function of the bandwidths, and the bandwidths and scale adjustment factors that give the lowest AICc value are the required bandwidths and scale adjustment factors.
Among the scale adjustment factors, λ and its companion factor relate to two-dimensional space, δ relates to DEM space and μ relates to time. In general, the two-dimensional factors are set to a constant m. When calculating δ, μ and the bandwidths, the ranges of δ, μ and each bandwidth are set first (the ranges are determined by repeated trials during the calculation and are related to the four-dimensional space-time distance).
The specific solving process is as follows:
Step 1: with the two-dimensional scale adjustment factors (λ and its companion factor) set to m, set the range of $h_{S2d}$ and determine the $h_{S2d}$ corresponding to the minimum AICc value;
Step 2: with the same factors set to m, set the ranges of μ and $h_T$ and determine the μ and $h_T$ corresponding to the minimum AICc value;
Step 3: with the same factors set to m, set the ranges of δ and $h_{SDEM}$ and determine the δ and $h_{SDEM}$ corresponding to the minimum AICc value.
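The three steps above amount to a staged grid search, which can be sketched as follows. This is a hedged illustration: `staged_search`, the parameter names and the candidate grids are hypothetical, the fixed constant m is assumed to be built into the criterion, and `toy_aicc` is a stand-in with a known minimum rather than the AICc of a fitted model.

```python
from itertools import product


def staged_search(aicc, grids):
    """Minimise `aicc` one parameter group at a time, in the order of the steps."""
    params = {name: grid[0] for name, grid in grids.items()}  # starting values
    stages = [("h_s2d",), ("mu", "h_t"), ("delta", "h_dem")]
    for names in stages:
        # Try every candidate combination for this stage, others held fixed.
        best_combo = min(
            product(*(grids[n] for n in names)),
            key=lambda combo: aicc(**{**params, **dict(zip(names, combo))}),
        )
        params.update(zip(names, best_combo))
    return params


def toy_aicc(h_s2d, mu, h_t, delta, h_dem):
    """Hypothetical criterion: a bowl with its minimum at a known point."""
    return ((h_s2d - 3) ** 2 + (mu - 0.5) ** 2 + (h_t - 2) ** 2
            + (delta - 1.5) ** 2 + (h_dem - 4) ** 2)


grids = {
    "h_s2d": [1, 2, 3, 4],
    "mu": [0.25, 0.5, 1.0],
    "h_t": [1, 2, 3],
    "delta": [0.5, 1.5, 2.5],
    "h_dem": [2, 4, 6],
}
best = staged_search(toy_aicc, grids)
```

Searching in stages keeps the number of model fits small compared with a full grid over all five parameters, at the cost of possibly missing a joint optimum when the parameters interact.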
4) From the above calculation, $W(u_i,v_i,z_i,t_i)$ can be solved in turn, then the hat matrix S of the 4D-GTWR model, and finally the prediction result $\hat{Y}$ is obtained.
Through the calculation, all parameters in the 4D-GTWR model are determined, the 4D-GTWR model can be finally determined, and when PM2.5 concentration prediction is carried out, predicted meteorological data are substituted into the 4D-GTWR model, so that the PM2.5 concentration of each observation area can be predicted.
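Once the local coefficients for a location are known, the prediction is a plain evaluation of the linear model. The sketch below uses hypothetical coefficient and input values; `predict_pm25` is an illustrative name, not part of the described method.

```python
def predict_pm25(beta, x):
    """Evaluate beta[0] + sum(beta[k] * x[k-1]) for one observation point."""
    if len(beta) != len(x) + 1:
        raise ValueError("need one coefficient per variable plus a constant term")
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Hypothetical local coefficients: constant term, then two independent variables.
local_beta = [10.0, 2.0, -1.0]
forecast_inputs = [3.0, 4.0]   # predicted values of the two variables
estimate = predict_pm25(local_beta, forecast_inputs)
```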
The method is verified by specific historical data as follows:
One observation point, $(u_1,v_1,z_1,t_1)$, is taken as an example for verification, together with two other observation points, $(u_2,v_2,z_2,t_2)$ and $(u_3,v_3,z_3,t_3)$. The historical data of their independent variables and dependent variables are shown in Table 1:
Table 1: Historical data
The data shown are only a part of the historical data. At least 100 observation points are needed to determine the 4D-GTWR model; they are not all listed here because there are too many.
The space-time coordinates of the three observation points are shown in Table 2, where the two-dimensional position information is the abscissa and the ordinate in the WGS_1984_World_Mercator projected coordinate system:
Table 2: Space-time coordinates of the observation points
Observation point | z | t | u | v |
1 | 161 | 49 | 12612250.93 | 4111074.699 |
2 | 147 | 49 | 12612591.07 | 4114111.191 |
3 | 131 | 49 | 12575408.2 | 4105491.236 |
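As a small worked example, the component distances between the three observation points can be computed from the coordinates tabulated above. The sketch only separates the planar, elevation and time differences; how they are scaled and combined is left to the model, and the helper name `components` is hypothetical.

```python
import math

# (u, v, z, t) for observation points 1 to 3, copied from the table above.
points = {
    1: (12612250.93, 4111074.699, 161, 49),
    2: (12612591.07, 4114111.191, 147, 49),
    3: (12575408.2, 4105491.236, 131, 49),
}


def components(a, b):
    """Return (planar distance, elevation difference, time difference)."""
    ua, va, za, ta = points[a]
    ub, vb, zb, tb = points[b]
    return math.hypot(ua - ub, va - vb), abs(za - zb), abs(ta - tb)


for a, b in [(1, 2), (1, 3), (2, 3)]:
    planar, dz, dt = components(a, b)
    print(f"{a}-{b}: planar={planar:.1f}, dz={dz}, dt={dt}")
```

All three points share the same time coordinate, so only the planar and elevation components contribute to the distances between them; the planar differences are also far larger in magnitude than the elevation differences, which is the kind of scale gap the adjustment factors are meant to balance.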
The historical data are substituted into the model, and the accuracy of the model is assessed by correlation analysis using R² (correlation coefficient), RMSE (root mean square error), and MAE (mean absolute error). The larger R² is and the smaller RMSE and MAE are, the more accurate the model. The calculation formulas are as follows:
wherein $\hat{y}_i$ is the estimated value (predicted value) of the PM2.5 concentration at observation point i, $\bar{y}$ is the average of the observed PM2.5 concentrations, and $y_i$ is the observed PM2.5 concentration at observation point i.
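The three measures can be sketched with the symbols defined above. The formulas themselves are missing from this text, so the forms used here are assumptions: R² is taken as 1 minus the ratio of the residual sum of squares to the total sum of squares, and RMSE and MAE use their usual definitions. The function name `evaluate` is hypothetical.

```python
import numpy as np


def evaluate(y_obs, y_pred):
    """Return (R2, RMSE, MAE) for observed and predicted PM2.5 values."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_obs - y_pred
    ss_res = np.sum(resid ** 2)                    # sum of (y_i - y_hat_i)^2
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)   # sum of (y_i - y_bar)^2
    r2 = 1.0 - ss_res / ss_tot                     # ASSUMED form of R^2
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    return r2, rmse, mae


# Toy numbers for illustration only, not data from the patent.
r2, rmse, mae = evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(r2, rmse, mae)
```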
The historical data are substituted into a GTWR model (the GTWR model is prior art and is not described in detail here) to obtain its result, which comprises the regression coefficients of the independent variables at each observation point and the fitted dependent-variable vector $\hat{y}$ over the observation points. The bandwidth parameter is 2.2895. Linear correlation analysis is then performed between $\hat{y}$ and the actual vector y (the PM2.5 observed value at each observation point). As shown in Fig. 2, this gives the linear correlation analysis curve y = 1.0536x - 2.3442 (in this formula x is the PM2.5 observed value and y is the PM2.5 predicted value), together with R², RMSE, and MAE. In Fig. 2, each scattered point pairs a PM2.5 estimated value (predicted value) with a PM2.5 observed value (measured value); the abscissa is the measured PM2.5 value and the ordinate is the predicted PM2.5 value. The effect of the model can be seen from the relationship between the scattered points and the linear correlation analysis curve. The results are shown in Table 3:
Table 3: GTWR model
In Table 3, Min is the minimum, LQ the lower quartile, Med the median, UQ the upper quartile, and Max the maximum; together they describe how the regression coefficient of each variable varies with space-time position. Intercept is the constant term of the independent variable regression coefficients. Each remaining row gives the regression coefficient of the independent variable it names: AOD, relative humidity, air pressure, air temperature, wind speed, and wind direction. The rainfall data are all 0, so rainfall has no regression coefficient.
The historical data are likewise substituted into the 4D-GTWR model to obtain its result, which comprises the regression coefficients of the independent variables at each observation point and the fitted dependent-variable vector $\hat{y}$ over the observation points. The bandwidth parameter is 0.0063. Linear correlation analysis is then performed between $\hat{y}$ and the actual vector y (the PM2.5 observed value at each observation point). As shown in Fig. 3, this gives the linear correlation analysis curve y = 1.0253x - 1.2749, together with R², RMSE, and MAE. The abscissa, ordinate, scattered points, and curve in Fig. 3 have the same meaning as in Fig. 2 and are not described again. The results are shown in Table 4:
Table 4: 4D-GTWR model
The entries in Table 4 have the same meanings as those in Table 3. The results show that for the GTWR model R² is 0.8811, RMSE is 12.9579, and MAE is 9.5815, while for the 4D-GTWR model R² is 0.9496, RMSE is 8.5931, and MAE is 5.9498. The R² of the 4D-GTWR model is larger than that of the GTWR model, and its RMSE and MAE are smaller, which indicates that the 4D-GTWR model is more accurate.
PM2.5 concentration prediction apparatus embodiment:
The PM2.5 concentration prediction apparatus proposed in this embodiment includes a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the PM2.5 concentration prediction method when executing the computer program.
The specific implementation process of the PM2.5 concentration prediction method is already described in the foregoing embodiment of the PM2.5 concentration prediction method, and is not described herein again.
Claims (10)
1. A PM2.5 concentration prediction method is characterized by comprising the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining parameters of a 4D-GTWR model according to the data, thereby determining the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
$$y_i = \beta_{i0}(u_i, v_i, z_i, t_i) + \sum_{k=1}^{p} \beta_{ik}(u_i, v_i, z_i, t_i)\, x_{ik} + \xi_i, \quad i = 1, 2, \ldots, n$$
wherein $x_{ik}$ is the k-th independent variable of observation point i, there are p independent variables, and the independent variables are meteorological data; $y_i$ is the dependent variable, namely the PM2.5 concentration data of observation point i; the parameters of the 4D-GTWR model are $\beta_{ik}(u_i, v_i, z_i, t_i)$, $\beta_{i0}(u_i, v_i, z_i, t_i)$ and $\xi_i$; $\beta_{ik}(u_i, v_i, z_i, t_i)$ is the regression coefficient of the k-th independent variable of observation point i and is related to the spatial position of observation point i; $\beta_{i0}(u_i, v_i, z_i, t_i)$ is the constant term of the independent variable regression coefficients of observation point i; $\xi_i$ is the random error of observation point i; $(u_i, v_i, z_i, t_i)$ is the four-dimensional space-time coordinate of observation point i, where $(u_i, v_i)$ represents the two-dimensional spatial coordinates, $z_i$ the elevation coordinate, and $t_i$ the time coordinate; n is the number of observation points;
and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model.
2. The PM2.5 concentration prediction method according to claim 1, wherein the method of calculating the independent variable regression coefficients includes: solving a least squares estimate of the independent variable regression coefficients by establishing an objective function for observation point i, the objective function being:
$$\min \sum_{j=1}^{n} w_{ij}^{ST} \Big( y_j - \beta_{i0}(u_i, v_i, z_i, t_i) - \sum_{k=1}^{p} \beta_{ik}(u_i, v_i, z_i, t_i)\, x_{jk} \Big)^2$$
wherein $w_{ij}^{ST}$ is the kernel function between observation point i and observation point j;
the least squares estimate $\hat{\beta}(u_i, v_i, z_i, t_i)$ of the independent variable regression coefficients is:
$$\hat{\beta}(u_i, v_i, z_i, t_i) = \big( X^T W(u_i, v_i, z_i, t_i)\, X \big)^{-1} X^T W(u_i, v_i, z_i, t_i)\, Y$$
wherein $W(u_i, v_i, z_i, t_i)$ is the four-dimensional space-time weight matrix formed from $w_{ij}^{ST}$, X is the independent variable matrix, and Y is the observed value matrix of the dependent variable.
3. The PM2.5 concentration prediction method according to claim 2, wherein the kernel function is a Gaussian kernel function, and the formula is:
wherein $h_{4D\text{-}ST}$ is the four-dimensional space-time bandwidth, and $d_{ij}^{ST}$ is the four-dimensional space-time distance between observation point i and observation point j.
4. The PM2.5 concentration prediction method according to claim 3, wherein the elevation information is digital elevation information, and the four-dimensional space-time distance between the observation point i and the observation point j is calculated by the following formula:
wherein the formula combines the two-dimensional spatial distance between observation point i and observation point j, the digital elevation spatial distance between observation point i and observation point j, and the time distance between observation point i and observation point j; the scale adjustment factors, including λ, δ and μ, are used to balance the scale differences among the different components of the four-dimensional space-time distance.
5. The PM2.5 concentration prediction method of claim 4, wherein the four-dimensional space-time bandwidth is obtained from the two-dimensional spatial bandwidth $h_{S2d}$, the digital elevation spatial bandwidth $h_{SDEM}$, and the time bandwidth $h_T$.
6. The PM2.5 concentration prediction method according to claim 5, wherein the two-dimensional spatial bandwidth, the digital elevation spatial bandwidth and the time bandwidth are calculated according to the Akaike information criterion, and the Akaike information criterion AICc is:
wherein $\sigma^2$ is an unbiased estimate of the random error variance in the 4D-GTWR model, S is the hat matrix of the 4D-GTWR model, and $X_1$, $X_2$, …, $X_n$ are the row vectors consisting of the independent variables of observation points 1, 2, …, n, respectively.
7. The PM2.5 concentration prediction method of claim 6, wherein the unbiased estimate $\sigma^2$ of the random error variance in the 4D-GTWR model is calculated as:
$$\sigma^2 = RSS_{4D\text{-}GTWR} / \big( n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S^T S) \big)$$
wherein $RSS_{4D\text{-}GTWR}$ is the residual sum of squares of the 4D-GTWR model.
8. The PM2.5 concentration prediction method according to claim 7, wherein the residual sum of squares $RSS_{4D\text{-}GTWR}$ of the 4D-GTWR model is calculated as:
$$RSS_{4D\text{-}GTWR} = Y^T (I - S)^T (I - S)\, Y$$
wherein I is an identity matrix.
9. The PM2.5 concentration prediction method according to claim 1, wherein the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
10. A PM2.5 concentration prediction apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the PM2.5 concentration prediction method according to any one of claims 1 to 9 when executing the computer program.
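The quantities in claims 6 to 8 can be illustrated as a short computation on a given hat matrix S and observation vector Y. The residual sum of squares and the variance estimate follow the formulas stated in claims 7 and 8. The AICc expression is not preserved in this text, so the form used below (the corrected AIC commonly used for geographically weighted regression) is an assumption, as are the function names.

```python
import numpy as np


def rss_4d_gtwr(S, y):
    """RSS = Y^T (I - S)^T (I - S) Y, as stated in claim 8."""
    r = y - S @ y                         # (I - S) Y
    return float(r @ r)


def sigma2_unbiased(S, y):
    """sigma^2 = RSS / (n - 2 tr(S) + tr(S^T S)), as stated in claim 7."""
    n = y.shape[0]
    return rss_4d_gtwr(S, y) / (n - 2.0 * np.trace(S) + np.trace(S.T @ S))


def aicc(S, y):
    """ASSUMED form: 2n ln(sigma) + n ln(2 pi) + n (n + tr S) / (n - 2 - tr S)."""
    n = y.shape[0]
    tr_s = np.trace(S)
    sigma = np.sqrt(sigma2_unbiased(S, y))
    return (2.0 * n * np.log(sigma) + n * np.log(2.0 * np.pi)
            + n * (n + tr_s) / (n - 2.0 - tr_s))


# Illustration with an ordinary least squares hat matrix on synthetic data.
rng = np.random.default_rng(1)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
S = X @ np.linalg.solve(X.T @ X, X.T)     # projection matrix, tr(S) = 2
y = rng.normal(size=n)
print(rss_4d_gtwr(S, y), sigma2_unbiased(S, y), aicc(S, y))
```

For a projection hat matrix such as the one in this illustration, tr(S) and tr(SᵀS) coincide, so the variance estimate reduces to RSS divided by n minus the number of coefficients. Bandwidth selection would evaluate `aicc` for each candidate bandwidth and keep the candidate with the smallest value.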
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910340382.5A CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910340382.5A CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110046771A (en) | 2019-07-23 |
CN110046771B (en) | 2021-04-16 |
Family
ID=67279470
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910340382.5A Active CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110046771B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110046771B (en) | 2021-04-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||