CN110046771B - PM2.5 concentration prediction method and device - Google Patents
PM2.5 concentration prediction method and device Download PDFInfo
- Publication number
- CN110046771B CN110046771B CN201910340382.5A CN201910340382A CN110046771B CN 110046771 B CN110046771 B CN 110046771B CN 201910340382 A CN201910340382 A CN 201910340382A CN 110046771 B CN110046771 B CN 110046771B
- Authority
- CN
- China
- Prior art keywords
- observation point
- gtwr
- model
- concentration
- dimensional space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 53
- 239000005427 atmospheric aerosol Substances 0.000 claims abstract description 9
- 230000003287 optical effect Effects 0.000 claims abstract description 9
- 238000005070 sampling Methods 0.000 claims abstract description 5
- 239000011159 matrix material Substances 0.000 claims description 27
- 230000006870 function Effects 0.000 claims description 25
- 238000004364 calculation method Methods 0.000 claims description 14
- 230000001419 dependent effect Effects 0.000 claims description 14
- 239000013598 vector Substances 0.000 claims description 9
- 238000004590 computer program Methods 0.000 claims description 6
- 239000000126 substance Substances 0.000 claims description 6
- 230000009286 beneficial effect Effects 0.000 description 7
- 238000010219 correlation analysis Methods 0.000 description 7
- 238000012544 monitoring process Methods 0.000 description 6
- 230000000694 effects Effects 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 239000003344 environmental pollutant Substances 0.000 description 2
- 231100000719 pollutant Toxicity 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000003595 spectral effect Effects 0.000 description 2
- 239000000470 constituent Substances 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 238000002790 cross-validation Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000013618 particulate matter Substances 0.000 description 1
- 230000005180 public health Effects 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Databases & Information Systems (AREA)
- Algebra (AREA)
- Tourism & Hospitality (AREA)
- Software Systems (AREA)
- Human Resources & Organizations (AREA)
- Development Economics (AREA)
- Operations Research (AREA)
- Computing Systems (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Educational Administration (AREA)
- Quality & Reliability (AREA)
- Primary Health Care (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a PM2.5 concentration prediction method and device, and belongs to the technical field of PM2.5 concentration value prediction. The method comprises the following steps: acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data; acquiring two-dimensional position information, elevation information and sampling time information of an observation point; determining the parameters of the 4D-GTWR model according to the data so as to determine the 4D-GTWR model, wherein the 4D-GTWR model is as follows:and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model. The 4D-GTWR model in the method provides an effective method for predicting the PM2.5 concentration, so that the PM2.5 concentration is more accurately predicted, meanwhile, the non-stationarity of four-dimensional space-time is solved, and the important influence of elevation information on the PM2.5 concentration change is verified.
Description
Technical Field
The invention relates to a PM2.5 concentration prediction method and device, and belongs to the technical field of PM2.5 concentration value prediction.
Background
The fine particulate matter (PM2.5) not only poses serious threats to public health, but also seriously affects urban traffic and the daily life of citizens, so that the PM2.5 is taken as an atmospheric pollutant which directly affects human life and health, thereby arousing wide attention of numerous scholars on PM2.5 related inversion research work and realizing the prediction of PM2.5 concentration.
There are many methods for predicting PM2.5 in the prior art, such as: the Chinese patent application publication No. CN106056210A discloses a PM2.5 concentration value prediction method based on a hybrid neural network, which adopts historical data of PM2.5 concentration values, historical data of related indexes, meteorological historical data and PM2.5 component analysis data, and simulates the change rule of local PM2.5 concentration values through neural network segmentation to realize the prediction of the PM2.5 concentration values.
For another example: zhaoyangyang et al combine collaborative training and space-time geography weighted regression model (GTWR), propose a collaborative space-time geography weighted regression PM2.5 concentration estimation method, its provenance is "mapping science", 2016,41(12): 172-. However, the prediction accuracy of the PM2.5 concentration prediction method in the prior art still needs to be improved.
Disclosure of Invention
The invention aims to provide a PM2.5 concentration prediction method, which is used for solving the problem of low accuracy of the existing prediction method; still provide a PM2.5 concentration prediction device simultaneously for solve the problem that current prediction device's accuracy is low.
In order to achieve the above object, the present invention provides a PM2.5 concentration prediction method, including the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining the parameters of the 4D-GTWR model according to the data so as to determine the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiThe dependent variable is a dependent variable of an observation point i, and the dependent variable is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing the elevation space coordinate (t)i) Representing a time coordinate; n is the number of observation points;
and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model.
In addition, the invention also provides a PM2.5 concentration prediction device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the PM2.5 concentration prediction method when executing the computer program.
The beneficial effects are that: elevation information is introduced into a space-time geographic weighted regression model, a four-dimensional space-time geographic weighted regression model (4D-GTWR) is provided, four-dimensional space Euclidean distance measurement is adopted, four-dimensional space-time distribution characteristics can be better reflected, a more real space distance is provided, and therefore effective analysis is conducted on actual conditions, and the 4D-GTWR model provides an effective method in the aspect of predicting PM2.5 concentration, so that the PM2.5 concentration is more accurately predicted, meanwhile, the non-stationarity of four-dimensional space-time is solved, and the fact that the elevation information has important influence on PM2.5 concentration change is verified.
Further, in the method and the apparatus for predicting the PM2.5 concentration, the method for calculating the regression coefficient of the independent variable includes: solving a least squares estimate of the independent variable regression coefficient by establishing an objective function of the observation point i, the objective function being:
wherein, wij STIs a kernel function between observation point i and observation point j;
least squares estimation of the independent variable regression coefficientsComprises the following steps:
wherein, W (u)i,vi,zi,ti) Is a four-dimensional space-time weight matrix composed of wij STAnd (3) forming a matrix, wherein X is an independent variable matrix, and Y is an observed value matrix of a dependent variable.
The beneficial effects are that: the solution of the independent variable regression coefficient is carried out by establishing the objective function, and the method is simple and accurate.
Further, in the method and the device for predicting the PM2.5 concentration, the kernel function is a Gaussian kernel function, and the formula is as follows:
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
The beneficial effects are that: the four-dimensional space-time weight matrix can be more accurately obtained through the Gaussian kernel function.
Further, in the method and the device for predicting the PM2.5 concentration, the elevation information is digital elevation information, and a calculation formula of a four-dimensional space-time distance between the observation point i and the observation point j is as follows:
wherein the content of the first and second substances,the two-dimensional space distance between the observation point i and the observation point j is obtained;the digital elevation space distance between the observation point i and the observation point j is obtained;the time distance between the observation point i and the observation point j is obtained; lambda, lambda,Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances.
The beneficial effects are that: the method for calculating the four-dimensional space-time distance is simple and accurate.
Further, in the PM2.5 concentration prediction method and apparatus, the four-dimensional space-time bandwidth is a two-dimensional space bandwidth h based onS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
The beneficial effects are that: and respectively solving the two-dimensional space bandwidth, the digital elevation space bandwidth and the time bandwidth, and then obtaining the four-dimensional space-time bandwidth according to the two-dimensional space bandwidth, the digital elevation space bandwidth, the time bandwidth and the four-dimensional space-time bandwidth, so that the obtained four-dimensional space-time bandwidth is more accurate.
Further, in the PM2.5 concentration prediction method and apparatus, a two-dimensional spatial bandwidth, a digital elevation spatial bandwidth, and a time bandwidth are calculated according to a akage pool information amount criterion, where the akage pool information amount criterion AICc is:
wherein σ2Is an unbiased estimate of random error variance in the 4D-GTWR model, S is the cap matrix of the 4D-GTWR model, X1、X2、…、XnAre row vectors consisting of the arguments of observation points 1, 2, …, n, respectively.
The beneficial effects are that: more accurate bandwidth data can be obtained through the Chichi information quantity criterion, and the accuracy of PM2.5 prediction is further improved.
Furthermore, in the method and the device for predicting the PM2.5 concentration, the unbiased estimation sigma of the random error variance in the 4D-GTWR model2The calculation formula of (2) is as follows:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
wherein the RSS4D-GTWRIs the sum of the squares of the residuals of the 4D-GTWR model.
Further, in the PM2.5 concentration prediction method and apparatus, the 4DResidual sum of squares RSS for GTWR model4D-GTWRThe calculation formula of (2) is as follows:
RSS4D-GTWR=YT(I-S)T(I-S)Y;
wherein I is an identity matrix.
Further, in the PM2.5 concentration prediction method and apparatus, the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
The beneficial effects are that: the meteorological data are all related to the concentration of PM2.5, and the concentration of PM2.5 can be more comprehensively predicted through the meteorological data.
Drawings
FIG. 1 is a schematic diagram of a four-dimensional spatiotemporal distance between an observation point i and an observation point j according to the present invention;
FIG. 2 is a diagram of a PM2.5 estimated value and an observed value using a GTWR model in the prior art;
FIG. 3 is a diagram of the relationship between the PM2.5 estimated value and the observed value by using the 4D-GTWR model.
Detailed Description
Embodiment of PM2.5 concentration prediction method:
the method for predicting the concentration of PM2.5 provided by the embodiment includes the following steps:
1) and acquiring historical PM2.5 concentration data, meteorological historical data and historical data of the optical thickness of the atmospheric aerosol.
Different gas phases and different atmospheric aerosol optical thicknesses (AOD data) will have different effects on PM2.5 concentration.
In the present embodiment, the weather history data includes weather factors such as air pressure (h Pa), air temperature (°), relative humidity (%), rainfall (mm), wind speed (km/h), and wind direction (°). However, since the wind speed, wind direction, and relative humidity greatly affect the PM2.5 concentration, only the wind speed, wind direction, and relative humidity may be used in the meteorological history data.
The PM2.5 concentration historical data is obtained by detecting an automatic air quality pollution monitoring station (hereinafter referred to as a monitoring station) and is PM2.5 mean concentration monitoring data per hour, and the meteorological historical data is obtained by detecting a meteorological station and is mean data per hour. The data can be downloaded from a national weather information center website in China, and the value of the weather factor of the PM2.5 monitoring site is interpolated from the data of the surrounding weather sites by Krigin due to the difference between the geographical positions of the weather sites and the PM2.5 monitoring site.
The AOD data is obtained by inversion of a medium resolution imaging spectrometer (MODIS), the MYD 04_3K data is subjected to geometric correction by adopting IDL program design and is converted into a WGS-84 geographical coordinate system, and the AOD data which is matched with a PM2.5 detection station in a time-space mode is extracted by utilizing Arcpy. MODIS is a common sensor for inverting the optical thickness of atmospheric aerosol, has a scanning width of 2330km, and can obtain global observation data at least once a day. The MODIS has 36 spectral channels, the spectral range is 0.4-14 μm, the spatial resolution can be 250m, 500m and 1000m, and the MODIS can be used for acquiring data of AOD, steam, surface temperature, ocean and the like. AOD data in MODIS data is provided for data download by NASA LAADS.
2) Two-dimensional position information, digital elevation information (i.e., DEM information), and sampling time information of the observation point are acquired.
The two-dimensional position information is position information of an observation point, the sampling time is 2017.12-2018.2, digital elevation information (namely DEM information) is adopted as height information of a three-dimensional space in the embodiment, DEM data (namely DEM information) is derived from a geographic space data cloud platform of a computer network information center of the Chinese academy of sciences, the spatial resolution is 30m, and the DEM data is 2009 data. As other embodiments, other elevation information may be used, as the invention is not limited in this respect.
3) And determining the parameters of the 4D-GTWR model according to the data, thereby determining the 4D-GTWR model.
The 4D-GTWR model is as follows:
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiTo watchA dependent variable of a measuring point i is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing digital elevation space coordinates (t)i) Representing a time coordinate; n is the number of observation points.
Solving the model mainly comprises solving the independent variable regression coefficient betaik(ui,vi,zi,ti) Solution of, xiiFor random errors at observation point i, obey xii-N(0,ω2) Independently and identically distributed, and Cov (ξ)i,ξj) 0(i ≠ j) (i.e., the random variable ξ)iAnd xijHave the same probability distribution and are independent of each other).
Least squares estimation of the independent variable regression coefficients of observation point i asIs formed by betaik(ui,vi,zi,ti) Constituent matrix, betai0(ui,vi,zi,ti) Is thatAnd estimating the 4D-GTWR model according to a weighted least square criterion for the value of the first column in the matrix, and respectively establishing an objective function for each observation point. The objective function for observation point i is as follows:
wherein, wij STIs a kernel function (i.e., weight) between observation point i and observation point j, and is related to the four-dimensional space-time distance.
wherein, W (u)i,vi,zi,ti) Representing a four-dimensional space-time weight matrix, is represented by wij STA matrix composed of X independent variable matrix with 1 in X matrix being betai0Corresponding independent variable xi0The general value is 1, Y is the observation value matrix (here, the observed actual PM2.5 value), and the dependent variable estimation value of the observation point i can be obtained through the above calculationComprises the following steps:
wherein, XiThe vector representing the ith row of the matrix X (i.e., the row vector consisting of the arguments of observation point i). Thus, the dependent variable regression vector (i.e., the predicted outcome) at each observation pointComprises the following steps:
where S is the hat matrix of the 4D-GTWR model.
In this example wij STThe kernel function is a Gaussian kernel function, as a further embodiment, wij STThe kernel function may also be a Bi-square type kernel function, and the invention is not limited to the type of kernel function.
The Gaussian kernel function formula is:
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
The invention provides an analysis and calculation method of four-dimensional space-time distance in consideration of the heterogeneity of four-dimensional space and one-dimensional time, and a construction method of the time distance and the three-dimensional space distance of a 4D-GTWR model is explained below.
Considering that the DEM mainly has certain influence on the spatial dimension, based on the research on the PM2.5 space-time variation, the pollution process of the pollutants such as the PM2.5 and the like has four-dimensional space-time regional variation, namely, four-dimensional space-time non-stationarity variation exists. Therefore, for any number of monitoring points, three-dimensional spatial coordinates are differentiated between them. Considering that two-three-dimensional space may have different scale effects, the 4D-GTWR model is introducedTo represent the difference in scale between them. Therefore, as can be seen from the euclidean distance, the three-dimensional distance d between the observation point i and the observation point j is obtainedij sCan be expressed as follows:
wherein the content of the first and second substances,various operators may be represented. On the basis, a space-time distance construction method is fused, and the four-dimensional space-time distance is expressed as follows:
scale effect symbolsAndin general, the four-dimensional spatio-temporal distance D between the observation point i (i.e. the regression point in fig. 1) and the observation point j (i.e. the proximity point in fig. 1) in the 4D-GTWR model can be obtained by linear combination of the four-dimensional spatio-temporal distances, as shown in fig. 1ij STIs represented as follows:
wherein the content of the first and second substances,the two-dimensional space distance between the observation point i and the observation point j is obtained;a digital elevation space distance between the observation point i and the observation point j;the time distance between observation point i and observation point j, λ,Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances. Here u pairsThe X-axis of the coordinates in the graph, v corresponds to the Y-axis of the coordinates in the graph, Z corresponds to the Z-axis of the coordinates in the graph, and T corresponds to the T-axis of the coordinates in the graph.
In the kernel function, the selection of bandwidth parameters has a great influence on the fitting result of the model, the over-fitting of the estimation result can be caused if the bandwidth is too small, and the inaccurate estimation result can be caused if the bandwidth is too large. Therefore, selecting appropriate bandwidth parameters is crucial to the accuracy of model estimation. Applied to the present embodiment, that is, the four-dimensional space-time bandwidth h4D-STThe accuracy of the 4D-GTWR model estimation is crucial. Four-dimensional space-time bandwidth h4D-STIs based on a two-dimensional spatial bandwidth hS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
wherein, wij S2d、wij SDEM、wij TThe two-dimensional spatial weight, the DEM spatial weight and the temporal weight are respectively expressed, so that the four-dimensional space-time weight matrix of the 4D-GTWR model is expressed as follows:
from the above W (u)i,vi,zi,ti) The expression shows that the four-dimensional space-time weight matrix is related to two-dimensional space weight, DEM space weight and time weight, and is used for solvingNeeds to solve for hS2d、hSDEM、hT、λ、δ、μ。
In this embodiment, h is calculated based on the Chichi information criterion (AICc)S2d、hSDEM、hT、λ、δ and μmay be calculated according to CV criteria (cross validation method) as another embodiment, and the calculation process of the bandwidth and the scaling factor is not limited by the present invention.
The calculation process of AICc is:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
RSS4D-GTWR=YT(I-S)T(I-S)Y;
wherein σ2As random errors in the 4D-GTWR modelUnbiased estimation of difference variance, RSS4D-GTWRIs the residual sum of squares of the 4D-GTWR model, and I is the identity matrix.
RSS4D-GTWRThe specific calculation process is as follows:
wherein the degree of freedom fd is n-2tr (S) + tr (S)TS) so as to obtain unbiased variance estimation sigma of random error terms in the 4D-GTWR model2Comprises the following steps:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS))。
as can be seen from the above formula, the hat matrix S of the 4D-GTWR model is related to the unknown parameter W (u)i,vi,zi,ti) Of the matrix of (c), then AICc is with respect to W (u) onlyi,vi,zi,ti) A function of, however, W (u)i,vi,zi,ti) Is a function of the respective bandwidth, so that the final AICc is offAnd solving each bandwidth and scale adjustment factor corresponding to the minimum AICc value in the function of each bandwidth, wherein each bandwidth and scale adjustment factor are required.
The scale adjustment factor lambda,Relating to two-dimensional space, λ, δ to DEM space, μ to time, in general, λ, δ,Set as a constant m, in the process of calculating δ, μ and each bandwidth, δ, μ and each bandwidth range are first set (the bandwidth range setting is based on the fact that it is continuously tested in the calculation process, and is related to the four-dimensional space-time distance).
The specific solving process is as follows:
the first step is as follows: will be lambda andset to m, then set to hS2dDetermining h corresponding to the minimum AICc valueS2d;
The second step is that: will be lambda andset to m, then set to μ and hTDetermining the mu and h corresponding to the minimum AICc valueT;
The third step: will be lambda andset to m, then set to delta and hSDEMDetermining the delta and h corresponding to the minimum AICc valueSDEM。
4) The calculation can be used for solving W (u) in a reverse modei,vi,zi,ti) Then, the hat matrix S of the 4D-GTWR model is solved, and finally, a prediction result is obtained
Through the calculation, all parameters in the 4D-GTWR model are determined, the 4D-GTWR model can be finally determined, and when PM2.5 concentration prediction is carried out, predicted meteorological data are substituted into the 4D-GTWR model, so that the PM2.5 concentration of each observation area can be predicted.
The method is verified by specific historical data as follows:
taking a certain observation point as an example for verification, and setting the certain observation point as (u)1,v1,z1,t1) The other two observation points are (u)2,v2,z2,t2) And (u)3,v3,z3,t3) The historical data of independent variables and dependent variables are as shown in table one:
table-history data
The data are only a part of historical data, and at least 100 observation points are needed in the process of determining the 4D-GTWR model, and the data are not listed here because of too much data.
The space-time coordinates of the three observation points are shown as a second table, and the two-dimensional position information is the abscissa and the ordinate in the WGS _1984_ World _ Mercator projection coordinate system:
space-time coordinates of two observation points of table
z | t | u | v |
161(z1) | 49(t1) | 12612250.93(u1) | 4111074.699(v1) |
147(z2) | 49(t2) | 12612591.07(u2) | 4114111.191(v2) |
131(z3) | 49(t3) | 12575408.2(u3) | 4105491.236(v3) |
The historical data is brought into the model, and the accuracy of the model can be known by analyzing the correlation, wherein the correlation analysis comprises R2(correlation coefficient), RMSE (root mean square error), MAE (mean absolute error), R2The larger the RMSE and MAE, the smaller the accuracy of the model. The calculation formula is as follows:
wherein the content of the first and second substances,is an estimated value (predicted value) of the PM2.5 concentration at observation point i,is the average of the observed values of PM2.5 concentrations at observation point i, yiPM2.5 concentration observations for observation point i.
Will be at the topThe historical data are respectively brought into a GTWR model (the GTWR model is the prior art and is not introduced too much), and the result of the GTWR model is obtained, and comprises independent variable regression coefficients of independent variables in each observation point and dependent variable regression fitting vectors of each observation pointThe bandwidth parameter is 2.2895, thenAs a result of linear correlation analysis with the actual vector value y (PM2.5 observed value at each observation point), as shown in fig. 2, a linear correlation analysis curve y is 1.0536x-2.3442 (the curve is a correlation analysis curve, x in the formula is PM2.5 observed value, y is predicted value of PM2.5), and R is obtained2RMSE, MAE, in fig. 2, scattered points have both PM2.5 estimated values (predicted values) and PM2.5 observed values (measured values), the abscissa is PM2.5 (measured values), the ordinate is PM2.5 predicted values, the effect of the model can be seen by the relationship of the scattered points and the linear correlation analysis curve, and the obtained results are shown in table three:
TABLE III GTWR model
In the third table, Min is the minimum value, LQ is the lower quartile, Med is the median, UQ is the upper quartile, Max is the maximum value, the variation of the regression coefficient of each variable along with the empty position is described, intercept is the constant term of the regression coefficient of the independent variable, AOD in the table represents the independent variable regression coefficient corresponding to the independent variable AOD, relative humidity represents the independent variable regression coefficient corresponding to the independent variable relative humidity, air pressure represents the independent variable regression coefficient corresponding to the independent variable air pressure, air temperature represents the independent variable regression coefficient corresponding to the independent variable air temperature, wind speed represents the independent variable regression coefficient corresponding to the independent variable wind speed, wind direction represents the independent variable regression coefficient corresponding to the independent variable wind direction, and rainfall data is 0, so there is no relevant regression coefficient.
Will be at the topThe historical data are respectively substituted into a 4D-GTWR model, and the result of the 4D-GTWR model is obtained and comprises the independent variable regression coefficient of independent variables in each observation point and the dependent variable regression fitting vector of each observation pointThe bandwidth parameter is 0.0063, thenLinear correlation analysis is performed on the actual vector value y (observed value PM2.5 at each observation point), and as a result, as shown in fig. 3, a linear correlation analysis curve y is obtained, wherein R is 1.0253x-1.27492RMSE, MAE, as shown in fig. 3, and fig. 3 is the same as that shown in the abscissa, ordinate, scattered points and curves of fig. 2, and the results are shown in table four, without much description:
The meanings in Table four are the same as those in Table three, and it can be seen from the results that in the GTWR model, R is20.8811, RMSE 12.9579, MAE 9.5815, 4D-GTWR model, where R2R of 0.9496, RMSE 8.5931, MAE 5.9498, 4D-GTWR model2R of GTWR model2Large, the RMSE and MAE of the 4D-GTWR model are smaller than those of the GTWR model, indicating that the 4D-GTWR model is more accurate.
PM2.5 concentration prediction apparatus embodiment:
the PM2.5 concentration prediction apparatus proposed in this embodiment includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the PM2.5 concentration prediction method when executing the computer program.
The specific implementation process of the PM2.5 concentration prediction method is already described in the foregoing embodiment of the PM2.5 concentration prediction method, and is not described herein again.
Claims (9)
1. A PM2.5 concentration prediction method is characterized by comprising the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining parameters of a 4D-GTWR model according to the data, thereby determining the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiThe dependent variable is a dependent variable of an observation point i, and the dependent variable is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing the elevation space coordinate (t)i) Representing a time coordinate; n is the number of observation points;
predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model;
the method for calculating the independent variable regression coefficient comprises the following steps: solving a least squares estimate of the independent variable regression coefficient by establishing an objective function of the observation point i, the objective function being:
wherein, wij STIs a kernel function between observation point i and observation point j;
least squares estimation of the independent variable regression coefficientsComprises the following steps:
wherein, W (u)i,vi,zi,ti) Is a four-dimensional space-time weight matrix composed of wij STAnd (3) forming a matrix, wherein X is an independent variable matrix, and Y is an observed value matrix of a dependent variable.
2. The PM2.5 concentration prediction method according to claim 1, wherein the kernel function is a Gaussian kernel function, and the formula is:
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
3. The PM2.5 concentration prediction method according to claim 2, wherein the elevation information is digital elevation information, and the four-dimensional space-time distance between the observation point i and the observation point j is calculated by the following formula:
wherein the content of the first and second substances,the two-dimensional space distance between the observation point i and the observation point j is obtained;the digital elevation space distance between the observation point i and the observation point j is obtained;the time distance between the observation point i and the observation point j is obtained; lambda, lambda,Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances.
4. The PM2.5 concentration prediction method of claim 3 wherein the four-dimensional space-time bandwidth is based on a two-dimensional space bandwidth hS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
5. The PM2.5 concentration prediction method according to claim 4, characterized in that the two-dimensional spatial bandwidth, the digital elevation spatial bandwidth and the time bandwidth are calculated according to the akage information amount criterion, wherein the akage information amount criterion AICc is:
wherein σ2For an unbiased estimate of random error variance in the 4D-GTWR model, S is the cap moment for the 4D-GTWR modelArray, X1、X2、…、XnAre row vectors consisting of the arguments of observation points 1, 2, …, n, respectively.
6. The PM2.5 concentration prediction method of claim 5, wherein the unbiased estimation of random error variance σ in the 4D-GTWR model2The calculation formula of (2) is as follows:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
wherein the RSS4D-GTWRIs the sum of the squares of the residuals of the 4D-GTWR model.
7. The PM2.5 concentration prediction method according to claim 6, wherein the Residual Sum of Squares (RSS) of the residuals of the 4D-GTWR model4D-GTWRThe calculation formula of (2) is as follows:
RSS4D-GTWR=YT(I-S)T(I-S)Y;
wherein I is an identity matrix.
8. The PM2.5 concentration prediction method according to claim 1, wherein the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
9. A PM2.5 concentration prediction apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the PM2.5 concentration prediction method according to any one of claims 1 to 8 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910340382.5A CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910340382.5A CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110046771A CN110046771A (en) | 2019-07-23 |
CN110046771B true CN110046771B (en) | 2021-04-16 |
Family
ID=67279470
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910340382.5A Active CN110046771B (en) | 2019-04-25 | 2019-04-25 | PM2.5 concentration prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110046771B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111307199A (en) * | 2019-12-05 | 2020-06-19 | 北京普源精电科技有限公司 | Electronic measuring instrument with prediction device and prediction method of electronic measuring instrument |
CN111256745A (en) * | 2020-02-28 | 2020-06-09 | 芜湖职业技术学院 | Data calibration method for portable air quality monitor |
CN111896680B (en) * | 2020-07-08 | 2022-07-05 | 天津师范大学 | Greenhouse gas emission analysis method and system based on satellite remote sensing data |
CN112013822A (en) * | 2020-07-22 | 2020-12-01 | 武汉智图云起科技有限公司 | Multispectral remote sensing water depth inversion method based on improved GWR model |
CN112990609B (en) * | 2021-04-30 | 2021-09-14 | 中国测绘科学研究院 | Air quality prediction method based on space-time bandwidth self-adaptive geographical weighted regression |
CN113538239B (en) * | 2021-07-12 | 2024-03-19 | 浙江大学 | Interpolation method based on space-time autoregressive neural network model |
CN113901348A (en) * | 2021-11-10 | 2022-01-07 | 江苏省血吸虫病防治研究所 | Oncomelania snail distribution influence factor identification and prediction method based on mathematical model |
CN114511087B (en) * | 2022-04-19 | 2022-07-01 | 四川国蓝中天环境科技集团有限公司 | Air quality space inference method and system based on double models |
CN117055087B (en) * | 2023-10-10 | 2024-01-30 | 中国电建集团西北勘测设计研究院有限公司 | GNSS real-time positioning and resolving method for haze influence area |
CN117786618B (en) * | 2024-02-27 | 2024-05-07 | 四川国蓝中天环境科技集团有限公司 | Application method of regional pollution transmission evaluation method in environment control |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106407633B (en) * | 2015-07-30 | 2019-08-13 | 中国科学院遥感与数字地球研究所 | Method and system based on space regression Kriging model estimation ground PM2.5 |
CN105184012B (en) * | 2015-09-28 | 2017-12-22 | 宁波大学 | A kind of regional air PM2.5 concentration prediction methods |
CN106056210B (en) * | 2016-06-07 | 2018-06-01 | 浙江工业大学 | A kind of PM2.5 concentration value Forecasting Methodologies based on hybrid neural networks |
CN107133686A (en) * | 2017-03-30 | 2017-09-05 | 大连理工大学 | City-level PM2.5 concentration prediction methods based on Spatio-Temporal Data Model for Spatial |
CN107103392A (en) * | 2017-05-24 | 2017-08-29 | 北京航空航天大学 | A kind of identification of bus passenger flow influence factor and Forecasting Methodology based on space-time Geographical Weighted Regression |
CN108763756B (en) * | 2018-05-28 | 2021-06-25 | 河南工业大学 | Aerosol optical thickness and PM2.5 inversion correction method and system |
-
2019
- 2019-04-25 CN CN201910340382.5A patent/CN110046771B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110046771A (en) | 2019-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110046771B (en) | PM2.5 concentration prediction method and device | |
Cordero et al. | Using statistical methods to carry out in field calibrations of low cost air quality sensors | |
Banks et al. | Performance evaluation of the boundary-layer height from lidar and the Weather Research and Forecasting model at an urban coastal site in the north-east Iberian Peninsula | |
Zeng et al. | A Regional Gap-Filling Method Based on Spatiotemporal Variogram Model of $\hbox {CO} _ {2} $ Columns | |
CN106407633B (en) | Method and system based on space regression Kriging model estimation ground PM2.5 | |
CN110232471B (en) | Rainfall sensor network node layout optimization method and device | |
Lin et al. | ASCAT wind quality control near rain | |
CN107688906B (en) | Multi-method fused transmission line meteorological element downscaling analysis system and method | |
CN111323352B (en) | Regional PM2.5 remote sensing inversion model fusing fine particulate matter concentration data | |
CN110174359A (en) | A kind of Airborne Hyperspectral image heavy metal-polluted soil concentration evaluation method returned based on Gaussian process | |
CN110595968B (en) | PM2.5 concentration estimation method based on geostationary orbit satellite | |
CN110726431A (en) | Operation method of pollution source analysis system with multipoint air quality detection | |
CN110595960B (en) | PM2.5 concentration remote sensing estimation method based on machine learning | |
CN107909192B (en) | Estimation method and device for heavy metal content in soil | |
CN109579774B (en) | Antenna downward inclination angle measurement method based on depth instance segmentation network | |
CN113901384A (en) | Ground PM2.5 concentration modeling method considering global spatial autocorrelation and local heterogeneity | |
WO2020044127A1 (en) | Atmospheric pollution forecasting method | |
CN110503348B (en) | Individual air pollution exposure simulation measurement method based on position matching | |
CN115166750A (en) | Quantitative precipitation estimation method based on dual-polarization Doppler radar data | |
CN113901348A (en) | Oncomelania snail distribution influence factor identification and prediction method based on mathematical model | |
CN116466368B (en) | Dust extinction coefficient profile estimation method based on laser radar and satellite data | |
CN113049606A (en) | Large-area high-precision insulator pollution distribution assessment method | |
CN117037449A (en) | Group fog monitoring method and system based on edge calculation | |
CN110321528A (en) | A kind of Hyperspectral imaging heavy metal-polluted soil concentration evaluation method based on semi-supervised geographical space regression analysis | |
CN113191536A (en) | Near-ground environment element prediction model training and prediction method based on machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |