CN110046771B - PM2.5 concentration prediction method and device - Google Patents

PM2.5 concentration prediction method and device Download PDF

Info

Publication number
CN110046771B
CN110046771B CN201910340382.5A CN201910340382A CN110046771B CN 110046771 B CN110046771 B CN 110046771B CN 201910340382 A CN201910340382 A CN 201910340382A CN 110046771 B CN110046771 B CN 110046771B
Authority
CN
China
Prior art keywords
observation point
gtwr
model
concentration
dimensional space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910340382.5A
Other languages
Chinese (zh)
Other versions
CN110046771A (en
Inventor
李卫东
董立晔
董前林
赵晨曦
张学海
段金龙
孟凡谦
许向安
张定文
王雪志
侯嘉润
崔永成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology filed Critical Henan University of Technology
Priority to CN201910340382.5A priority Critical patent/CN110046771B/en
Publication of CN110046771A publication Critical patent/CN110046771A/en
Application granted granted Critical
Publication of CN110046771B publication Critical patent/CN110046771B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Computing Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a PM2.5 concentration prediction method and device, and belongs to the technical field of PM2.5 concentration value prediction. The method comprises the following steps: acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data; acquiring two-dimensional position information, elevation information and sampling time information of an observation point; determining the parameters of the 4D-GTWR model according to the data so as to determine the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
Figure DDA0002040515740000011
and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model. The 4D-GTWR model in the method provides an effective method for predicting the PM2.5 concentration, so that the PM2.5 concentration is more accurately predicted, meanwhile, the non-stationarity of four-dimensional space-time is solved, and the important influence of elevation information on the PM2.5 concentration change is verified.

Description

PM2.5 concentration prediction method and device
Technical Field
The invention relates to a PM2.5 concentration prediction method and device, and belongs to the technical field of PM2.5 concentration value prediction.
Background
The fine particulate matter (PM2.5) not only poses serious threats to public health, but also seriously affects urban traffic and the daily life of citizens, so that the PM2.5 is taken as an atmospheric pollutant which directly affects human life and health, thereby arousing wide attention of numerous scholars on PM2.5 related inversion research work and realizing the prediction of PM2.5 concentration.
There are many methods for predicting PM2.5 in the prior art, such as: the Chinese patent application publication No. CN106056210A discloses a PM2.5 concentration value prediction method based on a hybrid neural network, which adopts historical data of PM2.5 concentration values, historical data of related indexes, meteorological historical data and PM2.5 component analysis data, and simulates the change rule of local PM2.5 concentration values through neural network segmentation to realize the prediction of the PM2.5 concentration values.
For another example: zhaoyangyang et al combine collaborative training and space-time geography weighted regression model (GTWR), propose a collaborative space-time geography weighted regression PM2.5 concentration estimation method, its provenance is "mapping science", 2016,41(12): 172-. However, the prediction accuracy of the PM2.5 concentration prediction method in the prior art still needs to be improved.
Disclosure of Invention
The invention aims to provide a PM2.5 concentration prediction method, which is used for solving the problem of low accuracy of the existing prediction method; still provide a PM2.5 concentration prediction device simultaneously for solve the problem that current prediction device's accuracy is low.
In order to achieve the above object, the present invention provides a PM2.5 concentration prediction method, including the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining the parameters of the 4D-GTWR model according to the data so as to determine the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
Figure BDA0002040515720000021
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiThe dependent variable is a dependent variable of an observation point i, and the dependent variable is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing the elevation space coordinate (t)i) Representing a time coordinate; n is the number of observation points;
and predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model.
In addition, the invention also provides a PM2.5 concentration prediction device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the PM2.5 concentration prediction method when executing the computer program.
The beneficial effects are that: elevation information is introduced into a space-time geographic weighted regression model, a four-dimensional space-time geographic weighted regression model (4D-GTWR) is provided, four-dimensional space Euclidean distance measurement is adopted, four-dimensional space-time distribution characteristics can be better reflected, a more real space distance is provided, and therefore effective analysis is conducted on actual conditions, and the 4D-GTWR model provides an effective method in the aspect of predicting PM2.5 concentration, so that the PM2.5 concentration is more accurately predicted, meanwhile, the non-stationarity of four-dimensional space-time is solved, and the fact that the elevation information has important influence on PM2.5 concentration change is verified.
Further, in the method and the apparatus for predicting the PM2.5 concentration, the method for calculating the regression coefficient of the independent variable includes: solving a least squares estimate of the independent variable regression coefficient by establishing an objective function of the observation point i, the objective function being:
Figure BDA0002040515720000022
wherein, wij STIs a kernel function between observation point i and observation point j;
least squares estimation of the independent variable regression coefficients
Figure BDA0002040515720000031
Comprises the following steps:
Figure BDA0002040515720000032
wherein, W (u)i,vi,zi,ti) Is a four-dimensional space-time weight matrix composed of wij STAnd (3) forming a matrix, wherein X is an independent variable matrix, and Y is an observed value matrix of a dependent variable.
The beneficial effects are that: the solution of the independent variable regression coefficient is carried out by establishing the objective function, and the method is simple and accurate.
Further, in the method and the device for predicting the PM2.5 concentration, the kernel function is a Gaussian kernel function, and the formula is as follows:
Figure BDA0002040515720000033
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
The beneficial effects are that: the four-dimensional space-time weight matrix can be more accurately obtained through the Gaussian kernel function.
Further, in the method and the device for predicting the PM2.5 concentration, the elevation information is digital elevation information, and a calculation formula of a four-dimensional space-time distance between the observation point i and the observation point j is as follows:
Figure BDA0002040515720000034
wherein the content of the first and second substances,
Figure BDA0002040515720000035
the two-dimensional space distance between the observation point i and the observation point j is obtained;
Figure BDA0002040515720000036
the digital elevation space distance between the observation point i and the observation point j is obtained;
Figure BDA0002040515720000037
the time distance between the observation point i and the observation point j is obtained; lambda, lambda,
Figure BDA0002040515720000038
Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances.
The beneficial effects are that: the method for calculating the four-dimensional space-time distance is simple and accurate.
Further, in the PM2.5 concentration prediction method and apparatus, the four-dimensional space-time bandwidth is a two-dimensional space bandwidth h based onS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
The beneficial effects are that: and respectively solving the two-dimensional space bandwidth, the digital elevation space bandwidth and the time bandwidth, and then obtaining the four-dimensional space-time bandwidth according to the two-dimensional space bandwidth, the digital elevation space bandwidth, the time bandwidth and the four-dimensional space-time bandwidth, so that the obtained four-dimensional space-time bandwidth is more accurate.
Further, in the PM2.5 concentration prediction method and apparatus, a two-dimensional spatial bandwidth, a digital elevation spatial bandwidth, and a time bandwidth are calculated according to a akage pool information amount criterion, where the akage pool information amount criterion AICc is:
Figure BDA0002040515720000041
Figure BDA0002040515720000042
wherein σ2Is an unbiased estimate of random error variance in the 4D-GTWR model, S is the cap matrix of the 4D-GTWR model, X1、X2、…、XnAre row vectors consisting of the arguments of observation points 1, 2, …, n, respectively.
The beneficial effects are that: more accurate bandwidth data can be obtained through the Chichi information quantity criterion, and the accuracy of PM2.5 prediction is further improved.
Furthermore, in the method and the device for predicting the PM2.5 concentration, the unbiased estimation sigma of the random error variance in the 4D-GTWR model2The calculation formula of (2) is as follows:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
wherein the RSS4D-GTWRIs the sum of the squares of the residuals of the 4D-GTWR model.
Further, in the PM2.5 concentration prediction method and apparatus, the 4DResidual sum of squares RSS for GTWR model4D-GTWRThe calculation formula of (2) is as follows:
RSS4D-GTWR=YT(I-S)T(I-S)Y;
wherein I is an identity matrix.
Further, in the PM2.5 concentration prediction method and apparatus, the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
The beneficial effects are that: the meteorological data are all related to the concentration of PM2.5, and the concentration of PM2.5 can be more comprehensively predicted through the meteorological data.
Drawings
FIG. 1 is a schematic diagram of a four-dimensional spatiotemporal distance between an observation point i and an observation point j according to the present invention;
FIG. 2 is a diagram of a PM2.5 estimated value and an observed value using a GTWR model in the prior art;
FIG. 3 is a diagram of the relationship between the PM2.5 estimated value and the observed value by using the 4D-GTWR model.
Detailed Description
Embodiment of PM2.5 concentration prediction method:
the method for predicting the concentration of PM2.5 provided by the embodiment includes the following steps:
1) and acquiring historical PM2.5 concentration data, meteorological historical data and historical data of the optical thickness of the atmospheric aerosol.
Different gas phases and different atmospheric aerosol optical thicknesses (AOD data) will have different effects on PM2.5 concentration.
In the present embodiment, the weather history data includes weather factors such as air pressure (h Pa), air temperature (°), relative humidity (%), rainfall (mm), wind speed (km/h), and wind direction (°). However, since the wind speed, wind direction, and relative humidity greatly affect the PM2.5 concentration, only the wind speed, wind direction, and relative humidity may be used in the meteorological history data.
The PM2.5 concentration historical data is obtained by detecting an automatic air quality pollution monitoring station (hereinafter referred to as a monitoring station) and is PM2.5 mean concentration monitoring data per hour, and the meteorological historical data is obtained by detecting a meteorological station and is mean data per hour. The data can be downloaded from a national weather information center website in China, and the value of the weather factor of the PM2.5 monitoring site is interpolated from the data of the surrounding weather sites by Krigin due to the difference between the geographical positions of the weather sites and the PM2.5 monitoring site.
The AOD data is obtained by inversion of a medium resolution imaging spectrometer (MODIS), the MYD 04_3K data is subjected to geometric correction by adopting IDL program design and is converted into a WGS-84 geographical coordinate system, and the AOD data which is matched with a PM2.5 detection station in a time-space mode is extracted by utilizing Arcpy. MODIS is a common sensor for inverting the optical thickness of atmospheric aerosol, has a scanning width of 2330km, and can obtain global observation data at least once a day. The MODIS has 36 spectral channels, the spectral range is 0.4-14 μm, the spatial resolution can be 250m, 500m and 1000m, and the MODIS can be used for acquiring data of AOD, steam, surface temperature, ocean and the like. AOD data in MODIS data is provided for data download by NASA LAADS.
2) Two-dimensional position information, digital elevation information (i.e., DEM information), and sampling time information of the observation point are acquired.
The two-dimensional position information is position information of an observation point, the sampling time is 2017.12-2018.2, digital elevation information (namely DEM information) is adopted as height information of a three-dimensional space in the embodiment, DEM data (namely DEM information) is derived from a geographic space data cloud platform of a computer network information center of the Chinese academy of sciences, the spatial resolution is 30m, and the DEM data is 2009 data. As other embodiments, other elevation information may be used, as the invention is not limited in this respect.
3) And determining the parameters of the 4D-GTWR model according to the data, thereby determining the 4D-GTWR model.
The 4D-GTWR model is as follows:
Figure BDA0002040515720000061
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiTo watchA dependent variable of a measuring point i is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing digital elevation space coordinates (t)i) Representing a time coordinate; n is the number of observation points.
Solving the model mainly comprises solving the independent variable regression coefficient betaik(ui,vi,zi,ti) Solution of, xiiFor random errors at observation point i, obey xii-N(0,ω2) Independently and identically distributed, and Cov (ξ)ij) 0(i ≠ j) (i.e., the random variable ξ)iAnd xijHave the same probability distribution and are independent of each other).
Least squares estimation of the independent variable regression coefficients of observation point i as
Figure BDA0002040515720000062
Is formed by betaik(ui,vi,zi,ti) Constituent matrix, betai0(ui,vi,zi,ti) Is that
Figure BDA0002040515720000063
And estimating the 4D-GTWR model according to a weighted least square criterion for the value of the first column in the matrix, and respectively establishing an objective function for each observation point. The objective function for observation point i is as follows:
Figure BDA0002040515720000064
wherein, wij STIs a kernel function (i.e., weight) between observation point i and observation point j, and is related to the four-dimensional space-time distance.
Least squares estimation of independent variable regression coefficients
Figure BDA0002040515720000071
Can be expressed as:
Figure BDA0002040515720000072
Figure BDA0002040515720000073
wherein, W (u)i,vi,zi,ti) Representing a four-dimensional space-time weight matrix, is represented by wij STA matrix composed of X independent variable matrix with 1 in X matrix being betai0Corresponding independent variable xi0The general value is 1, Y is the observation value matrix (here, the observed actual PM2.5 value), and the dependent variable estimation value of the observation point i can be obtained through the above calculation
Figure BDA0002040515720000074
Comprises the following steps:
Figure BDA0002040515720000075
wherein, XiThe vector representing the ith row of the matrix X (i.e., the row vector consisting of the arguments of observation point i). Thus, the dependent variable regression vector (i.e., the predicted outcome) at each observation point
Figure BDA0002040515720000076
Comprises the following steps:
Figure BDA0002040515720000077
where S is the hat matrix of the 4D-GTWR model.
In this example wij STThe kernel function is a Gaussian kernel function, as a further embodiment, wij STThe kernel function may also be a Bi-square type kernel function, and the invention is not limited to the type of kernel function.
The Gaussian kernel function formula is:
Figure BDA0002040515720000078
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
The invention provides an analysis and calculation method of four-dimensional space-time distance in consideration of the heterogeneity of four-dimensional space and one-dimensional time, and a construction method of the time distance and the three-dimensional space distance of a 4D-GTWR model is explained below.
Considering that the DEM mainly has certain influence on the spatial dimension, based on the research on the PM2.5 space-time variation, the pollution process of the pollutants such as the PM2.5 and the like has four-dimensional space-time regional variation, namely, four-dimensional space-time non-stationarity variation exists. Therefore, for any number of monitoring points, three-dimensional spatial coordinates are differentiated between them. Considering that two-three-dimensional space may have different scale effects, the 4D-GTWR model is introduced
Figure BDA00020405157200000811
To represent the difference in scale between them. Therefore, as can be seen from the euclidean distance, the three-dimensional distance d between the observation point i and the observation point j is obtainedij sCan be expressed as follows:
Figure BDA0002040515720000081
wherein the content of the first and second substances,
Figure BDA0002040515720000082
various operators may be represented. On the basis, a space-time distance construction method is fused, and the four-dimensional space-time distance is expressed as follows:
Figure BDA0002040515720000083
scale effect symbols
Figure BDA0002040515720000084
And
Figure BDA0002040515720000085
in general, the four-dimensional spatio-temporal distance D between the observation point i (i.e. the regression point in fig. 1) and the observation point j (i.e. the proximity point in fig. 1) in the 4D-GTWR model can be obtained by linear combination of the four-dimensional spatio-temporal distances, as shown in fig. 1ij STIs represented as follows:
Figure BDA0002040515720000086
wherein the content of the first and second substances,
Figure BDA0002040515720000087
the two-dimensional space distance between the observation point i and the observation point j is obtained;
Figure BDA0002040515720000088
a digital elevation space distance between the observation point i and the observation point j;
Figure BDA0002040515720000089
the time distance between observation point i and observation point j, λ,
Figure BDA00020405157200000810
Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances. Here u pairsThe X-axis of the coordinates in the graph, v corresponds to the Y-axis of the coordinates in the graph, Z corresponds to the Z-axis of the coordinates in the graph, and T corresponds to the T-axis of the coordinates in the graph.
In the kernel function, the selection of bandwidth parameters has a great influence on the fitting result of the model, the over-fitting of the estimation result can be caused if the bandwidth is too small, and the inaccurate estimation result can be caused if the bandwidth is too large. Therefore, selecting appropriate bandwidth parameters is crucial to the accuracy of model estimation. Applied to the present embodiment, that is, the four-dimensional space-time bandwidth h4D-STThe accuracy of the 4D-GTWR model estimation is crucial. Four-dimensional space-time bandwidth h4D-STIs based on a two-dimensional spatial bandwidth hS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
D to the aboveij STBringing in
Figure BDA0002040515720000091
It can be derived that:
Figure BDA0002040515720000092
Figure BDA0002040515720000093
wherein, wij S2d、wij SDEM、wij TThe two-dimensional spatial weight, the DEM spatial weight and the temporal weight are respectively expressed, so that the four-dimensional space-time weight matrix of the 4D-GTWR model is expressed as follows:
Figure BDA0002040515720000094
from the above W (u)i,vi,zi,ti) The expression shows that the four-dimensional space-time weight matrix is related to two-dimensional space weight, DEM space weight and time weight, and is used for solving
Figure BDA0002040515720000095
Needs to solve for hS2d、hSDEM、hT、λ、
Figure BDA0002040515720000096
δ、μ。
In this embodiment, h is calculated based on the Chichi information criterion (AICc)S2d、hSDEM、hT、λ、
Figure BDA0002040515720000101
δ and μmay be calculated according to CV criteria (cross validation method) as another embodiment, and the calculation process of the bandwidth and the scaling factor is not limited by the present invention.
The calculation process of AICc is:
Figure BDA0002040515720000102
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
RSS4D-GTWR=YT(I-S)T(I-S)Y;
Figure BDA0002040515720000103
Figure BDA0002040515720000104
Figure BDA0002040515720000105
Figure BDA0002040515720000106
wherein σ2As random errors in the 4D-GTWR modelUnbiased estimation of difference variance, RSS4D-GTWRIs the residual sum of squares of the 4D-GTWR model, and I is the identity matrix.
RSS4D-GTWRThe specific calculation process is as follows:
Figure BDA0002040515720000107
wherein the content of the first and second substances,
Figure BDA0002040515720000108
assumed fitting value
Figure BDA0002040515720000109
Is E (y)i) Unbiased estimation of, i.e.
Figure BDA00020405157200001010
Then:
Figure BDA0002040515720000111
and E (ε)Tε)=σ2I. Then RSS4D-GTWRThe table can also be represented as:
Figure BDA0002040515720000112
wherein the degree of freedom fd is n-2tr (S) + tr (S)TS) so as to obtain unbiased variance estimation sigma of random error terms in the 4D-GTWR model2Comprises the following steps:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS))。
as can be seen from the above formula, the hat matrix S of the 4D-GTWR model is related to the unknown parameter W (u)i,vi,zi,ti) Of the matrix of (c), then AICc is with respect to W (u) onlyi,vi,zi,ti) A function of, however, W (u)i,vi,zi,ti) Is a function of the respective bandwidth, so that the final AICc is offAnd solving each bandwidth and scale adjustment factor corresponding to the minimum AICc value in the function of each bandwidth, wherein each bandwidth and scale adjustment factor are required.
The scale adjustment factor lambda,
Figure BDA0002040515720000113
Relating to two-dimensional space, λ, δ to DEM space, μ to time, in general, λ, δ,
Figure BDA0002040515720000114
Set as a constant m, in the process of calculating δ, μ and each bandwidth, δ, μ and each bandwidth range are first set (the bandwidth range setting is based on the fact that it is continuously tested in the calculation process, and is related to the four-dimensional space-time distance).
The specific solving process is as follows:
the first step is as follows: will be lambda and
Figure BDA0002040515720000115
set to m, then set to hS2dDetermining h corresponding to the minimum AICc valueS2d
The second step is that: will be lambda and
Figure BDA0002040515720000116
set to m, then set to μ and hTDetermining the mu and h corresponding to the minimum AICc valueT
The third step: will be lambda and
Figure BDA0002040515720000117
set to m, then set to delta and hSDEMDetermining the delta and h corresponding to the minimum AICc valueSDEM
4) The calculation can be used for solving W (u) in a reverse modei,vi,zi,ti) Then, the hat matrix S of the 4D-GTWR model is solved, and finally, a prediction result is obtained
Figure BDA0002040515720000118
Through the calculation, all parameters in the 4D-GTWR model are determined, the 4D-GTWR model can be finally determined, and when PM2.5 concentration prediction is carried out, predicted meteorological data are substituted into the 4D-GTWR model, so that the PM2.5 concentration of each observation area can be predicted.
The method is verified by specific historical data as follows:
taking a certain observation point as an example for verification, and setting the certain observation point as (u)1,v1,z1,t1) The other two observation points are (u)2,v2,z2,t2) And (u)3,v3,z3,t3) The historical data of independent variables and dependent variables are as shown in table one:
table-history data
Figure BDA0002040515720000121
The data are only a part of historical data, and at least 100 observation points are needed in the process of determining the 4D-GTWR model, and the data are not listed here because of too much data.
The space-time coordinates of the three observation points are shown as a second table, and the two-dimensional position information is the abscissa and the ordinate in the WGS _1984_ World _ Mercator projection coordinate system:
space-time coordinates of two observation points of table
z t u v
161(z1) 49(t1) 12612250.93(u1) 4111074.699(v1)
147(z2) 49(t2) 12612591.07(u2) 4114111.191(v2)
131(z3) 49(t3) 12575408.2(u3) 4105491.236(v3)
The historical data is brought into the model, and the accuracy of the model can be known by analyzing the correlation, wherein the correlation analysis comprises R2(correlation coefficient), RMSE (root mean square error), MAE (mean absolute error), R2The larger the RMSE and MAE, the smaller the accuracy of the model. The calculation formula is as follows:
Figure BDA0002040515720000131
wherein the content of the first and second substances,
Figure BDA0002040515720000132
is an estimated value (predicted value) of the PM2.5 concentration at observation point i,
Figure BDA0002040515720000133
is the average of the observed values of PM2.5 concentrations at observation point i, yiPM2.5 concentration observations for observation point i.
Will be at the topThe historical data are respectively brought into a GTWR model (the GTWR model is the prior art and is not introduced too much), and the result of the GTWR model is obtained, and comprises independent variable regression coefficients of independent variables in each observation point and dependent variable regression fitting vectors of each observation point
Figure BDA0002040515720000134
The bandwidth parameter is 2.2895, then
Figure BDA0002040515720000135
As a result of linear correlation analysis with the actual vector value y (PM2.5 observed value at each observation point), as shown in fig. 2, a linear correlation analysis curve y is 1.0536x-2.3442 (the curve is a correlation analysis curve, x in the formula is PM2.5 observed value, y is predicted value of PM2.5), and R is obtained2RMSE, MAE, in fig. 2, scattered points have both PM2.5 estimated values (predicted values) and PM2.5 observed values (measured values), the abscissa is PM2.5 (measured values), the ordinate is PM2.5 predicted values, the effect of the model can be seen by the relationship of the scattered points and the linear correlation analysis curve, and the obtained results are shown in table three:
TABLE III GTWR model
Figure BDA0002040515720000136
In the third table, Min is the minimum value, LQ is the lower quartile, Med is the median, UQ is the upper quartile, Max is the maximum value, the variation of the regression coefficient of each variable along with the empty position is described, intercept is the constant term of the regression coefficient of the independent variable, AOD in the table represents the independent variable regression coefficient corresponding to the independent variable AOD, relative humidity represents the independent variable regression coefficient corresponding to the independent variable relative humidity, air pressure represents the independent variable regression coefficient corresponding to the independent variable air pressure, air temperature represents the independent variable regression coefficient corresponding to the independent variable air temperature, wind speed represents the independent variable regression coefficient corresponding to the independent variable wind speed, wind direction represents the independent variable regression coefficient corresponding to the independent variable wind direction, and rainfall data is 0, so there is no relevant regression coefficient.
Will be at the topThe historical data are respectively substituted into a 4D-GTWR model, and the result of the 4D-GTWR model is obtained and comprises the independent variable regression coefficient of independent variables in each observation point and the dependent variable regression fitting vector of each observation point
Figure BDA0002040515720000141
The bandwidth parameter is 0.0063, then
Figure BDA0002040515720000142
Linear correlation analysis is performed on the actual vector value y (observed value PM2.5 at each observation point), and as a result, as shown in fig. 3, a linear correlation analysis curve y is obtained, wherein R is 1.0253x-1.27492RMSE, MAE, as shown in fig. 3, and fig. 3 is the same as that shown in the abscissa, ordinate, scattered points and curves of fig. 2, and the results are shown in table four, without much description:
TABLE IV 4D-GTWR model
Figure BDA0002040515720000143
The meanings in Table four are the same as those in Table three, and it can be seen from the results that in the GTWR model, R is20.8811, RMSE 12.9579, MAE 9.5815, 4D-GTWR model, where R2R of 0.9496, RMSE 8.5931, MAE 5.9498, 4D-GTWR model2R of GTWR model2Large, the RMSE and MAE of the 4D-GTWR model are smaller than those of the GTWR model, indicating that the 4D-GTWR model is more accurate.
PM2.5 concentration prediction apparatus embodiment:
the PM2.5 concentration prediction apparatus proposed in this embodiment includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the PM2.5 concentration prediction method when executing the computer program.
The specific implementation process of the PM2.5 concentration prediction method is already described in the foregoing embodiment of the PM2.5 concentration prediction method, and is not described herein again.

Claims (9)

1. A PM2.5 concentration prediction method is characterized by comprising the following steps:
acquiring PM2.5 concentration historical data, meteorological historical data and atmospheric aerosol optical thickness historical data;
acquiring two-dimensional position information, elevation information and sampling time information of an observation point;
determining parameters of a 4D-GTWR model according to the data, thereby determining the 4D-GTWR model, wherein the 4D-GTWR model is as follows:
Figure FDA0002963639180000011
wherein x isikP independent variables are the k independent variables of the observation point i, and the independent variables are meteorological data; y isiThe dependent variable is a dependent variable of an observation point i, and the dependent variable is PM2.5 concentration data; the parameters of the 4D-GTWR model are as follows: beta is aik(ui,vi,zi,ti)、βi0(ui,vi,zi,ti) And xii,βik(ui,vi,zi,ti) The independent variable regression coefficient of the kth independent variable of the observation point i is related to the space position of the observation point i; beta is ai0(ui,vi,zi,ti) An independent variable regression coefficient constant term of the observation point i; xiiIs the random error of observation point i; (u)i,vi,zi,ti) Is the four-dimensional space coordinate of observation point i, where (u)i,vi) Representing two-dimensional spatial coordinates, (z)i) Representing the elevation space coordinate (t)i) Representing a time coordinate; n is the number of observation points;
predicting the concentration of PM2.5 according to the predicted meteorological data, the atmospheric aerosol optical thickness data and the determined 4D-GTWR model;
the method for calculating the independent variable regression coefficient comprises the following steps: solving a least squares estimate of the independent variable regression coefficient by establishing an objective function of the observation point i, the objective function being:
Figure FDA0002963639180000012
wherein, wij STIs a kernel function between observation point i and observation point j;
least squares estimation of the independent variable regression coefficients
Figure FDA0002963639180000013
Comprises the following steps:
Figure FDA0002963639180000014
wherein, W (u)i,vi,zi,ti) Is a four-dimensional space-time weight matrix composed of wij STAnd (3) forming a matrix, wherein X is an independent variable matrix, and Y is an observed value matrix of a dependent variable.
2. The PM2.5 concentration prediction method according to claim 1, wherein the kernel function is a Gaussian kernel function, and the formula is:
Figure FDA0002963639180000021
wherein h is4D-STIs a four-dimensional space-time bandwidth, dij STIs the four-dimensional space-time distance between the observation point i and the observation point j.
3. The PM2.5 concentration prediction method according to claim 2, wherein the elevation information is digital elevation information, and the four-dimensional space-time distance between the observation point i and the observation point j is calculated by the following formula:
Figure FDA0002963639180000022
wherein the content of the first and second substances,
Figure FDA0002963639180000023
the two-dimensional space distance between the observation point i and the observation point j is obtained;
Figure FDA0002963639180000024
the digital elevation space distance between the observation point i and the observation point j is obtained;
Figure FDA0002963639180000025
the time distance between the observation point i and the observation point j is obtained; lambda, lambda,
Figure FDA0002963639180000026
Delta and mu are scale adjustment factors used for balancing the scale difference of different four-dimensional space-time distances.
4. The PM2.5 concentration prediction method of claim 3 wherein the four-dimensional space-time bandwidth is based on a two-dimensional space bandwidth hS2dDigital elevation space bandwidth hSDEMAnd time bandwidth hTAnd (4) obtaining the product.
5. The PM2.5 concentration prediction method according to claim 4, characterized in that the two-dimensional spatial bandwidth, the digital elevation spatial bandwidth and the time bandwidth are calculated according to the akage information amount criterion, wherein the akage information amount criterion AICc is:
Figure FDA0002963639180000027
Figure FDA0002963639180000028
wherein σ2For an unbiased estimate of random error variance in the 4D-GTWR model, S is the cap moment for the 4D-GTWR modelArray, X1、X2、…、XnAre row vectors consisting of the arguments of observation points 1, 2, …, n, respectively.
6. The PM2.5 concentration prediction method of claim 5, wherein the unbiased estimation of random error variance σ in the 4D-GTWR model2The calculation formula of (2) is as follows:
σ2=RSS4D-GTWR/(n-2tr(S)+tr(STS));
wherein the RSS4D-GTWRIs the sum of the squares of the residuals of the 4D-GTWR model.
7. The PM2.5 concentration prediction method according to claim 6, wherein the Residual Sum of Squares (RSS) of the residuals of the 4D-GTWR model4D-GTWRThe calculation formula of (2) is as follows:
RSS4D-GTWR=YT(I-S)T(I-S)Y;
wherein I is an identity matrix.
8. The PM2.5 concentration prediction method according to claim 1, wherein the meteorological data includes at least air pressure, air temperature, relative humidity, rainfall, wind speed, and wind direction.
9. A PM2.5 concentration prediction apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the PM2.5 concentration prediction method according to any one of claims 1 to 8 when executing the computer program.
CN201910340382.5A 2019-04-25 2019-04-25 PM2.5 concentration prediction method and device Active CN110046771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910340382.5A CN110046771B (en) 2019-04-25 2019-04-25 PM2.5 concentration prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910340382.5A CN110046771B (en) 2019-04-25 2019-04-25 PM2.5 concentration prediction method and device

Publications (2)

Publication Number Publication Date
CN110046771A CN110046771A (en) 2019-07-23
CN110046771B true CN110046771B (en) 2021-04-16

Family

ID=67279470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910340382.5A Active CN110046771B (en) 2019-04-25 2019-04-25 PM2.5 concentration prediction method and device

Country Status (1)

Country Link
CN (1) CN110046771B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111307199A (en) * 2019-12-05 2020-06-19 北京普源精电科技有限公司 Electronic measuring instrument with prediction device and prediction method of electronic measuring instrument
CN111256745A (en) * 2020-02-28 2020-06-09 芜湖职业技术学院 Data calibration method for portable air quality monitor
CN111896680B (en) * 2020-07-08 2022-07-05 天津师范大学 Greenhouse gas emission analysis method and system based on satellite remote sensing data
CN112013822A (en) * 2020-07-22 2020-12-01 武汉智图云起科技有限公司 Multispectral remote sensing water depth inversion method based on improved GWR model
CN112990609B (en) * 2021-04-30 2021-09-14 中国测绘科学研究院 Air quality prediction method based on space-time bandwidth self-adaptive geographical weighted regression
CN113538239B (en) * 2021-07-12 2024-03-19 浙江大学 Interpolation method based on space-time autoregressive neural network model
CN113901348A (en) * 2021-11-10 2022-01-07 江苏省血吸虫病防治研究所 Oncomelania snail distribution influence factor identification and prediction method based on mathematical model
CN114511087B (en) * 2022-04-19 2022-07-01 四川国蓝中天环境科技集团有限公司 Air quality space inference method and system based on double models
CN117055087B (en) * 2023-10-10 2024-01-30 中国电建集团西北勘测设计研究院有限公司 GNSS real-time positioning and resolving method for haze influence area
CN117786618B (en) * 2024-02-27 2024-05-07 四川国蓝中天环境科技集团有限公司 Application method of regional pollution transmission evaluation method in environment control

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407633B (en) * 2015-07-30 2019-08-13 中国科学院遥感与数字地球研究所 Method and system based on space regression Kriging model estimation ground PM2.5
CN105184012B (en) * 2015-09-28 2017-12-22 宁波大学 A kind of regional air PM2.5 concentration prediction methods
CN106056210B (en) * 2016-06-07 2018-06-01 浙江工业大学 A kind of PM2.5 concentration value Forecasting Methodologies based on hybrid neural networks
CN107133686A (en) * 2017-03-30 2017-09-05 大连理工大学 City-level PM2.5 concentration prediction methods based on Spatio-Temporal Data Model for Spatial
CN107103392A (en) * 2017-05-24 2017-08-29 北京航空航天大学 A kind of identification of bus passenger flow influence factor and Forecasting Methodology based on space-time Geographical Weighted Regression
CN108763756B (en) * 2018-05-28 2021-06-25 河南工业大学 Aerosol optical thickness and PM2.5 inversion correction method and system

Also Published As

Publication number Publication date
CN110046771A (en) 2019-07-23

Similar Documents

Publication Publication Date Title
CN110046771B (en) PM2.5 concentration prediction method and device
Cordero et al. Using statistical methods to carry out in field calibrations of low cost air quality sensors
Banks et al. Performance evaluation of the boundary-layer height from lidar and the Weather Research and Forecasting model at an urban coastal site in the north-east Iberian Peninsula
Zeng et al. A Regional Gap-Filling Method Based on Spatiotemporal Variogram Model of $\hbox {CO} _ {2} $ Columns
CN106407633B (en) Method and system based on space regression Kriging model estimation ground PM2.5
CN110232471B (en) Rainfall sensor network node layout optimization method and device
Lin et al. ASCAT wind quality control near rain
CN107688906B (en) Multi-method fused transmission line meteorological element downscaling analysis system and method
CN111323352B (en) Regional PM2.5 remote sensing inversion model fusing fine particulate matter concentration data
CN110174359A (en) A kind of Airborne Hyperspectral image heavy metal-polluted soil concentration evaluation method returned based on Gaussian process
CN110595968B (en) PM2.5 concentration estimation method based on geostationary orbit satellite
CN110726431A (en) Operation method of pollution source analysis system with multipoint air quality detection
CN110595960B (en) PM2.5 concentration remote sensing estimation method based on machine learning
CN107909192B (en) Estimation method and device for heavy metal content in soil
CN109579774B (en) Antenna downward inclination angle measurement method based on depth instance segmentation network
CN113901384A (en) Ground PM2.5 concentration modeling method considering global spatial autocorrelation and local heterogeneity
WO2020044127A1 (en) Atmospheric pollution forecasting method
CN110503348B (en) Individual air pollution exposure simulation measurement method based on position matching
CN115166750A (en) Quantitative precipitation estimation method based on dual-polarization Doppler radar data
CN113901348A (en) Oncomelania snail distribution influence factor identification and prediction method based on mathematical model
CN116466368B (en) Dust extinction coefficient profile estimation method based on laser radar and satellite data
CN113049606A (en) Large-area high-precision insulator pollution distribution assessment method
CN117037449A (en) Group fog monitoring method and system based on edge calculation
CN110321528A (en) A kind of Hyperspectral imaging heavy metal-polluted soil concentration evaluation method based on semi-supervised geographical space regression analysis
CN113191536A (en) Near-ground environment element prediction model training and prediction method based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant