CN117332906B - Machine learning-based three-dimensional space-time grid air quality prediction method and system - Google Patents

Info

Publication number
CN117332906B
Authority
CN
China
Prior art keywords
data
dimensional space
machine learning
model
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311628815.XA
Other languages
Chinese (zh)
Other versions
CN117332906A (en)
Inventor
王新锋
韩子祯
关天奕
辛鑫
宋晓萌
王一丹
张庆竹
任鹏杰
陈竹敏
王桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202311628815.XA
Publication of CN117332906A
Application granted
Publication of CN117332906B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention relates to a machine learning-based three-dimensional space-time grid air quality prediction method and system in the technical field of air quality prediction. The method comprises the following steps: acquiring meteorological data and source emission list data at different heights, preprocessing the meteorological data, and constructing a three-dimensional space-time grid data set from the preprocessed meteorological data and the source emission list data; comparing the predicted values of the machine learning composite model with the measured values, and correcting the machine learning composite model according to the comparison result; and predicting the air quality of the region to be detected with the machine learning composite model, and evaluating the local emission contribution and the emission reduction effect through scene simulation. The invention can better show the space-time distribution of atmospheric pollutants over a regional spatial range in a future time period, and can quantitatively evaluate, to a certain extent, the contribution of local source emissions and the emission reduction effect.

Description

Machine learning-based three-dimensional space-time grid air quality prediction method and system
Technical Field
The invention relates to the technical field of air quality prediction, in particular to a three-dimensional space-time grid air quality prediction method and system based on machine learning.
Background
With accelerating urbanization and industrial production, PM2.5 and O3 pollution in the atmosphere has become severe. Air pollution has an obvious impact on human health and the ecological environment and has become a prominent environmental problem of public concern. Predicting future air quality in order to trace emission sources or evaluate the effect of control measures is therefore of great significance to human society.
The continuous development of computer technology helps improve the accuracy of regional air quality prediction and shorten the prediction time. In recent years, air quality prediction models have gradually developed from physical modeling and data-driven models to machine learning models. Conventional regional air quality models based on the physical and chemical mechanisms of the atmosphere often suffer from heavy computation, high resource consumption, long run times and low spatial resolution. Existing big data-based air quality prediction websites or platforms mostly output near-ground two-dimensional results: they do not predict the three-dimensional space-time grid distribution of air quality well, have difficulty screening and evaluating elevated pollution sources, generally do not directly consider the influence of source emissions, and cannot quantitatively evaluate the local emission contribution or the emission reduction effect. It is therefore worthwhile to explore air quality prediction methods based on data-driven machine learning with three-dimensional spatio-temporal feature extraction.
A search of existing patents and related technologies shows that existing air quality prediction methods include the following:
(1) Chinese patent publication No. CN 111340288A discloses a time series prediction method for urban air quality that takes space-time correlation into account. The method uses a random forest model to predict the PM2.5 concentration 1-12 hours ahead. A space-time correlation cube is introduced to extract spatio-temporal information, and a coupled singular spectrum analysis and random forest model is designed to fit the air quality of the future stage. However, the data source used in that patent is single (monitoring station data only), and prediction of air quality in the vertical direction is lacking.
(2) Chinese patent publication No. CN 115730684A discloses an air quality detection system based on an LSTM-CNN model. The system expands the vertical profile of satellite data using the LSTM-CNN model and an interpolation method to obtain stereoscopic observation data, raising the horizontal and vertical resolution to more than 2 times and 4 times that of the original observation data, respectively. However, that system can only estimate the current-day O3 and PM2.5 from historical data and cannot predict pollutant concentrations over a future time period. In addition, the LSTM-CNN model is often time-consuming when processing computation-heavy data, and the prediction result is not corrected after the prediction is finished.
It can therefore be seen that although current machine learning-based air quality prediction models can simulate air pollutants in different time periods or in different horizontal and vertical spaces, they lack comprehensive prediction of air quality in three-dimensional space over a future period and lack assimilation and fusion of multi-source data sets, so accurate prediction is difficult to achieve.
Disclosure of Invention
Aiming at the defects existing in the prior art, the invention aims to provide a three-dimensional space-time grid air quality prediction method and system based on machine learning, which can well reproduce horizontal advection and vertical diffusion processes and continuously predict air quality in future time periods of regional space ranges.
In order to achieve the above object, the present invention adopts the following technical scheme:
A first aspect of the invention provides a machine learning-based three-dimensional space-time grid air quality prediction method, which comprises the following steps:
acquiring meteorological data and source emission list data with different heights, preprocessing the meteorological data, and constructing a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data;
training a machine learning composite model by utilizing a three-dimensional space-time grid data set, wherein a random forest algorithm is firstly adopted to train an ozone number density model and an optical thickness model, prediction is carried out according to the trained model, a prediction result and an actual measurement value are filled into the three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network, so that the machine learning composite model is obtained;
comparing the predicted value and the actual measured value of the machine learning composite model, and correcting the machine learning composite model according to the comparison result;
and predicting the air quality of the region to be detected by using a machine learning composite model, and evaluating the local emission contribution and the emission reduction effect through scene simulation.
Further, the three-dimensional space-time grid data set is a set of data sets with time periods of warm seasons and cold seasons, the total time period of the three-dimensional space-time grid data set is more than or equal to 2 weeks, and the time difference between the three-dimensional space-time grid data set and the prediction day is less than or equal to 2 years.
Furthermore, the training of the atmospheric pollutant model adopts a cross network or deep neural network training model when the continuous data quantity of the three-dimensional space-time grid data set time sequence exceeds 1000, and adopts a random forest training model when the continuous data quantity of the three-dimensional space-time grid data set time sequence is lower than 1000.
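The selection rule above can be sketched as a small helper. This is a minimal illustration, not the patent's implementation: the function name and return labels are hypothetical, and because the text does not say what happens at exactly 1000 samples, the sketch assumes the random forest is used in that case.

```python
def choose_pollutant_model(n_continuous_samples: int, threshold: int = 1000) -> str:
    """Pick the model family for the atmospheric pollutant model.

    The threshold of 1000 comes from the text; the behaviour at exactly
    1000 samples is an assumption (random forest).
    """
    if n_continuous_samples > threshold:
        # Long continuous time series: cross network or deep neural network.
        return "cross_network_or_dnn"
    # Short time series: random forest.
    return "random_forest"


long_series_choice = choose_pollutant_model(1500)
short_series_choice = choose_pollutant_model(200)
```
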
Further, when the meteorological data and source emission list data at different heights are acquired, meteorological data and source emission list data from different sources and with different resolutions are acquired, and monitoring point data and satellite profile data of optical thickness and ozone number density from different sources are acquired at the same time.
Further, the specific steps of constructing the three-dimensional space-time grid data set include:
standardizing the meteorological data set, wherein the time series of the standardized data set comprises a training period and a prediction period; and merging the meteorological data set and the source emission inventory data set, so that the source emission inventory data are merged into the three-dimensional space-time grid data set, and finally, the satellite profile data of optical thickness and ozone number density are merged into the three-dimensional space-time grid data set.
Further, the source emission list data is integrated into the three-dimensional space-time grid data set by acquiring the earth surface height data and time information from the meteorological data, and combining the data of the latitude and longitude corresponding to the source emission list data according to the height information of the meteorological data.
Further, the specific steps of comparing the predicted value and the measured value of the machine learning composite model and correcting the machine learning composite model according to the comparison result include:
calculating the ratio of the measured value to the predicted value of the atmospheric pollutants at the current time;
interpolating the calculated ratio over the future time period, and dynamically correcting the predicted value of the atmospheric pollutants according to the ratio;
and carrying out secondary dynamic correction on the predicted value of the atmospheric pollutants by using a relative humidity exponential decay formula according to the relative humidity changes of the air at different heights.
Further, an atmospheric pollutant model is constructed according to the atmospheric pollutant condition of the monitoring point, and the atmospheric pollutant is one or more of PM2.5, PM10, NO2, SO2, O3 and CO.
Further, the specific steps of predicting the air quality and the emission reduction effect of the region to be detected by using the machine learning composite model include:
processing meteorological data and source emission list data of the region to be detected by using a machine learning composite model to obtain an air quality prediction result of the region to be detected;
setting the emission intensity of the local source to be zero, re-inputting the emission intensity of the local source into a machine learning composite model, and obtaining the contribution of the local emission according to the change of the predicted concentration;
and reducing pollutant emission concentration in the source emission list data according to preset requirements, and re-inputting the pollutant emission concentration into the machine learning composite model to obtain an emission reduction effect prediction result.
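The two scenario runs described above (zeroing local emissions, then scaling them down) can be sketched as follows. This is an illustration under stated assumptions: `toy_predict` is a hypothetical stand-in for the trained machine learning composite model, the feature layout (emission in column 0) is invented for the example, and the 20% reduction is an arbitrary preset.

```python
import numpy as np


def local_contribution(predict, features, emission_cols):
    """Predicted concentration change when local emissions are set to zero."""
    zeroed = features.copy()
    zeroed[:, emission_cols] = 0.0
    return predict(features) - predict(zeroed)


def reduction_effect(predict, features, emission_cols, fraction):
    """Predicted concentration drop when emissions are cut by `fraction`."""
    reduced = features.copy()
    reduced[:, emission_cols] *= 1.0 - fraction
    return predict(features) - predict(reduced)


def toy_predict(x):
    # Hypothetical stand-in model: concentration = 2 * emission + 0.5 * wind.
    return 2.0 * x[:, 0] + 0.5 * x[:, 1]


X = np.array([[10.0, 4.0], [5.0, 2.0]])  # columns: emission, wind (assumed)
contribution = local_contribution(toy_predict, X, [0])
effect = reduction_effect(toy_predict, X, [0], 0.2)
```

Both helpers re-run the same model on modified inputs, so any trained regressor exposing a prediction function could be passed in place of `toy_predict`.
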
A second aspect of the present invention provides a machine learning based three-dimensional space-time grid air quality prediction system comprising:
the data acquisition module is configured to acquire meteorological data and source emission list data with different heights, preprocess the meteorological data and construct a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data;
the model training module is configured to train a machine learning composite model by utilizing a three-dimensional space-time grid data set, wherein a random forest algorithm is firstly adopted to train an ozone number density model and an optical thickness model, prediction is carried out according to the trained models, a prediction result and an actual measurement value are filled into the three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network, so that the machine learning composite model is obtained;
the model correction module is configured to compare the predicted value and the actual measured value of the machine learning composite model and correct the machine learning composite model according to the comparison result;
and the prediction module is configured to predict the air quality of the region to be detected by using the machine learning composite model and evaluate the local emission contribution and the emission reduction effect through scene simulation.
One or more of the above technical solutions have the following beneficial effects:
the invention discloses a machine learning-based three-dimensional space-time grid air quality prediction method and system, which designs data-driven three-dimensional space-time grid space-time distribution, has good display effect and is beneficial to regional transmission evaluation of an air pollution three-dimensional space.
The invention trains twice on the processed and fused three-dimensional space-time grid data set using algorithms such as random forests. It first builds the ozone number density (O3) and optical thickness (AOD) models, adds the resulting satellite remote sensing vertical profiles to the total data set, and then trains the atmospheric pollutant (such as PM2.5) model, which helps improve the accuracy of air quality prediction. In addition, the method combines known real data and the relative humidity changes at different heights to dynamically correct the predicted data twice, which can further improve the accuracy and reliability of the prediction model.
The method takes publicly available three-dimensional meteorological analysis data, source emission lists, satellite remote sensing vertical profile data and ambient air monitoring site data as model input, establishes a three-dimensional space-time grid air quality prediction model and prediction method based on machine learning algorithms, and can further be used for air quality scene simulation, evaluation of the local emission contribution and emission reduction effect, analysis of the high-altitude transport of atmospheric pollutants, and the like.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a machine learning based three-dimensional space-time grid air quality prediction method in accordance with a first embodiment of the present invention;
FIG. 2 is a flow chart of a machine learning composite model data processing in accordance with a first embodiment of the present invention;
FIG. 3 is a flowchart of training, verifying and predicting a random forest model in accordance with a first embodiment of the present invention;
FIG. 4 is a diagram of multi-source dataset formation in accordance with a first embodiment of the present invention;
FIG. 5 is a schematic diagram of the prediction performance of an AOD model according to a first embodiment of the present invention;
FIG. 6 is a schematic diagram of the prediction performance of the O3 model in a first embodiment of the present invention;
FIG. 7 is a schematic diagram of the prediction performance of the PM2.5 model in a first embodiment of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It should be noted that the embodiments of the present invention involve meteorological data and source emission data from monitoring stations, organizations, institutions and the like. When the embodiments are applied to specific products or technologies, user permission or consent must be obtained, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise. Furthermore, it is to be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiment one:
The first embodiment of the invention provides a machine learning-based three-dimensional space-time grid air quality prediction method. As shown in FIG. 1, taking PM2.5 as an example, meteorological data are first acquired and standardized; features are extracted, and the meteorological data and the emission list data are resampled and interpolated; satellite profile data are then merged in to obtain a multi-source three-dimensional space-time grid data set, which is input into the machine learning composite model to obtain the PM2.5 prediction result. As shown in FIG. 2, the machine learning composite model first obtains AOD and O3 from the multi-source three-dimensional space-time grid data set through a random forest model, generates a new three-dimensional space-time grid data set together with the PM2.5 data of the monitoring stations, and processes the new data set with a cross network, a deep neural network or a random forest model to obtain the PM2.5 prediction result. The PM2.5 prediction is then corrected for altitude and relative humidity, and the corrected model is used to evaluate the emission contribution and the emission reduction results.
The method specifically comprises the following steps:
Step 1, acquiring meteorological data and source emission list data at different heights, preprocessing the meteorological data, and constructing a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data.
Step 2, training a machine learning composite model on the three-dimensional space-time grid data set with algorithms such as random forests, cross networks and deep neural networks. The training, verification and prediction process of the random forest model is shown in FIG. 3. It comprises screening the three-dimensional space-time grid training set and test set, correcting the results as required according to historical data patterns, scientific principles and the like, and then generating a three-dimensional air quality prediction model, which is combined with newly added three-dimensional space-time grid real-time data to obtain the three-dimensional space-time grid prediction data.
Step 3, comparing the predicted value and the actual measured value of the machine learning composite model, and correcting the machine learning composite model according to the comparison result.
Step 4, predicting the air quality of the region to be detected by using the machine learning composite model, and evaluating the local emission contribution and the emission reduction effect through scene simulation.
In step 1, when the meteorological data and source emission list data at different heights are acquired, meteorological data and source emission list data from different sources and with different resolutions are acquired, that is, the data come from different monitoring sites, organizations or institutions. PM2.5 data from monitoring points of different sources, together with optical thickness and ozone number density data, are acquired at the same time.
The three-dimensional space-time grid data set is a set of data sets with time periods of warm seasons and cold seasons, the total time period of the three-dimensional space-time grid data set is more than or equal to 2 weeks, and the time difference between the three-dimensional space-time grid data set and the prediction current day is less than or equal to 2 years.
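The two time constraints above lend themselves to a quick validity check. The sketch below is illustrative only: the function name and the example dates are hypothetical, "2 years" is approximated as 730 days, and the time difference is assumed to be measured from the end of the most recent period to the prediction day, which the text does not specify.

```python
from datetime import date

MIN_TOTAL_DAYS = 14        # "more than or equal to 2 weeks"
MAX_GAP_DAYS = 2 * 365     # "less than or equal to 2 years" (approximation)


def dataset_is_usable(periods, prediction_day):
    """Check total coverage and recency of (start, end) date periods."""
    total_days = sum((end - start).days + 1 for start, end in periods)
    latest_end = max(end for _, end in periods)
    gap_days = (prediction_day - latest_end).days
    return total_days >= MIN_TOTAL_DAYS and 0 <= gap_days <= MAX_GAP_DAYS


# Hypothetical warm-season and cold-season periods, one week each.
periods = [
    (date(2023, 7, 1), date(2023, 7, 7)),
    (date(2023, 12, 1), date(2023, 12, 7)),
]
recent_ok = dataset_is_usable(periods, date(2024, 6, 1))
too_old = dataset_is_usable(periods, date(2026, 6, 1))
```
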
The method for constructing the three-dimensional space-time grid data set comprises the following specific steps of:
standardizing the meteorological data set, wherein the time series of the standardized data set comprises a training period and a prediction period; and merging the meteorological data set and the source emission inventory data set, so that the source emission inventory data are merged into the three-dimensional space-time grid data set, and finally, the satellite remote sensing vertical profile data of optical thickness and ozone number density are merged into the three-dimensional space-time grid data set.
The method of the meteorological data set standardization is to adjust each space-time resolution of the data set to a uniform size, wherein the space-time resolution comprises longitude, latitude, altitude and time resolution. The specific method is to divide the height of the data set into a certain layer number, group the data according to the time, latitude, longitude and layer number dimension of the data, and calculate the average value of each group. Then, interpolation is carried out on the data in longitude and latitude, then interpolation is carried out in height dimension, and finally time resampling is carried out, so that time series data of different characteristics of the same coordinates and different time stamps are obtained.
In this embodiment, as shown in FIG. 4, the preprocessed two-dimensional multi-source data are combined into a three-dimensional space-time grid data set using a left join. The interpolation method used to fill the data set may be one or more of linear interpolation, polynomial interpolation, spline interpolation, forward fill or backward fill.
The source emission list data is integrated into the three-dimensional space-time grid data set by acquiring surface height data and time information from the meteorological data, and combining the data of latitude and longitude corresponding to the source emission list data according to the height information of the meteorological data.
In this embodiment, the processing and merging of multi-source datasets handles the standardization and merging of multiple data simultaneously by creating a multi-process and multi-threaded environment.
In a specific embodiment, a three-dimensional space-time grid air quality data set is constructed, and a city atmospheric pollutant source emission list, environmental air quality national control station monitoring data, satellite remote sensing vertical profile data and three-dimensional meteorological simulation data are acquired by taking Shandong province as an example. The source emission list data is from a bloom emission list data set, the meteorological simulation data is from a meteorological data-FNL/GFS data set, the satellite remote sensing vertical profile data is from a CALIOP satellite data set, and the environment monitoring data is from a national atmospheric environment monitoring site data set.
Data collection and preprocessing: air quality data, including emission list data, meteorological data, monitoring point data and satellite profile data, are collected over a specified date and altitude range. Data of 2017, 6, 23, 7, 11, 23, 12, 7 days are selected and preprocessed, and matched by spatial site position and acquisition time to obtain time series data of different features at the same coordinates and different time stamps.
The preprocessing order is as follows. First, the meteorological data set, which has the most complete data, is standardized. Next, the source list data, which are large in volume, are converted to three dimensions: surface height data and time information are obtained from the meteorological data, the data at the corresponding latitude and longitude are merged according to the height information, and the actual height is calculated by adding the ground height to the height above the surface. Then, when the satellite remote sensing vertical profile data are preprocessed, the height data of the meteorological data and the CALIOP satellite vertical data are merged using the corrected relative humidity. Finally, the data sets are merged into one file containing the target variable.
The data sets are standardized by matching each record by spatial site position and acquisition time. The data are layered by height, with each layer 60 m thick: the processed data are grouped by time, latitude, longitude and layer number, and the mean of each group is calculated. Spatial linear interpolation is then performed, first in the longitude and latitude dimensions and then in the altitude dimension. Finally, time resampling is performed and each data set is unified into a standard format, giving time series data of different features at the same coordinates and different time stamps.
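The layering and resampling steps can be sketched with pandas. This is a minimal illustration under assumptions: the column names (`height_m`, `temperature`, and so on), the sample values and the hourly target resolution are hypothetical, and the horizontal and vertical spatial interpolation steps are omitted for brevity.

```python
import pandas as pd

LAYER_HEIGHT_M = 60  # layer thickness stated in the embodiment


def to_layer_means(df: pd.DataFrame) -> pd.DataFrame:
    """Assign each record to a 60 m layer and average within each group."""
    out = df.copy()
    out["layer"] = (out["height_m"] // LAYER_HEIGHT_M).astype(int)
    grouped = out.groupby(["time", "lat", "lon", "layer"], as_index=False)
    return grouped.mean(numeric_only=True).drop(columns="height_m")


def resample_hourly(cell: pd.DataFrame) -> pd.DataFrame:
    """Resample one (lat, lon, layer) cell to hourly steps, filling linearly."""
    series = cell.set_index("time")[["temperature"]]
    return series.resample(pd.Timedelta(hours=1)).mean().interpolate(method="linear")


raw = pd.DataFrame({
    "time": pd.to_datetime(["2017-06-23 00:00"] * 3 + ["2017-06-23 02:00"]),
    "lat": [36.5] * 4,
    "lon": [117.0] * 4,
    "height_m": [10.0, 50.0, 70.0, 20.0],
    "temperature": [20.0, 22.0, 18.0, 24.0],
})
layers = to_layer_means(raw)                      # one row per time/cell/layer
hourly = resample_hourly(layers[layers["layer"] == 0])
```
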
The preprocessed data undergo height processing, precision conversion and similar steps. All preprocessed data sets are then read, keeping only the time, latitude, longitude, layer number, O3 and AOD columns. The preprocessed two-dimensional multi-source data are combined into a three-dimensional space-time grid data set using left joins on latitude, longitude, time and layer number.
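A left join on the four grid keys might look like the following pandas sketch. The frames, column names and values are hypothetical; the meteorological frame is assumed to be the left table because the text describes it as the most complete data set.

```python
import pandas as pd

KEYS = ["time", "lat", "lon", "layer"]

# Hypothetical preprocessed inputs; `time` is an hour index for brevity.
met = pd.DataFrame({
    "time": [0, 0, 1], "lat": [36.5] * 3, "lon": [117.0] * 3,
    "layer": [0, 1, 0], "temperature": [21.0, 18.0, 24.0],
})
emissions = pd.DataFrame({
    "time": [0], "lat": [36.5], "lon": [117.0],
    "layer": [0], "so2_emission": [3.2],
})
satellite = pd.DataFrame({
    "time": [0], "lat": [36.5], "lon": [117.0],
    "layer": [1], "aod": [0.4], "o3": [5.1],
})

# Left joins keep every meteorological grid cell; cells without emission or
# satellite coverage get NaN, to be filled later by interpolation or prediction.
grid = (
    met.merge(emissions, on=KEYS, how="left")
       .merge(satellite, on=KEYS, how="left")
)
```

Keeping the unmatched cells as NaN is what later allows the AOD and O3 models to fill the gaps in satellite coverage.
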
The data is processed and merged by creating a multi-process and multi-threaded environment. For each generated date, a new thread is created and started to run.
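The one-thread-per-date pattern can be sketched with the standard `threading` module. The sketch is illustrative: `process_date` is a hypothetical placeholder for the real standardize-and-merge work, and the start date and day count are example values.

```python
import threading
from datetime import date, timedelta


def process_date(day, results, lock):
    """Placeholder for standardizing and merging all data of one date."""
    summary = f"merged-{day.isoformat()}"
    with lock:  # protect the shared dictionary
        results[day] = summary


def run_all(start, n_days):
    """Create and start one thread per date, then wait for all of them."""
    results, lock, threads = {}, threading.Lock(), []
    for offset in range(n_days):
        day = start + timedelta(days=offset)
        thread = threading.Thread(target=process_date, args=(day, results, lock))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results


results = run_all(date(2017, 6, 23), 3)
```

In CPython, threads mainly help with I/O-bound file reading; CPU-bound standardization would be the part handled by the multi-process side of the environment the text mentions.
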
In step 2, the measured ozone density and aerosol optical thickness (AOD) are small amounts of banded data covering only a small portion of the investigation region, and it is necessary to first spatially predict, expand to the entire spatial region, and then predict in the future time. Ozone density and aerosol optical thickness, and PM 2.5 Belonging to different pollutant/atmospheric parameters, thus utilizing O 3 And the vertical profile of the AOD can predict PM 2.5 (concentration distribution in the vertical direction).
In a specific embodiment, the ozone number density model and the optical thickness model are trained by a random forest algorithm to obtain a vertical profile with high spatial coverage, and the prediction result and atmospheric pollutants (such as PM 2.5 ) The actual measurement value is filled into a three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network, so that a machine learning composite model is obtained. After the machine learning composite model is obtained, three-dimensional meteorological data of a period to be predicted are input into the machine learning composite model, and the atmospheric pollutant concentration of the day and the future 3 days hour by hour is predicted.
Wherein, the atmospheric pollutant model is constructed according to the atmospheric pollutant condition of the monitoring point, and the atmospheric pollutant is PM 2.5 、PM 10 、NO 2 、SO 2 、O 3 And one or more of CO, in this embodiment in PM 2.5 The machine learning composite model is illustrated as an example.
In a specific embodiment, the preprocessed three-dimensional spatiotemporal grid data is read, defining AODs and O 3 ModelFeatures and target variables of (a). Training and prediction of models under the Scikit-learn framework. After the training and testing sets are divided, a regression model such as a random forest, a cross network or a deep neural network is used for training and calculating the score of the model on the testing set to obtain a simulated satellite vertical profile, wherein the prediction performance result of the AOD model is shown as figure 5, and O 3 The model predictive performance results are shown in fig. 6.
The AOD and O3 values predicted by the AOD and O3 models are filled into the data set, and the PM2.5 composite model is trained on the updated data set. After the data is split into training and test sets, the model is trained using an algorithm such as a random forest, cross network, or deep neural network, its score on the test set is computed, and the model is saved. For prediction, the trained AOD and O3 models are loaded and the prediction data is read. AOD and O3 sub-data sets are created by selecting different columns of the full data set, and the AOD and O3 models are used to predict AOD and O3. The trained PM2.5 composite model is then loaded, the predicted AOD and O3 values are added to the data set to obtain a new data set, and the PM2.5 composite model is used for prediction. The prediction result is shown in Fig. 7.
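The two-stage training described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the data is synthetic, the five feature columns, the 30% coverage mask, the helper name `fit_and_score`, and the hyperparameters are all hypothetical, and only the random forest option is shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the preprocessed grid: one row per grid cell.
# Assumed columns: temperature, relative humidity, wind speed, layer, emission.
met = rng.normal(size=(n, 5))

# Synthetic targets, used only to make the sketch runnable.
aod = 0.5 * met[:, 1] - 0.3 * met[:, 3] + rng.normal(scale=0.1, size=n)
o3 = 0.4 * met[:, 0] + 0.2 * met[:, 2] + rng.normal(scale=0.1, size=n)
pm25 = 2.0 * aod + 1.0 * o3 + 0.5 * met[:, 4] + rng.normal(scale=0.1, size=n)

# Satellite profiles cover only part of the grid (assumed 30% coverage).
observed = rng.random(n) < 0.3


def fit_and_score(X, y):
    """Split into training and test sets, fit a random forest, return (model, test R^2)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_tr, y_tr)
    return model, model.score(X_te, y_te)


# Stage 1: train the AOD and O3 models on covered cells only.
aod_model, aod_r2 = fit_and_score(met[observed], aod[observed])
o3_model, o3_r2 = fit_and_score(met[observed], o3[observed])

# Fill the grid: keep measured values where available, use predictions elsewhere.
aod_full = np.where(observed, aod, aod_model.predict(met))
o3_full = np.where(observed, o3, o3_model.predict(met))

# Stage 2: append AOD and O3 as extra features and train the PM2.5 model.
X_pm = np.column_stack([met, aod_full, o3_full])
pm_model, pm_r2 = fit_and_score(X_pm, pm25)
pm_pred = pm_model.predict(X_pm)
```

The point of the design is that the first-stage models turn sparse satellite profiles into full-coverage features, so the second-stage model can be trained on every grid cell rather than only on the covered ones.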
In step 3, the specific steps of comparing the predicted value and the measured value of the machine learning composite model and correcting the machine learning composite model according to the comparison result include:
S1: Calculate the ratio of the measured value to the predicted value of the atmospheric pollutant at the current time.
The ratio of the measured value to the predicted value of the current atmospheric pollutant (e.g., PM2.5) is calculated as:
$$P_{ri} = \frac{P_i}{P_{vi}} \tag{1}$$
where $P_{ri}$ is the ratio of the measured value to the predicted value of the pollutant on day $i$, $P_i$ is the measured PM2.5 value on day $i$, and $P_{vi}$ is the predicted value of the pollutant on day $i$.
S2: Interpolate and fill the calculated ratio over the future time period, and dynamically correct the predicted value of the atmospheric pollutant according to the ratio.
The ratio data is then interpolated onto the hour-by-hour time steps of the next 3 days. Linear interpolation, forward fill, and backward fill are applied first along the time and height dimensions and then along the horizontal dimensions (longitude and latitude). The predicted values are then corrected. Two corrections are described here, and one or more corrections may be applied depending on actual conditions. The first correction is applied to the predicted values at the different heights using the formula:
$$P_{a1i} = P_{vi} \times P_{ri} \tag{2}$$
where $P_{ri}$ is the ratio of the measured value to the predicted value of the pollutant on day $i$, $P_{vi}$ is the predicted value of the pollutant on day $i$, and $P_{a1i}$ is the first corrected value of the pollutant on day $i$.
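As a worked illustration of formulas (1) and (2) together with the gap filling, the sketch below works along a single time axis only. The function names and sample numbers are hypothetical, and the height and horizontal interpolation passes described in the text are not shown.

```python
from typing import List, Optional


def ratio(measured: float, predicted: float) -> float:
    """Formula (1): P_ri = P_i / P_vi."""
    return measured / predicted


def fill_series(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps: linear interpolation between known points,
    backward fill before the first known point, forward fill after the last."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one known value is required")
    out = list(values)
    for i in range(len(values)):
        if values[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]  # backward fill
        elif right is None:
            out[i] = values[left]  # forward fill
        else:
            w = (i - left) / (right - left)  # linear interpolation weight
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def first_correction(predicted: List[float], ratios: List[float]) -> List[float]:
    """Formula (2): P_a1i = P_vi * P_ri, applied element by element."""
    return [p * r for p, r in zip(predicted, ratios)]


# Ratios are known at hours 0 and 2; hours 1, 3 and 4 are gaps.
hourly_ratio = fill_series([ratio(60.0, 50.0), None, ratio(40.0, 50.0), None, None])
corrected = first_correction([50.0, 50.0, 50.0, 50.0, 50.0], hourly_ratio)
```

Here the gap at hour 1 is interpolated between its neighbours, while hours 3 and 4 have no later observation and therefore carry the last known ratio forward.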
S3: Perform a secondary dynamic correction of the predicted value of the atmospheric pollutant using a relative humidity exponential decay formula, according to the relative humidity changes of the air at different heights.
After the missing values are filled, the second dynamic correction is performed. If $L_i < 12$, the correction formula is
$$P_{a2i} = P_{a1i} \tag{3}$$
where $P_{a2i}$ is the $i$-th secondary corrected value.
If $L_i > 11$, the relative humidity exponential decay formula is used:
(4).
where $L_i$ is the layer value of the $i$-th data point.
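The layer-dependent branching of the second correction might be organized as in the sketch below. Because formula (4) is not reproduced in this text, the decay step is passed in as a caller-supplied function. The names `second_correction` and `placeholder_decay` are hypothetical, and the placeholder is emphatically not the patent's relative humidity formula.

```python
from typing import Callable

LAYER_THRESHOLD = 12  # from the text: layers below 12 are left unchanged


def second_correction(p_a1: float, layer: int,
                      decay: Callable[[float, int], float]) -> float:
    """Return P_a2i for one data point.

    Below the threshold this is formula (3), P_a2i = P_a1i.
    At or above it, the caller-supplied `decay` stands in for formula (4).
    """
    if layer < LAYER_THRESHOLD:
        return p_a1
    return decay(p_a1, layer)


def placeholder_decay(p_a1: float, layer: int) -> float:
    """Placeholder only: halves the value. This is NOT formula (4)."""
    return 0.5 * p_a1


low_layer = second_correction(30.0, 11, placeholder_decay)   # formula (3) branch
high_layer = second_correction(30.0, 12, placeholder_decay)  # decay branch
```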
In step 4, the specific steps of predicting the air quality and the emission reduction effect of the region to be detected by using the machine learning composite model include:
processing the meteorological data and source emission list data of the region to be detected with the machine learning composite model to obtain an air quality prediction result for the region;
setting the local source emission intensity to zero, re-inputting the data into the machine learning composite model, and obtaining the local emission contribution from the change in predicted concentration;
reducing the pollutant emission concentration in the source emission list data according to preset requirements and re-inputting the data into the machine learning composite model to obtain an emission reduction effect prediction result, that is, the PM2.5 concentration values after emission reduction. The emission contribution and the emission reduction effect are then evaluated from these results.
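The scenario simulation in these steps amounts to re-running the same model on modified inputs and differencing the predictions. The sketch below assumes a hypothetical `toy_model` in place of the trained composite model and hypothetical input names; it only illustrates the zero-out and reduction comparisons.

```python
from typing import Callable, Dict

Inputs = Dict[str, float]
Model = Callable[[Inputs], float]


def toy_model(x: Inputs) -> float:
    """Hypothetical stand-in for the trained composite model."""
    return (10.0 + 0.2 * x["local_emission"]
            + 0.1 * x["regional_emission"] + 0.5 * x["humidity"])


def local_contribution(model: Model, x: Inputs) -> float:
    """Baseline prediction minus the prediction with local emissions set to zero."""
    zeroed = dict(x, local_emission=0.0)
    return model(x) - model(zeroed)


def reduction_effect(model: Model, x: Inputs, fraction: float) -> float:
    """Predicted concentration drop when local emissions are cut by `fraction` (0 to 1)."""
    reduced = dict(x, local_emission=x["local_emission"] * (1.0 - fraction))
    return model(x) - model(reduced)


cell = {"local_emission": 100.0, "regional_emission": 50.0, "humidity": 40.0}
baseline = toy_model(cell)
contribution = local_contribution(toy_model, cell)
effect_30 = reduction_effect(toy_model, cell, 0.30)
```

Because both quantities are differences against the same baseline run, any model that maps gridded inputs to a concentration can be substituted for `toy_model` without changing the two helper functions.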
Specifically, a plot of PM2.5 concentration versus time at a specified location is drawn from the $P_{ri}$ values of the selected location, and the PM2.5 concentration values after emission reduction at a specific location are selected to draw a line graph of the emission reduction and its effect. In addition, a specific time, height, and longitude and latitude can be selected to draw a heat map of the horizontal distribution of atmospheric pollutant concentration, a heat map of its vertical distribution, a two-dimensional ray diagram of the predicted atmospheric pollutant concentration, and the like.
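Before any of these plots can be drawn, the relevant slice has to be pulled out of the four-dimensional grid. The sketch below shows one way to do that with plain nested lists. The axis order (time, layer, latitude index, longitude index), the helper names, and the sample values are assumptions made for illustration, and the plotting itself is omitted.

```python
# grid[t][k][j][i]: time, layer, latitude index, longitude index (assumed order)
T, K, J, I = 2, 3, 2, 2
grid = [[[[float(t * 100 + k * 10 + j * 2 + i) for i in range(I)]
          for j in range(J)]
         for k in range(K)]
        for t in range(T)]


def horizontal_slice(g, t, k):
    """Latitude x longitude field at one time and layer (horizontal heat map input)."""
    return g[t][k]


def vertical_profile(g, t, j, i):
    """Concentration at every layer above one grid point at one time."""
    return [layer[j][i] for layer in g[t]]


def time_series(g, k, j, i):
    """Concentration over time at one layer and grid point (line graph input)."""
    return [step[k][j][i] for step in g]


field = horizontal_slice(grid, 1, 2)
profile = vertical_profile(grid, 0, 1, 1)
series = time_series(grid, 0, 0, 0)
```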
The method in this embodiment uses a monitoring function to run fully automatically. Specifically, the watchdog library is used to monitor the file system: a specified folder is watched, and processing is triggered when a file of a particular format is created.
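The folder-monitoring idea can be illustrated as below. The source names the watchdog library; this sketch deliberately swaps in a standard-library polling check so that it runs without third-party packages, so it is a stand-in rather than the embodiment's mechanism. The `.nc` suffix, the file names, and the function name `poll_new_files` are assumptions.

```python
import os
import tempfile
from typing import Callable, List, Set


def poll_new_files(folder: str, suffix: str, seen: Set[str],
                   on_created: Callable[[str], None]) -> None:
    """Call `on_created` once for each not-yet-seen file in `folder` ending in `suffix`."""
    for name in sorted(os.listdir(folder)):
        if name.endswith(suffix) and name not in seen:
            seen.add(name)
            on_created(os.path.join(folder, name))


# Demonstration in a temporary folder.
processed: List[str] = []
seen_files: Set[str] = set()
with tempfile.TemporaryDirectory() as folder:
    poll_new_files(folder, ".nc", seen_files, processed.append)  # folder is empty
    open(os.path.join(folder, "met_day1.nc"), "w").close()
    open(os.path.join(folder, "notes.txt"), "w").close()  # wrong format, ignored
    poll_new_files(folder, ".nc", seen_files, processed.append)  # finds met_day1.nc
    poll_new_files(folder, ".nc", seen_files, processed.append)  # no duplicate call
```

In a deployment that follows the text, an event-driven observer from the watchdog library would replace the repeated polling calls, with the callback playing the role of the prediction pipeline's entry point.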
To further demonstrate the effect of the model of the present invention, the model's prediction performance was validated against measured values. The comparison covered ozone number density, AOD, and PM2.5 concentration values. The R² of the linear regression equation relating predicted values to measured values was greater than 0.96, which shows that the models have good prediction performance and fit the measured values well.
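For reference, the R² of a linear regression between measured and predicted values can be computed as in the following sketch. The sample numbers are invented for illustration and are not the validation data of the embodiment; the function name `linear_fit_r2` is hypothetical.

```python
from typing import Sequence, Tuple


def linear_fit_r2(measured: Sequence[float],
                  predicted: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line predicted = a * measured + b, returned with its R^2."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(measured, predicted))
    ss_tot = sum((y - my) ** 2 for y in predicted)
    return a, b, 1.0 - ss_res / ss_tot


measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 19.0, 31.0, 42.0, 48.0]
slope, intercept, r2 = linear_fit_r2(measured, predicted)
```

For these sample values the fitted line has slope 0.95 and intercept 1.9, and R² is about 0.988.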
In this embodiment, the AOD and O3 models are based on random forest prediction in three-dimensional space with spatial resampling, whereas the PM2.5 model is based on random forest or neural network prediction together with temporal interpolation over the future period. The method designs a three-dimensional space-time grid that fuses multi-source data, and uses algorithms such as random forest, cross network, or deep neural network to predict AOD, O3, and PM2.5 concentrations over different spatio-temporal sequences. Dynamically correcting the predicted PM2.5 values using the measured PM2.5 values and the relative humidity changes at different heights further improves the accuracy and reliability of the prediction model. The method enables rapid forecasting, early warning, and evaluation of future atmospheric pollution. It can automatically extract and dynamically display the three-dimensional space-time grid distribution and variation trend of atmospheric pollutants, intelligently identify the time periods and areas where atmospheric pollution is likely to occur, and accurately evaluate the local emission contribution and the expected emission reduction effect, which facilitates the assessment of regional atmospheric pollution transport.
The method addresses the heavy computation, long run time, and low spatial resolution of traditional regional air quality models. It achieves three-dimensional space-time grid air quality prediction and rapid forecasting and evaluation of future atmospheric pollution, can identify from the prediction results the time periods and regions where atmospheric pollution is likely to occur, and helps to accurately evaluate the influence of large pollution sources.
Embodiment two:
The second embodiment of the invention provides a machine learning-based three-dimensional space-time grid air quality prediction system, which comprises:
the data acquisition module is configured to acquire meteorological data and source emission list data with different heights, preprocess the meteorological data and construct a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data;
the model training module is configured to train a machine learning composite model by utilizing a three-dimensional space-time grid data set, wherein a random forest algorithm is firstly adopted to train an ozone number density model and an optical thickness model, prediction is carried out according to the trained models, a prediction result and an actual measurement value are filled into the three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network, so that the machine learning composite model is obtained;
the model correction module is configured to compare the predicted value and the actual measured value of the machine learning composite model and correct the machine learning composite model according to the comparison result;
and the prediction module is configured to predict the air quality of the region to be detected by using the machine learning composite model and evaluate the local emission contribution and the emission reduction effect through scene simulation.
The steps involved in the second embodiment correspond to those of the first embodiment of the method, and the detailed description of the second embodiment can be found in the related description section of the first embodiment.
Those skilled in the art will appreciate that the modules or steps of the invention described above may be implemented by a general-purpose computing device. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they may each be fabricated as separate integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (7)

1. The machine learning-based three-dimensional space-time grid air quality prediction method is characterized by comprising the following steps of:
acquiring meteorological data and source emission list data with different heights, preprocessing the meteorological data, and constructing a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data;
simultaneously acquiring monitoring point data from different sources and satellite profile data of optical thickness and ozone number density;
training a machine learning composite model by utilizing a three-dimensional space-time grid data set, wherein a random forest algorithm is firstly adopted to train an ozone number density model and an optical thickness model, prediction is carried out according to the trained model, a prediction result and an actual measurement value are filled into the three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network algorithm, so that the machine learning composite model is obtained;
the specific steps for constructing the three-dimensional space-time grid data set include: standardizing the meteorological data set, wherein the time series of the standardized data set comprises a training period and a prediction period; combining the meteorological data set and the source emission list data set, so that the source emission list data are integrated into the three-dimensional space-time grid data set, and finally integrating satellite remote sensing vertical profile data of optical thickness and ozone number density into the three-dimensional space-time grid data set;
comparing the predicted value and the measured value of the machine learning composite model, and correcting the machine learning composite model according to the comparison result, wherein the specific steps comprise: calculating the ratio of the measured value to the predicted value of the atmospheric pollutants at the current time; filling interpolation of the calculated ratio in a future time period, and dynamically correcting a predicted value of the atmospheric pollutants according to a ratio result; performing secondary dynamic correction on the predicted value of the atmospheric pollutants by using a relative humidity exponential decay formula according to the relative humidity changes of the air at different heights;
and predicting the air quality of the region to be detected by using a machine learning composite model, and evaluating the local emission contribution and the emission reduction effect through scene simulation.
2. The machine learning based three-dimensional space-time grid air quality prediction method of claim 1, wherein the three-dimensional space-time grid data set is a set of data sets having time periods of warm seasons and cool seasons, and the total time period of the three-dimensional space-time grid data set is 2 weeks or more and the time difference from the prediction day is 2 years or less.
3. The machine learning based three-dimensional space-time grid air quality prediction method of claim 1, wherein the atmospheric contaminant model uses a cross network or deep neural network training model when the amount of data continuous in the three-dimensional space-time grid data set time sequence exceeds 1000, and uses a random forest training model when the amount of data continuous in the three-dimensional space-time grid data set time sequence is lower than 1000.
4. The machine learning based three-dimensional space-time grid air quality prediction method of claim 1, wherein the source emission inventory data is merged into the three-dimensional space-time grid data set by acquiring surface altitude data and time information from the meteorological data, and combining the source emission inventory data at the corresponding latitude and longitude according to the altitude information of the meteorological data.
5. The machine learning-based three-dimensional space-time grid air quality prediction method according to claim 1, wherein the atmospheric pollutant model is constructed according to the condition of atmospheric pollutants at monitoring points, and the atmospheric pollutants are one or more of PM2.5, PM10, NO2, SO2, O3, and CO.
6. The machine-learning-based three-dimensional space-time grid air quality prediction method according to claim 1, wherein the specific step of performing prediction evaluation on the air quality, the local emission contribution and the emission reduction effect of the region to be measured by using the machine-learning composite model comprises the following steps:
processing meteorological data and source emission list data of the region to be detected by using a machine learning composite model to obtain an air quality prediction result of the region to be detected;
setting the emission intensity of the local source to be zero, re-inputting the emission intensity of the local source into a machine learning composite model, and obtaining the contribution of the local emission according to the change of the predicted concentration;
and reducing pollutant emission concentration in the source emission list data according to preset requirements, and re-inputting the pollutant emission concentration into the machine learning composite model to obtain an emission reduction effect prediction result.
7. A machine learning based three-dimensional space-time grid air quality prediction system for implementing the method of any of claims 1-6, comprising:
the data acquisition module is configured to acquire meteorological data and source emission list data with different heights, preprocess the meteorological data and construct a three-dimensional space-time grid data set according to the preprocessed meteorological data and source emission list data;
the model training module is configured to train a machine learning composite model by utilizing a three-dimensional space-time grid data set, wherein a random forest algorithm is firstly adopted to train an ozone number density model and an optical thickness model, prediction is carried out according to the trained models, a prediction result and an actual measurement value are filled into the three-dimensional space-time grid data set, and then the filled data set is utilized to train an atmospheric pollutant model through a random forest, a cross network or a deep neural network, so that the machine learning composite model is obtained;
the model correction module is configured to compare the predicted value and the actual measured value of the machine learning composite model and correct the machine learning composite model according to the comparison result;
and the prediction module is configured to predict the air quality of the region to be detected by using the machine learning composite model and evaluate the local emission contribution and the emission reduction effect through scene simulation.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311628815.XA CN117332906B (en) 2023-12-01 2023-12-01 Machine learning-based three-dimensional space-time grid air quality prediction method and system

Publications (2)

Publication Number Publication Date
CN117332906A CN117332906A (en) 2024-01-02
CN117332906B true CN117332906B (en) 2024-03-15

Family

ID=89293864

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant