CN112884243A

CN112884243A - Air quality analysis and prediction method based on deep learning and Bayesian model

Info

Publication number: CN112884243A
Application number: CN202110282474.XA
Authority: CN
Inventors: 富众杰; 林海平; 黃炳强
Original assignee: Hangzhou Vocational and Technical College
Current assignee: Hangzhou Vocational and Technical College
Priority date: 2021-03-16
Filing date: 2021-03-16
Publication date: 2021-06-01
Anticipated expiration: 2041-03-16
Also published as: CN112884243B

Abstract

The invention discloses an air quality analysis and prediction method based on deep learning and a Bayesian model. The main points of the technical solution are: AQI data of a target monitoring point are acquired; the AQI data are preprocessed and normalized; a deep learning convolutional network model, a recurrent neural network model and a Bayesian dynamic linear model are constructed respectively; the AQI data are input into the deep learning convolutional network model and the Bayesian dynamic linear model respectively, and the Bayesian dynamic linear model outputs first predicted AQI data after running; the features extracted by the deep learning convolutional network model are input into the recurrent neural network model, which outputs second predicted AQI data; the first predicted AQI data and the second predicted AQI data are input into a mixed model, and the mixed model outputs the final predicted AQI data after running. This air quality analysis and prediction method can analyze and predict air quality, evaluate atmospheric improvement, identify pollution sources, and propose air pollution prevention and control suggestions.

Description

Air quality analysis and prediction method based on deep learning and Bayesian model

Technical Field

The invention relates to an atmospheric pollutant concentration prediction method, in particular to an air quality analysis prediction method based on deep learning and a Bayesian model.

Background

Air quality has received continuous attention in recent years, and a large number of air quality monitoring stations have been added in China to monitor local air quality and meteorological data. The air quality data that a monitoring station can measure comprise six factors: particulate matter (PM2.5 and PM10) and gaseous pollutants (NO₂, CO, O₃ and SO₂). These data are collectively called AQI data. In addition, the monitoring points can also acquire meteorological data of the area, such as weather, temperature, pressure, humidity, wind direction and wind speed, which are collectively called MEO data.

Because meteorological and environmental factors are complex, predicting atmospheric pollutant concentration has always been a difficult problem. At present, commonly used prediction methods include mechanism-based forecasting built on atmospheric chemical transport models and statistical forecasting built on machine learning models. The former is widely applied in engineering practice, but because the atmosphere is a very complex system that is difficult to describe and fully quantify in theory, mechanism-based forecasting has large errors.

At present, the national weather bureau forecasts weather conditions and the concentrations of various pollutants by running a coupled meteorology-chemistry model (WRF-Chem). Because both the numerical model calculations and the emission source inventory data contain errors of varying degree, the model's prediction of pollutant concentration is not ideal.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide an air quality analysis and prediction method based on deep learning and a Bayesian model, which can analyze and predict air quality, evaluate the atmosphere improvement condition, clarify the pollution source and provide air pollution prevention and control suggestions.

In order to achieve the purpose, the invention provides the following technical scheme: an air quality analysis and prediction method based on deep learning and Bayesian model comprises the following steps:

step S1: acquiring AQI data of a target monitoring point;

step S2: preprocessing the AQI data, identifying abnormal values in the data sequence according to the Laida (3σ) criterion and removing them, and completing data missing at a given moment by linear interpolation;

step S3: carrying out normalization processing on the AQI data;

step S4: respectively constructing a deep learning convolutional network model, a recurrent neural network model and a Bayesian dynamic linear model;

step S5: respectively inputting the normalized AQI data into the deep learning convolutional network model and the Bayesian dynamic linear model, wherein after the deep learning convolutional network model runs, the long input sequence is converted into a short sequence composed of high-level features, and after the Bayesian dynamic linear model runs, first predicted AQI data are output;

step S6: inputting the sequence composed of the features extracted by the deep learning convolutional network model into the recurrent neural network model, and outputting second predicted AQI data after the recurrent neural network model runs;

step S7: constructing a mixed model, inputting the first predicted AQI data and the second predicted AQI data into the mixed model, and outputting the final predicted AQI data after the mixed model runs.

The invention is further configured to: the normalization processing in step S3 keeps the value range of the data within a relatively small fluctuation range, which reduces the influence of different orders of magnitude or different dimensions on the data. The feature distribution is assumed to be normal, and the features are mapped to the standard normal distribution using the mean and the standard deviation. The calculation formula is:

$$y' = \frac{y - y_{mean}}{y_{std}}$$

wherein y is a sample value, y' is its normalized value, y_mean is the mean of all sample data, and y_std is the standard deviation of all sample data.

The invention is further configured to: the step S4 specifically includes:

step S41, for the constructed models, selecting training data and test data from the AQI data, and completing initialization of the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model;

step S42, training the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model using the training data;

step S43, obtaining test prediction results from the test data using the trained deep learning convolutional network model, recurrent neural network model and Bayesian dynamic linear model;

step S44, making predictions using the trained deep learning convolutional network model, recurrent neural network model and Bayesian dynamic linear model.

The invention is further configured to: in step S5, the Bayesian dynamic linear model includes an observation equation, a state equation and initial information. The prediction distribution is regarded as a conditional probability distribution; the prediction distribution is obtained from the prior information, the posterior information is obtained using the Bayesian formula, and the prior information is corrected to obtain the predicted value.

The invention is further configured to: for the recurrent neural network model, the loss function of the training phase is the mean square error:

$$L = \frac{1}{n}\sum_{i=1}^{n}(a_i - y_i)^2$$

where a is the predicted value, y is the sample value, and n is the number of samples.

The invention is further configured to: the recurrent neural network model also comprises an Adam algorithm and a Dropout algorithm;

the Adam algorithm is used for calculating a first moment estimation and a second moment estimation of the gradient to design independent adaptive learning rates for different parameters;

the Dropout algorithm is used to reduce the dependency between features, reducing the probability of over-fitting occurring.

The invention is further configured to further comprise: step S8, acquiring MEO data;

step S9, carrying out correlation analysis based on the MEO data and the AQI data;

step S10, carrying out backward trajectory and potential source contribution analysis based on the MEO data and the AQI data;

step S11, combining the correlation analysis result and the backward trajectory and potential source contribution analysis result with the final predicted AQI data to obtain comprehensive improvement suggestions.

The invention is further configured to: the correlation analysis in step S9 specifically includes: taking PM2.5 and PM10 each as the first variable, and weather, temperature, air pressure, humidity, wind speed and wind direction as the second variable, and substituting them into the following formula:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \sum_{i}(y_i - \bar{y})^2}}$$

wherein x_i and y_i are the two variables whose correlation is compared, x̄ is the mean of the variable x_i, ȳ is the mean of the variable y_i, and r is the Spearman correlation coefficient. r is +1 or -1 when the two variables are perfectly monotonically correlated, and r is 0 when the two variables are uncorrelated.

The invention is further configured to: the backward trajectory and potential source contribution analysis in step S10 specifically includes: dividing the study area into i × j grids according to longitude and latitude, wherein the PSCF calculation formula is:

$$PSCF_{ij} = \frac{m_{ij}}{n_{ij}}$$

wherein n_ij is the number of all airflow trajectories passing through a given grid (i, j), and m_ij is the number of pollution trajectories passing through grid (i, j).

In conclusion, the invention has the following beneficial effects: historical monitoring data of the air quality data AQI (PM2.5, PM10, NO₂, CO, O₃, SO₂) are obtained. Taking the time-series characteristics of air quality data into account, abnormal values in the data sequence are identified by the Laida (3σ) criterion and removed, and data missing at a given moment are completed by linear interpolation. Before modeling, the different feature data are mapped onto the same scale by normalization, and then a deep learning convolutional network model, a recurrent neural network model and a Bayesian dynamic linear model are constructed.

The deep learning convolutional neural network (CNN) is used for feature extraction: air quality data have many dimensions, which makes feature extraction difficult. The CNN extracts features locally through convolution kernels and shares weights, which avoids the excessive parameter count of a fully connected artificial neural network and gives a good feature extraction effect. Because of this strong feature extraction capability, a long input sequence can be converted into a short sequence composed of high-level features, and this sequence of extracted features is used as the input of the recurrent neural network, a long short-term memory (LSTM) neural network.
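The shortening of a long sequence into a short feature sequence can be illustrated with a minimal sketch. It assumes a single one-dimensional convolution with ReLU followed by max pooling; the kernel values, the pooling size and the function names `conv1d` and `max_pool1d` are illustrative choices and are not specified in the source.

```python
def conv1d(sequence, kernel):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(sequence[i + j] * kernel[j] for j in range(k)))
        for i in range(len(sequence) - k + 1)
    ]


def max_pool1d(sequence, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(sequence[i:i + size])
        for i in range(0, len(sequence) - size + 1, size)
    ]


# 24 hourly values are reduced to 11 higher-level features.
hourly = [float(i % 5) for i in range(24)]
features = max_pool1d(conv1d(hourly, [0.25, 0.5, 0.25]), 2)
```

In a trained network the kernel weights are learned rather than fixed, and several kernels run in parallel to produce multiple feature channels.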

The recurrent neural network model (long short-term memory neural network, LSTM) is used as the prediction model: because the concentration of air pollutants is strongly correlated with time, the LSTM is well suited to handling such memory-dependent problems. The LSTM is an improvement on the ordinary RNN that alleviates the vanishing-gradient problem during training. Its structure contains a group of interconnected memory modules that replace the memory units of the ordinary RNN, which makes the LSTM easier to train than the ordinary RNN. The LSTM has achieved good results in many fields.

The LSTM input is the hourly feature vector, namely the AQI and the six pollutant indexes at a given moment, and the output is a single neuron that predicts the AQI.

Bayesian dynamic linear model (DLM): Bayesian forecasting is a prediction method developed to meet the need to forecast unexpected events. It predicts not only from historical measurement data and knowledge of the model, but also incorporates the experience and subjective judgment of experts, which makes it particularly useful for forecasting unexpected events.

The basic idea of Bayesian prediction is to establish a dynamic model, regard the prediction distribution as conditional probability distribution, solve the prediction distribution according to prior information, solve posterior information by using Bayesian formula, and correct the prior information to solve the prediction value. The Bayesian dynamic linear model consists of an observation equation, a state equation and initial information.
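The prior, forecast and posterior cycle described above can be sketched for the simplest possible case. This sketch assumes a local-level model (observation equation y_t = θ_t + v_t, state equation θ_t = θ_{t-1} + w_t) with known noise variances; the source does not state the model order or the variances, so `v`, `w`, `m0` and `c0` are hypothetical parameters.

```python
def dlm_local_level(observations, m0=0.0, c0=1e6, v=1.0, w=0.1):
    """One-step-ahead forecasting with a local-level dynamic linear model.

    m0, c0: mean and variance of the initial information about the state.
    v, w:   observation and state noise variances (assumed known here).
    Returns the list of one-step forecasts and the final posterior mean.
    """
    m, c = m0, c0
    forecasts = []
    for y in observations:
        a, r = m, c + w          # prior for the state at time t
        f, q = a, r + v          # forecast distribution for y_t
        forecasts.append(f)
        k = r / q                # adaptive coefficient (gain)
        m = a + k * (y - f)      # posterior mean: prior corrected by the error
        c = r - k * k * q        # posterior variance
    return forecasts, m


forecasts, next_forecast = dlm_local_level([5.0] * 50)
```

The final posterior mean serves as the forecast for the next, not yet observed, time step. Expert judgment can enter through `m0` and `c0`, or by intervening on `m` and `c` between steps.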

Mixed model: after the model framework is built, a hybrid model combining the long short-term memory neural network (LSTM) and the Bayesian dynamic linear model (DLM) is constructed. The input of the LSTM model is the historical AQI data and the six pollutant indexes, and its output is a predicted AQI; the input of the Bayesian dynamic linear model is the historical AQI data and empirical information, and its output is a predicted AQI. The AQI outputs of the two prediction models are fused to obtain a new prediction result, so that the model draws on more diverse features and has stronger learning ability and higher prediction accuracy.
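The source does not state how the two predictions are fused. As one possible illustration only, the sketch below assumes a weighted average whose weight is fitted by least squares on held-out data and clipped to [0, 1]; the functions `fit_blend_weight` and `blend` are hypothetical names.

```python
def fit_blend_weight(pred_dlm, pred_lstm, actual):
    """Least-squares weight w for w * dlm + (1 - w) * lstm, clipped to [0, 1]."""
    num = sum((d - l) * (y - l) for d, l, y in zip(pred_dlm, pred_lstm, actual))
    den = sum((d - l) ** 2 for d, l in zip(pred_dlm, pred_lstm))
    if den == 0:
        return 0.5  # the two models agree everywhere; any weight is equivalent
    return min(1.0, max(0.0, num / den))


def blend(pred_dlm, pred_lstm, w):
    """Fuse the first and second predicted AQI series into the final prediction."""
    return [w * d + (1 - w) * l for d, l in zip(pred_dlm, pred_lstm)]


w = fit_blend_weight([10.0, 20.0], [20.0, 40.0], [15.0, 30.0])
final = blend([10.0, 20.0], [20.0, 40.0], w)
```

Other fusion schemes, such as a small regression model over both outputs, would fit the same step of the method equally well.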

Drawings

Fig. 1 is a schematic block diagram of an air quality analysis prediction method.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. In which like parts are designated by like reference numerals. It should be noted that the terms "front," "back," "left," "right," "upper" and "lower" used in the following description refer to directions in the drawings, and the terms "bottom" and "top," "inner" and "outer" refer to directions toward and away from, respectively, the geometric center of a particular component.

The first embodiment is as follows: referring to fig. 1, in order to achieve the above object, the present invention provides the following technical solutions: an air quality analysis and prediction method based on deep learning and Bayesian model comprises the following steps:

step S1: acquiring AQI data of a target monitoring point;

step S3: carrying out normalization processing on the AQI data;

step S5: inputting the normalized AQI data into the deep learning convolutional network model and the Bayesian dynamic linear model respectively, wherein after the deep learning convolutional network model runs, the long input sequence is converted into a short sequence composed of high-level features, and after the Bayesian dynamic linear model runs, first predicted AQI data are output;

step S6: inputting the sequence composed of the features extracted by the deep learning convolutional network model into the recurrent neural network model, and outputting second predicted AQI data after the recurrent neural network model runs;

step S7: constructing a mixed model, inputting the first predicted AQI data and the second predicted AQI data into the mixed model, and outputting the final predicted AQI data after the mixed model runs.

The design of the invention is as follows: historical monitoring data of the air quality data AQI (PM2.5, PM10, NO₂, CO, O₃, SO₂) are obtained. Taking the time-series characteristics of air quality data into account, abnormal values in the data sequence are identified by the Laida (3σ) criterion and removed, and data missing at a given moment are completed by linear interpolation. Before modeling, the different feature data are mapped onto the same scale by normalization, and then a deep learning convolutional network model, a recurrent neural network model and a Bayesian dynamic linear model are constructed.
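The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming the 3σ rule is applied once over the whole series using the population standard deviation and that missing values are represented as `None`; the function names are hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers_3sigma(values):
    """Mark values farther than 3 standard deviations from the mean as missing."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    return [
        None if v is not None and abs(v - mu) > 3 * sigma else v
        for v in values
    ]


def fill_linear(values):
    """Fill interior gaps by linear interpolation between known neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for k in range(left + 1, right):
            out[k] = out[left] + step * (k - left)
    return out  # gaps at the very start or end stay None in this sketch


series = [50.0] * 20 + [1000.0] + [50.0] * 20
cleaned = fill_linear(remove_outliers_3sigma(series))
```

Here the spike of 1000 is removed by the 3σ test and then replaced by the interpolated value between its neighbours.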

The normalization processing in step S3 keeps the value range of the data within a relatively small fluctuation range, which reduces the influence of different orders of magnitude or different dimensions on the data. The feature distribution is assumed to be normal, and the features are mapped to the standard normal distribution using the mean and the standard deviation. The calculation formula is:

$$y' = \frac{y - y_{mean}}{y_{std}}$$

wherein y is a sample value, y' is its normalized value, y_mean is the mean of all sample data, and y_std is the standard deviation of all sample data.
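A small sketch of this standardization, subtracting the mean and dividing by the standard deviation, is given below. It assumes the population standard deviation, which the source does not specify, and adds a hypothetical inverse function for converting model outputs back to AQI units.

```python
from statistics import mean, pstdev


def zscore(values):
    """Map values to zero mean and unit standard deviation."""
    y_mean = mean(values)
    y_std = pstdev(values)  # assumption: population standard deviation
    return [(y - y_mean) / y_std for y in values], y_mean, y_std


def inverse_zscore(scaled, y_mean, y_std):
    """Convert normalized predictions back to the original scale."""
    return [s * y_std + y_mean for s in scaled]


scaled, y_mean, y_std = zscore([10.0, 20.0, 30.0])
restored = inverse_zscore(scaled, y_mean, y_std)
```

The mean and standard deviation should be computed on the training data only and reused for test data, so that no information leaks from the test set.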

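Step S4 specifically includes:

step S41, for the constructed models, selecting training data and test data from the AQI data, and completing initialization of the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model;

step S42, training the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model using the training data;

step S43, obtaining test prediction results from the test data using the trained models;

step S44, making predictions using the trained models.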

In step S5, the Bayesian dynamic linear model includes an observation equation, a state equation and initial information. The prediction distribution is regarded as a conditional probability distribution; the prediction distribution is obtained from the prior information, the posterior information is obtained using the Bayesian formula, and the prior information is corrected to obtain the predicted value.

The performance and the optimization target of the LSTM neural network model are defined by a loss function, which measures the degree of inconsistency between the predicted values of the network and the true values. The optimization problem is to minimize the loss function: the network parameters are adjusted according to how close the predicted values are to the true values until the best model is obtained. Air quality prediction is a regression problem, so a mean square error loss function is adopted, defined as follows:

$$L = \frac{1}{n}\sum_{i=1}^{n}(a_i - y_i)^2$$

where a is the predicted value, y is the sample value, and n is the number of samples.

The recurrent neural network model also comprises an Adam algorithm and a Dropout algorithm.

The Adam algorithm designs independent adaptive learning rates for different parameters by computing first-moment and second-moment estimates of the gradient. It combines the advantages of the adaptive gradient algorithm (AdaGrad) and the root mean square propagation algorithm (RMSProp): like RMSProp, it scales each parameter's learning rate by a running average of the squared gradient (the second moment), and in addition it uses a running average of the gradient itself (the first moment). The Adam algorithm copes with demanding conditions such as sparse gradients, non-stationary objectives and noise, is computationally fast, adjusts its step sizes automatically, and is suitable for most applications.
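The update rule can be written out in a few lines. The sketch below uses the standard Adam formulas with bias correction; the hyperparameter defaults and the helper name `adam_minimize` are illustrative assumptions, not values taken from the source.

```python
import math


def adam_minimize(grad_fn, params, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a function with Adam, given a function returning its gradient."""
    params = list(params)
    m = [0.0] * len(params)  # first-moment estimates (running mean of gradients)
    v = [0.0] * len(params)  # second-moment estimates (running mean of squares)
    for t in range(1, steps + 1):
        grads = grad_fn(params)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            # Each parameter gets its own effective step size.
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


# Gradient of (p0 - 3)^2 + (p1 + 1)^2, whose minimum is at (3, -1).
solution = adam_minimize(lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)], [0.0, 0.0])
```

In practice the gradient comes from backpropagating the mean square error loss through the network, and a deep learning framework's built-in Adam optimizer would be used.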

The Dropout algorithm effectively reduces overfitting and improves prediction accuracy. When a complex feedforward neural network is trained on a small sample, the trained model overfits easily. During training, the Dropout algorithm randomly discards a portion of the network units and temporarily removes them from the training process: during forward propagation, the activation of a given neuron is switched off with a certain probability p. This makes the model generalize better, reduces the training load and improves training speed.
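A minimal sketch of the forward pass with Dropout follows. It assumes the common "inverted dropout" variant, in which surviving activations are rescaled by 1 / (1 - p) so that no rescaling is needed at prediction time; the source only states that units are dropped with probability p.

```python
import random


def dropout(activations, p, training=True, rng=random):
    """Zero each activation with probability p during training."""
    if not training or p == 0.0:
        return list(activations)  # prediction time: all units stay active
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


train_out = dropout([1.0] * 1000, p=0.5, rng=random.Random(0))
eval_out = dropout([1.0, 2.0, 3.0], p=0.5, training=False)
```

With p = 0.5, roughly half of the outputs are zero and the rest are doubled, so the expected sum of the layer's activations is unchanged.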

After the data are prepared and the model and parameters are set, the deep learning model is trained and validated repeatedly until a model that best fits the target is obtained.

Step S8, obtaining MEO data;

Step S9, carrying out correlation analysis based on the MEO data and the AQI data: the correlation between the monitored meteorological data and the air quality data is analyzed using the Spearman correlation coefficient. Meteorological conditions are among the important factors constraining air quality, and influence the generation, diffusion and transport of air pollutants. The Spearman correlation coefficient method is adopted to analyze the relationship between the AQI, the six air pollutants and the meteorological factors. The Spearman correlation coefficient evaluates the correlation of two statistical variables using a monotonic function: when the two variables are perfectly monotonically correlated, the coefficient is +1 or -1, and a coefficient of 0 indicates that the two variables are uncorrelated.

Step S10, carrying out backward trajectory and potential source contribution analysis based on the MEO data and the AQI data: the potential source regions affecting the pollutant concentration, and the contribution of the different source regions to that concentration, are analyzed. The backward trajectory model analyzes pollutant diffusion and movement paths from meteorological parameters such as temperature, air pressure and wind direction, and is widely used in research on pollutant transport paths. The potential source contribution factor (PSCF) analysis method combines backward trajectories with pollutant concentrations to analyze the potential sources and distribution of a specific pollutant. The method divides the study area into i × j grids according to longitude and latitude, records the number of all airflow trajectories passing through a given grid (i, j) as n_ij, and records the number of pollution trajectories passing through grid (i, j) as m_ij.

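The correlation analysis in step S9 specifically includes: taking PM2.5 and PM10 each as the first variable, and weather, temperature, air pressure, humidity, wind speed and wind direction as the second variable, and substituting them into the following formula:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \sum_{i}(y_i - \bar{y})^2}}$$

wherein x_i and y_i are the two variables whose correlation is compared, x̄ is the mean of the variable x_i, ȳ is the mean of the variable y_i, and r is the Spearman correlation coefficient.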
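The Spearman coefficient is the ordinary product-moment correlation computed on the ranks of the observations rather than on the raw values. The sketch below makes that explicit; it assumes tied values receive their average rank, and the function names are illustrative.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: product-moment correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# A monotonic but nonlinear relation still gives a coefficient of 1.
r_up = spearman([1, 2, 3, 4], [1, 4, 9, 16])
r_down = spearman([1, 2, 3, 4], [8, 6, 3, 1])
```

Categorical inputs such as weather type or wind direction would need a numeric encoding before this can be applied; the source does not describe one.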

The backward trajectory and potential source contribution analysis in step S10 specifically includes: dividing the study area into i × j grids according to longitude and latitude, wherein the PSCF calculation formula is:

$$PSCF_{ij} = \frac{m_{ij}}{n_{ij}}$$

wherein n_ij is the number of all airflow trajectories passing through a given grid (i, j), and m_ij is the number of pollution trajectories passing through grid (i, j).
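The counting behind the PSCF can be sketched as follows. The sketch assumes each backward trajectory is given as a list of (latitude, longitude) points together with the pollutant concentration measured at the receptor when the air mass arrived, and that a trajectory counts as polluted when that concentration exceeds a threshold; the grid size, the threshold and the function name `pscf` are hypothetical.

```python
import math
from collections import defaultdict


def pscf(trajectories, threshold, cell_deg=0.5):
    """Return {grid cell: m_ij / n_ij} for every cell crossed by a trajectory."""
    n = defaultdict(int)  # all trajectories through the cell
    m = defaultdict(int)  # polluted trajectories through the cell
    for points, concentration in trajectories:
        # A trajectory is counted once per cell, however long it stays there.
        cells = {
            (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            for lat, lon in points
        }
        for cell in cells:
            n[cell] += 1
            if concentration > threshold:
                m[cell] += 1
    return {cell: m[cell] / n[cell] for cell in n}


result = pscf(
    [
        ([(30.1, 120.1), (30.6, 120.1)], 90.0),  # polluted arrival
        ([(30.1, 120.2)], 20.0),                 # clean arrival
    ],
    threshold=75.0,
)
```

Cells with a PSCF value close to 1 are crossed mostly by polluted trajectories and are therefore candidate source regions. Cells with very small n_ij give unreliable ratios, so published PSCF studies usually apply a down-weighting function, which is omitted here.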

The second embodiment is as follows:

the spatial correlation among atmospheric pollutants is studied, and a spatial conversion method is provided. Through spatial partitioning, spatial aggregation and spatial interpolation, the area around the target monitoring station is divided into regions so that each region provides air quality data and meteorological data in the same format. The spatially sparse air quality data are thereby converted into uniform, consistent inputs, and the features among the spatial data are extracted.

By collecting historical air quality observation data and meteorological data, the set S = {S₁, S₂, S₃, ..., S_n} of the central monitoring station of the target area and the monitoring stations in adjacent areas is obtained, together with the historical air quality monitoring data of each monitoring station and the historical meteorological monitoring data of each monitoring station. The three are used as the input of the deep learning model to obtain the air quality data of the central monitoring point of the target area for a future period of time.

Atmospheric pollutants drift across a wide geographic space and, under the influence of time and terrain, are in a state of continuous movement and diffusion. Therefore, to predict the air quality index of the target area for the next 48 hours, it is necessary not only to consider in detail the historical air quality index and the historical meteorological monitoring data of the target area, but also to take the same two kinds of data from the surrounding area S = {S₁, S₂, S₃, ..., S_n} into account and to consider their spatial correlation comprehensively.

1) Diffusivity of atmospheric pollution. Because atmospheric pollutants are scattered across different places and diffuse and migrate over time under the regional geographic conditions, data from neighboring areas provide additional information for prediction.

2) Spatial correlation. Spatial partitioning merges the dispersed air quality data around a target region, with nearer regions having finer granularity and farther regions having coarser granularity. In addition, regions at different distances have different effects as a function of distance.

3) Scalability. Compared with conventional spatial aggregation methods, the method reduces complexity by fixing an upper limit on the input (the number of regions). In addition, spatial interpolation overcomes spatial sparsity by filling the missing values of the partitioned regions and generating consistent inputs for all monitoring stations. This makes it possible to train one model on the data of different stations together, which increases the accuracy of the model to a certain extent.

The process of the spatial conversion method is as follows. First, the target air quality monitoring station to be predicted is taken as the center of a circle, and an inner monitoring area is generated with a first radius of 5 kilometers. An outer ring is generated with a second radius of 20 kilometers, and the area outside the inner monitoring area and inside the outer ring is taken as the outer monitoring area. All monitoring stations in the inner monitoring area are connected to the target monitoring point, and the inner monitoring angles formed at the target monitoring point by each pair of adjacent monitoring stations are obtained. The bisector of the smallest of these inner monitoring angles is taken as the starting axis, and every 45 degrees forms one inner sector area, giving 8 inner sector areas. In the same way, all monitoring stations in the outer monitoring area are connected to the target monitoring point, and the outer monitoring angles formed at the target monitoring point by each pair of adjacent monitoring stations are obtained. The bisector of the smallest of these outer monitoring angles is taken as the starting axis, and every 45 degrees forms one outer sector area, giving 8 outer sector areas.

Therefore, monitoring stations are arranged in each sector area as much as possible, the use of virtual monitoring stations is reduced, and the accuracy is improved.
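Assigning a station to one of the 16 regions can be sketched as below. The sketch assumes station positions are already expressed as east and north offsets in kilometers from the target station, that the two starting axes have been determined beforehand and are passed in as angles, and that the 5 km boundary belongs to the inner area; the function and parameter names are hypothetical.

```python
import math


def assign_region(dx_km, dy_km, inner_km=5.0, outer_km=20.0,
                  inner_start_deg=0.0, outer_start_deg=0.0):
    """Return (ring, sector index 0..7) for a station, or None if out of range.

    dx_km, dy_km: east and north offsets of the station from the target.
    *_start_deg:  starting axis of each ring, counter-clockwise from east.
    """
    dist = math.hypot(dx_km, dy_km)
    if dist == 0.0 or dist > outer_km:
        return None  # the target itself, or beyond the 20 km outer ring
    if dist <= inner_km:
        ring, start = "inner", inner_start_deg
    else:
        ring, start = "outer", outer_start_deg
    bearing = math.degrees(math.atan2(dy_km, dx_km)) % 360.0
    sector = int(((bearing - start) % 360.0) // 45.0)
    return ring, sector


near = assign_region(3.0, 0.0)
north = assign_region(0.0, 10.0)
```

Sectors that receive no station after this assignment are the ones that need a virtual monitoring station.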

Then, each sector area is examined. If one or more monitoring stations exist in an area, weights are assigned to the recorded data of each monitoring station in the area according to its distance from the target monitoring station, and a regression operation is performed to obtain the average monitoring data of the area. If the area has no monitoring station, a virtual monitoring station is generated at the center of the area, and its data are interpolated using a classical spatial interpolation method, inverse distance weighting (IDW).

The key point of this method is to designate one feature as the primary feature and the other features as auxiliary features. The primary features are the historical air quality index and the historical meteorological monitoring data of the target monitoring station; these data and the predicted target data all come from the same monitoring station. The auxiliary features are the corresponding air quality and meteorological data from the monitoring stations in the 16 surrounding sectors.

Spatial aggregation algorithm: when the space is partitioned, the uneven geographic distribution of monitoring stations and other limiting factors may leave several monitoring stations in some areas, which produces excess data and increases redundancy. Weights are therefore assigned to the recorded data of each monitoring station in the area and a regression operation is performed to obtain the average monitoring data of the area, calculated using the following formula:

wherein y is the average monitoring data of the area and W denotes the weights, whose values are determined by the distance between each monitoring point in the area and the target monitoring point.
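Because the aggregation formula itself is not legible in this text, the following is only one plausible reading: a normalized weighted average in which each station's weight is the inverse of its distance to the target monitoring point. Both the weighting scheme and the function name are assumptions.

```python
def aggregate_region(readings):
    """Distance-weighted average of the stations in one sector.

    readings: list of (value, distance_km_to_target) with distance > 0.
    Assumption: weight = 1 / distance, normalized to sum to 1.
    """
    weights = [1.0 / distance for _, distance in readings]
    total = sum(weights)
    return sum(w * value for w, (value, _) in zip(weights, readings)) / total


# The nearer station (1 km) counts twice as much as the farther one (2 km).
region_value = aggregate_region([(100.0, 1.0), (40.0, 2.0)])
```

Any other weighting that decreases with distance would fit the description equally well.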

Spatial interpolation algorithm: when the space is partitioned, some of the areas around a remote target monitoring station may contain no monitoring station. A virtual monitoring station is generated in such an area to fill in its missing values, and the data of the virtual monitoring station are generated from the data captured by the monitoring stations in the surrounding areas. The inverse distance weighting method is used, which computes the value assigned to an unknown point as a linearly weighted combination of the available values at known points, using the following formula:

where Z(x, y) is the interpolated prediction output, (x, y) are the coordinates of the interpolation point, (x_i, y_i) are the coordinates of the discrete points, and w_i is the weight of each discrete point.
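Inverse distance weighting can be sketched as below, assuming planar coordinates and weights w_i = 1 / d_i^p normalized to sum to 1. The power p = 2 is a common default rather than a value given in the source, and the function name is illustrative.

```python
import math


def idw(x, y, samples, power=2.0):
    """Interpolate the value at (x, y) from known points.

    samples: list of (x_i, y_i, z_i) for the real monitoring stations.
    """
    numerator = denominator = 0.0
    for xi, yi, zi in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return zi  # the point coincides with a real station
        weight = 1.0 / distance ** power
        numerator += weight * zi
        denominator += weight
    return numerator / denominator


# A virtual station midway between two real stations gets their average.
virtual_value = idw(0.0, 0.0, [(1.0, 0.0, 10.0), (-1.0, 0.0, 20.0)])
```

A larger power makes the nearest stations dominate the interpolated value.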

The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims

1. An air quality analysis and prediction method based on deep learning and Bayesian model, characterized in that it includes:

Step S1: obtain the AQI data of the target monitoring point;

Step S2: preprocessing the AQI data, identifying and removing outliers in the data sequence according to the Laida (3σ) criterion, and using linear interpolation to complete data missing at a given moment;

Step S3: normalizing the AQI data;

Step S4: respectively constructing a deep learning convolutional network model, a recurrent neural network model and a Bayesian dynamic linear model;

Step S5: inputting the normalized AQI data into the deep learning convolutional network model and the Bayesian dynamic linear model respectively, wherein after the deep learning convolutional network model runs, the long input sequence is converted into a short sequence composed of high-level features, and the Bayesian dynamic linear model outputs first predicted AQI data after running;

Step S6: inputting the sequence composed of the features extracted by the deep learning convolutional network model into the recurrent neural network model, the recurrent neural network model outputting second predicted AQI data after running;

Step S7: constructing a mixed model, and inputting the first predicted AQI data and the second predicted AQI data into the mixed model, the mixed model outputting the final predicted AQI data after running.

2. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 1, characterized in that: the normalization processing in step S3 keeps the value range of the data within a relatively small fluctuation range, reduces the influence of different orders of magnitude or different dimensions on the data, assumes the feature distribution to be normal, and maps the features to the standard normal distribution using the mean and the standard deviation. The calculation formula is:

$$y' = \frac{y - y_{mean}}{y_{std}}$$

where y is a sample value, y' is its normalized value, y_mean is the mean of all sample data, and y_std is the standard deviation of all sample data.

3. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 1, characterized in that step S4 specifically includes:

Step S41, for the constructed models, selecting training data and test data from the AQI data, and completing the initialization of the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model;

Step S42, using the training data to train the deep learning convolutional network model, the recurrent neural network model and the Bayesian dynamic linear model;

Step S43, using the trained deep learning convolutional network model, recurrent neural network model and Bayesian dynamic linear model to obtain test prediction results from the test data;

Step S44, using the trained deep learning convolutional network model, recurrent neural network model and Bayesian dynamic linear model for prediction.

4. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 3, characterized in that: in step S5, the Bayesian dynamic linear model comprises an observation equation, a state equation and initial information; the prediction distribution is regarded as a conditional probability distribution, the prediction distribution is obtained from the prior information, the posterior information is obtained using the Bayesian formula, and the prior information is corrected to obtain the predicted value.

5. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 3, characterized in that: for the recurrent neural network model, the loss function of the training stage is the mean square error:

$$L = \frac{1}{n}\sum_{i=1}^{n}(a_i - y_i)^2$$

where a is the predicted value, y is the sample value, and n is the number of samples.

6. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 3, characterized in that: the recurrent neural network model further includes an Adam algorithm and a Dropout algorithm;

The Adam algorithm is used to calculate the first-moment estimate and the second-moment estimate of the gradient and to design independent adaptive learning rates for different parameters;

The Dropout algorithm is used to reduce the dependencies between features and reduce the probability of overfitting.

7. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 1, characterized in that it further comprises: Step S8, acquiring MEO data;

Step S9, carrying out correlation analysis based on the MEO data and the AQI data;

Step S10, performing backward trajectory and potential source contribution analysis based on the MEO data and the AQI data;

Step S11, combining the correlation analysis result and the backward trajectory and potential source contribution analysis result with the final predicted AQI data to obtain comprehensive improvement suggestions.

8. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 7, characterized in that: the correlation analysis in step S9 specifically comprises: taking PM2.5 and PM10 each as the first variable, and weather, temperature, air pressure, humidity, wind speed and wind direction as the second variable, and substituting them into the following formula:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \sum_{i}(y_i - \bar{y})^2}}$$

where x_i and y_i are the two variables whose correlation is compared, x̄ is the mean of the variable x_i, ȳ is the mean of the variable y_i, and r is the Spearman correlation coefficient. When the two variables are perfectly monotonically correlated, r is +1 or -1, and when the two variables are uncorrelated, r is 0.

9. The air quality analysis and prediction method based on deep learning and Bayesian model according to claim 7, characterized in that: the backward trajectory and potential source contribution analysis in step S10 specifically includes: dividing the study area into i × j grids according to longitude and latitude, the PSCF calculation formula being:

$$PSCF_{ij} = \frac{m_{ij}}{n_{ij}}$$

where n_ij is the number of all airflow trajectories passing through a grid (i, j), and m_ij is the number of pollution trajectories passing through the grid (i, j).