CN109685249A

CN109685249A - Air PM2.5 concentration prediction method based on AutoEncoder and BiLSTM fused neural network

Info

Publication number: CN109685249A
Application number: CN201811411059.4A
Authority: CN
Inventors: 张波; 张旱文; 李美子; 赵勤; 秦东明
Original assignee: Shanghai Normal University
Current assignee: Shanghai Normal University; University of Shanghai for Science and Technology
Priority date: 2018-11-24
Filing date: 2018-11-24
Publication date: 2019-04-26

Abstract

The present invention relates to an air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network, comprising: step S1: based on environmental monitoring data on PM2.5 pollutant concentration and meteorological factors, taking PM2.5 as the target pollutant to be predicted and constructing a PM2.5 concentration prediction model for the target city; step S2: selecting training and test data from the environmental monitoring data and completing the initialization and training of the prediction model; step S3: predicting the air PM2.5 concentration using the trained model. Compared with the prior art, the present invention can analyze the features of pollution data in depth and thereby extract deep relationships within the data, make effective use of environmental big data, and improve the level of environmental management.

Description

Air PM2.5 concentration prediction method based on AutoEncoder and BiLSTM fused neural network

Technical field

The present invention relates to a method for predicting urban air pollutant concentration, and more particularly to an air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network.

Background technique

Today, many developing countries still suffer from air pollution. For example, in 2016 the air quality in the large cities of northern China fell below the national healthy air quality standard on more than 40% of days. Air pollution causes many serious health problems, such as respiratory disease, cardiovascular disease, and reduced lung function, so controlling it has attracted wide public attention. The U.S. Environmental Protection Agency (EPA) lists particulate matter (PM), ground-level ozone (O3), carbon monoxide (CO), sulfur dioxide (SO2), and nitrogen oxides (NOx) as key pollutants in the National Ambient Air Quality Standards. Among these pollutants, particulate matter (PM) is the most dangerous because it has the greatest negative effect on public health and the environment; according to information published by the World Health Organization, PM2.5 (aerodynamic diameter less than 2.5 μm) even increases the risk of death. Under these circumstances, accurately predicting the PM2.5 concentration is of great significance for providing timely and complete environmental quality information and thereby protecting public health. Accurate prediction of pollutant concentration requires making full use of the large volume of data now available.

At present, many researchers at home and abroad study pollutant concentration prediction, and most of them use conventional model-based prediction methods that do not incorporate deep learning. For example, the conventional CMAQ prediction model released by the U.S. EPA evaluates the degree of air pollution by simulating the concentrations of various pollutants in the air and predicts future air conditions. Zhang Kaimei et al. predicted PM2.5 in Nanchang using statistical analysis and meteorological statistical methods, and established statistical relationships between PM2.5 mass concentration and meteorological elements for summer and non-summer periods. Giorgio Corani et al. designed a multi-label classification model for air pollution forecasting that uses a Bayesian algorithm to predict multiple pollutant concentrations. Kuremoto et al. performed time series forecasting with a deep network composed of restricted Boltzmann machines; by introducing time series, prediction accuracy improved over conventional models. B. T. Ong et al. predicted pollutant concentrations with a stacked autoencoder and an RNN model: the stacked autoencoder serves as the bottom layer of the prediction model and its output is the input to the upper RNN network, and dynamic training lets the network adjust its weights gradually along the time series so that the model reaches a desirable state, giving better results than other conventional machine learning models.

The conventional model-based and machine learning prediction methods above each have their own advantages in predicting air pollutant concentration. However, none of these methods analyzes the data in greater depth, so none can extract the deep relationships within the data. In addition, pollutant concentration is correlated with other meteorological variables, whereas conventional prediction methods usually predict from historical data and experience and cannot verify the influence of other meteorological data on pollutant concentration.

Summary of the invention

The object of the present invention is to overcome the above drawbacks of the prior art by providing an air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network.

The object of the present invention can be achieved through the following technical solution:

An air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network, comprising:

Step S1: based on environmental monitoring data on PM2.5 pollutant concentration and meteorological factors, take PM2.5 as the target pollutant to be predicted and construct a PM2.5 concentration prediction model for the target city;

Step S2: select training and test data from the environmental monitoring data, and complete the initialization and training of the prediction model;

Step S3: predict the air PM2.5 concentration using the trained model.

The model uses an AutoEncoder as the bottom layer and a BiLSTM as the upper layer. The bottom layer compresses the input data and extracts its features, and its result serves as the input to the upper layer, which extracts time series features and generates the final prediction.

Step S2 specifically comprises:

Step S21: normalize the data used for modeling, where the data used for modeling comprises raw pollutant data and raw meteorological data;

Step S22: divide the data set into a training set, a validation set, and a test set in the proportions 80%, 10%, and 10%;

Step S23: train the model using the training set data;

Step S24: verify the generalization ability of the trained prediction model using the validation set and test set data.
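The 80%/10%/10% division of step S22 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `split_dataset` is hypothetical, and the split is assumed to be chronological (no shuffling), which the text does not specify.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    rows: Sequence[T], train_pct: int = 80, val_pct: int = 10
) -> Tuple[List[T], List[T], List[T]]:
    """Split rows into training, validation and test sets by percentage.

    Assumption: rows are in time order and are split without shuffling,
    so the test set holds the most recent observations.
    """
    if train_pct < 0 or val_pct < 0 or train_pct + val_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    n = len(rows)
    n_train = n * train_pct // 100  # integer arithmetic avoids float rounding
    n_val = n * val_pct // 100
    train = list(rows[:n_train])
    val = list(rows[n_train:n_train + n_val])
    test = list(rows[n_train + n_val:])  # the remainder, 10% by default
    return train, val, test
```

Keeping the split chronological prevents future observations from leaking into the training set, which matters for time series forecasting; a shuffled split would be a different design choice.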

In step S21, normalization is performed using the min-max method:

NewVariable = (variable − min(variable)) / (max(variable) − min(variable))

where NewVariable is the normalized value, max(variable) is the maximum of the data, min(variable) is the minimum of the data, and variable is the original value.
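As a minimal sketch of the min-max normalization just defined, the function below scales one column of measurements to the interval [0, 1]. The name `min_max_normalize` and the handling of a constant column are assumptions made for illustration; the text does not say what to do when the maximum equals the minimum.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] via (v - min) / (max - min)."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant column would divide by zero,
        # so every entry is mapped to 0.0 instead.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_normalize([10.0, 20.0, 30.0])` maps the smallest value to 0.0 and the largest to 1.0. In practice each pollutant or meteorological column would be normalized separately, because the columns have different units.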

Step S23 specifically comprises:

Step S231: convert the raw data into a two-dimensional matrix X;

Step S232: recombine the two-dimensional matrix to obtain the required combined data:

X' = X * θ(β₁, β₂, …, β_n)

where X' is the data obtained by recombining the two-dimensional matrix X, θ is a two-dimensional matrix composed of 0s and 1s, and each β_i is either a zero vector or a ones vector;
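The recombination in step S232 can be read as a feature-selection mask. The sketch below rests on that reading, which is an assumption: each β_j is taken to be one column of θ, and `*` is taken to be the element-wise product, so a zero vector removes a variable and a ones vector keeps it. The names `build_theta` and `recombine` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def build_theta(n_rows: int, keep: Sequence[bool]) -> Matrix:
    """Build theta column by column: beta_j is all ones if keep[j], else all zeros."""
    return [[1.0 if k else 0.0 for k in keep] for _ in range(n_rows)]


def recombine(X: Matrix, keep: Sequence[bool]) -> Matrix:
    """Return X' = X * theta, interpreted as an element-wise product."""
    if any(len(row) != len(keep) for row in X):
        raise ValueError("keep must have one entry per column of X")
    theta = build_theta(len(X), keep)
    return [[x * t for x, t in zip(row, t_row)] for row, t_row in zip(X, theta)]
```

Under this reading, training the model with different `keep` patterns is one way to compare how much each meteorological variable contributes to the prediction.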

Step S233: compress the combined data and extract its features, then reconstruct the combined data:

X'' = σ'(W'Y + b')

Y = σ(WX' + b)

where σ is the ReLU activation function, W is the weight matrix, and b is the bias vector of the encoding stage; X''(i, j) is the reconstructed combined data, σ' is the ReLU activation function, W' is the weight matrix of the decoding stage, Y is the encoded result, and b' is the bias vector of the decoding stage.

Step S234: train the AutoEncoder of the model using the recombined data until the error is less than a set threshold, then execute step S235;

Step S235: train the entire model.

Reconstructed error are as follows:

∈ (X ', X ")=| | X '-σ ' (W ' (σ (WX '+b))+b ') | |²

Wherein: ∈ (X ', X ") is reconstructed error, | | | | it is norm.
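The encoding, decoding, and reconstruction-error formulas can be illustrated with a single-layer forward pass on one input row. This is a sketch under stated assumptions: the function names are hypothetical, only one encoding and one decoding layer are shown, and the norm is taken to be the Euclidean norm, which the text does not specify.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU, used for both sigma and sigma'."""
    return [max(0.0, x) for x in v]


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def encode(W: Matrix, b: Vector, x: Vector) -> Vector:
    """Y = sigma(W X' + b)."""
    return relu(affine(W, x, b))


def decode(W_dec: Matrix, b_dec: Vector, y: Vector) -> Vector:
    """X'' = sigma'(W' Y + b')."""
    return relu(affine(W_dec, y, b_dec))


def reconstruction_error(x: Vector, x_rec: Vector) -> float:
    """Squared Euclidean norm ||X' - X''||^2 (assumed choice of norm)."""
    return sum((a - c) ** 2 for a, c in zip(x, x_rec))
```

With a 3-to-2 encoder that keeps the first two inputs and a decoder that restores them, the third input is lost and the error equals its square, which shows how the error measures the information discarded by compression.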

Step S24 specifically comprises:

Step S241: input the test set data into the model to obtain the prediction results;

Step S242: calculate the correlation coefficient:

ρ(O, P) = Cov(O, P) / (√D(O) · √D(P))

where ρ(O, P) is the correlation coefficient, Cov(O, P) is the covariance of the observed and predicted values, D(O) is the variance of the observed values, and D(P) is the variance of the predicted values;

Step S243: calculate the root-mean-square error:

RMSE = √((1/n) Σ_{i=1}^{n} (O_i − P_i)²)

where RMSE is the root-mean-square error, O_i is the observed value, P_i is the value predicted by the model, i is the time index, and n is the total duration of the prediction.
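The two evaluation metrics of steps S242 and S243 can be sketched as below. The function names are hypothetical, and population (divide-by-n) variance and covariance are assumed; the choice does not affect ρ because the factor cancels.

```python
import math
from typing import Sequence


def correlation(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """rho(O, P) = Cov(O, P) / (sqrt(D(O)) * sqrt(D(P)))."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    if var_o == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

A correlation close to 1 indicates that the predictions follow the trend of the observations, while a small RMSE indicates that they are also close in magnitude; the two metrics are complementary.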

Compared with the prior art, the present invention has the following advantages:

1) The influence of other meteorological variables on pollutant concentration prediction is taken into account, and the meteorological variables that affect pollutant concentration the most are identified.

2) Pollutant prediction does not rely solely on experience summarized from large amounts of historical data to infer how pollutants change, so the complex and changeable nature of the atmospheric environment can be fully considered.

3) The features of pollution data can be analyzed in depth, so that deep relationships within the data can be extracted, environmental big data can be used effectively, and the level of environmental management can be improved.

4) Prediction accuracy is higher than that of conventional prediction methods; under the same running time and operating conditions, better results are produced.

Description of the drawings

Fig. 1 is a flow diagram of the main steps of the method of the present invention;

Fig. 2 is a flow diagram of an embodiment of the present invention;

Fig. 3 is a structural diagram of the prediction model constructed by the present invention.

Specific embodiment

The present invention is described in detail below with reference to the drawings and a specific embodiment. This embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the present invention is not limited to the following embodiment.

The present invention first defines air pollutant concentration prediction:

Definition 1, air pollutant concentration prediction: predicting the concentration of the air pollutant PM2.5 over a certain future period, mainly from historical pollutant data and related meteorological data. It is a topic of focused research in environmental science, meteorological science, computer science, and other fields, and is therefore interdisciplinary to some degree.

Definition 2, conventional prediction methods: air pollutant concentration prediction methods that do not use deep learning are collectively called conventional prediction methods. Examples include prediction with empirical models based on historical data and statistical methods; prediction with probabilistic models built on statistical and mathematical methods or models; prediction with ensemble methods; and prediction models built on conventional machine learning. All of these are conventional prediction methods.

The method of predicting PM2.5 concentration with the prediction model of the present invention, based on the fused AutoEncoder and BiLSTM neural network, is given below. As shown in Fig. 1 and Fig. 2, it comprises:

Step S1: based on environmental monitoring data on PM2.5 pollutant concentration and meteorological factors, take PM2.5 as the target pollutant to be predicted and construct a PM2.5 concentration prediction model for the target city. The model uses an AutoEncoder as the bottom layer and a BiLSTM as the upper layer; the bottom layer compresses the input data and extracts its features, and its result serves as the input to the upper layer, which extracts time series features and generates the final prediction.

In Fig. 2, the AutoEncoder is trained before the entire model. Its loss function is MSE, and the backpropagation algorithm is used to propagate errors and update the network connection weights. The trained AutoEncoder is then added to the entire model for training. The input two-dimensional matrix undergoes feature extraction and compression in the Encoder of the AutoEncoder, and the result is fed into the BiLSTM, which extracts time series features and outputs the final prediction. The activation function of the autoencoder network is ReLU and its loss function is MSE; the activation function of the BiLSTM is tanh and its loss function is MSE; the Adam optimization algorithm is used to update the neural network weights.

Step S2: select training and test data from the environmental monitoring data, and complete the initialization and training of the prediction model, specifically comprising:

Step S21: because pollutant data and meteorological data have different units, the data used for modeling is normalized to improve the training speed and prediction accuracy of the model. The present invention uses the min-max method:

NewVariable = (variable − min(variable)) / (max(variable) − min(variable)) (1)

where NewVariable is the normalized value, max(variable) is the maximum of the data, min(variable) is the minimum of the data, and variable is the original value. Given the mean μ and standard deviation σ of the raw data, z-score standardization would instead yield data with μ = 0 and σ = 1; the min-max method used here maps every value into the interval [0, 1].

The error threshold of the model is set reasonably, with a value between 0.0001 and 0.001; the learning rate takes a value between 0.01 and 0.3; the maximum number of iterations is 1000; the self-loop coefficient of the BiLSTM is 0.006, λ is 1e-6, and ζ is 0.9. For the AutoEncoder network, the encoding and decoding parts each have two layers, the activation function is ReLU, the loss function is MSE, and the optimizer is gradient descent (SGD). The BiLSTM network has a single layer with 50 neurons.
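The hyperparameters listed above can be collected in one validated configuration object. This is only an organizational sketch: the class and field names are hypothetical, the default error threshold and learning rate are arbitrary picks from within the stated ranges, and the roles of the self-loop coefficient, λ, and ζ are not defined in the text, so they are stored without interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Defaults chosen inside the stated ranges; the specific picks are assumptions.
    error_threshold: float = 0.001   # stated range: 0.0001 to 0.001
    learning_rate: float = 0.01      # stated range: 0.01 to 0.3
    # Values stated directly in the text.
    max_iterations: int = 1000
    bilstm_self_loop_coefficient: float = 0.006
    lambda_: float = 1e-6            # role not defined in the text
    zeta: float = 0.9                # role not defined in the text
    encoder_layers: int = 2
    decoder_layers: int = 2
    autoencoder_activation: str = "relu"
    autoencoder_loss: str = "mse"
    autoencoder_optimizer: str = "sgd"
    bilstm_layers: int = 1
    bilstm_units: int = 50

    def __post_init__(self) -> None:
        # Reject values outside the ranges given in the description.
        if not 0.0001 <= self.error_threshold <= 0.001:
            raise ValueError("error_threshold outside the stated range")
        if not 0.01 <= self.learning_rate <= 0.3:
            raise ValueError("learning_rate outside the stated range")
        if self.max_iterations <= 0:
            raise ValueError("max_iterations must be positive")
```

Validating the ranges at construction time makes an out-of-range setting fail immediately rather than partway through training.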

Step S23: train the model using the training set data;

First, the pollutant data and meteorological data of the input training set are converted into a two-dimensional matrix, in which each row holds the pollutant information and the various meteorological data, and each column holds one specific kind of pollutant information or meteorological data.

Next, the loss functions are defined:

1. For the AutoEncoder network, the loss function of its training stage is defined as:

MSE = (1/N) Σ_{i=1}^{N} (O_i − P_i)² (2)

where O_i is the true value of the target pollutant, P_i is the predicted value of the target pollutant, i is the time index, and N is the total duration of the prediction. The smaller the MSE, the higher the prediction accuracy.

2. For the entire model, the loss function of its training stage is defined as:

RMSE = √((1/N) Σ_{i=1}^{N} (O_i − P_i)²) (3)

This function is the root-mean-square error. In the formula, O_i is the true value of the target pollutant, P_i is the predicted value of the target pollutant, i is the time index, and N is the total duration of the prediction. The smaller the loss, the higher the prediction accuracy.

The training process then specifically comprises:

Step S231: convert the raw data into a two-dimensional matrix X;

Step S232: recombine the two-dimensional matrix to obtain the required combined data:

X' = X * θ(β₁, β₂, …, β_n) (5)

where X' is the data obtained by recombining the two-dimensional matrix X, θ is a two-dimensional matrix composed of 0s and 1s, and each β_i is either a zero vector or a ones vector;

Step S233: compress the combined data and extract its features, then reconstruct the combined data:

X'' = σ'(W'Y + b') (6)

Y = σ(WX' + b) (7)

where σ is the ReLU activation function, W is the weight matrix, and b is the bias vector of the encoding stage; X''(i, j) is the reconstructed combined data, σ' is the ReLU activation function, W' is the weight matrix of the decoding stage, Y is the encoded result, and b' is the bias vector of the decoding stage.

To train the autoencoder, the reconstruction error must be minimized. The reconstruction error is calculated as follows:

ε(X', X'') = ||X' − σ'(W'(σ(WX' + b)) + b')||² (8)

where ε(X', X'') is the reconstruction error and ||·|| is the norm.

The two-dimensional matrix input at this stage mainly contains the following factors: {PM2.5 concentration, temperature, wind speed, dew point temperature, snowfall, precipitation, other meteorological variables}. The input matrix is compressed to obtain the essential data features, so that the network can accurately translate input values into pollutant concentrations and establish a mapping from input to output. The accuracy of the auto-encoding is measured with formula (8): the compressed features are decoded, the decoded data is compared with the uncompressed data, and training continues so that the error becomes smaller and smaller. Once the network meets expectations, the first training stage stops and the second training stage begins.
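The stopping rule of the first training stage (iterate until the reconstruction error falls below a set threshold, or the maximum number of iterations is reached) can be sketched generically. The helper `train_until` and its `step` callback are hypothetical; `step` stands for one training iteration of any autoencoder implementation and returns the current reconstruction error.

```python
from typing import Callable, Tuple


def train_until(
    step: Callable[[], float], threshold: float, max_iterations: int
) -> Tuple[int, float]:
    """Run training steps until the error drops below threshold.

    Returns the number of iterations performed and the last error seen.
    Stops early as soon as error < threshold; otherwise stops after
    max_iterations steps.
    """
    error = float("inf")
    iterations_done = 0
    for iteration in range(1, max_iterations + 1):
        error = step()  # one training pass; returns the reconstruction error
        iterations_done = iteration
        if error < threshold:
            break
    return iterations_done, error
```

A caller would start the second stage (training the entire model) only after this loop returns; checking whether the returned error is actually below the threshold distinguishes convergence from hitting the iteration limit.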

Step S235: train the entire model. After compression and feature extraction by the AutoEncoder, the two-dimensional input matrix is converted into highly condensed one-dimensional vectors with temporal order, which serve as the input to the BiLSTM layer. The model performs time series prediction: the values of the d hours before time t are the input of the entire model, and the prediction target is the PM2.5 concentration N hours after time t (d and N are both preset time windows). Let x denote the input, t the time step, U and W the weight matrices, h the hidden-layer information, and b the bias. The following formulas describe the training process in one direction of the BiLSTM; the process in the opposite direction is similar:
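The pairing of a d-hour input window with a target N hours ahead can be sketched as follows. The function name `make_windows` is hypothetical, and the exact indexing is an assumption: the window is taken to be the d consecutive hourly rows ending at index t, and the target is the PM2.5 value at index t + n.

```python
from typing import List, Sequence, Tuple

Row = List[float]


def make_windows(
    rows: Sequence[Sequence[float]], pm25: Sequence[float], d: int, n: int
) -> List[Tuple[List[Row], float]]:
    """Build (input window, target) pairs from hourly data.

    rows[k] holds the features of hour k and pm25[k] the PM2.5 value of hour k.
    Each sample uses hours t-d+1 .. t as input and hour t+n as target.
    """
    if d < 1 or n < 1:
        raise ValueError("d and n must be at least 1")
    if len(rows) != len(pm25):
        raise ValueError("rows and pm25 must have the same length")
    samples = []
    for t in range(d - 1, len(rows) - n):
        window = [list(r) for r in rows[t - d + 1 : t + 1]]
        samples.append((window, pm25[t + n]))
    return samples
```

With five hours of data, d = 2 and n = 1, this yields three samples; larger d gives the model more history per sample at the cost of fewer samples.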

A. The BiLSTM first selectively forgets part of the past PM2.5 and other meteorological-variable information (σ here denotes the sigmoid function),

f_t = σ(U_f h_{t-1} + W_f x_t + b_f) (9)

B. It then decides which new information to store in the cell state. This information comes from two parts: the sigmoid layer of the "input gate" determines which information is updated, and a tanh layer creates a new vector of candidate values,

i_t = σ(U_i h_{t-1} + W_i x_t + b_i) (10)

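C. The old cell state is updated (standard LSTM form, with c denoting the cell state),

c_t = f_t ⊙ c_{t-1} + i_t ⊙ tanh(U_c h_{t-1} + W_c x_t + b_c) (11)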

D. Finally, the output information, that is, the predicted PM2.5 concentration, is determined,

o_t = σ(U_o h_{t-1} + W_o x_t + b_o) (12)
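One time step of the gate equations, and the combination of the two directions, can be sketched as below. This is an illustration, not the patented implementation. The forget, input, and output gates follow formulas (9), (10), and (12) with σ as the sigmoid function; the candidate-value parameters (key `"c"`), the cell-state update, and the hidden output h_t = o_t ⊙ tanh(c_t) follow the standard LSTM formulation and are assumptions, as are all function names and the choice to concatenate the final hidden states of the two directions.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (U, W, b) for one gate


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def gate(U: Matrix, W: Matrix, b: Vector, h_prev: Vector, x: Vector,
         act: Callable[[float], float]) -> Vector:
    """Compute act(U h_{t-1} + W x_t + b) element by element."""
    out = []
    for u_row, w_row, b_k in zip(U, W, b):
        z = sum(u * h for u, h in zip(u_row, h_prev))
        z += sum(w * xi for w, xi in zip(w_row, x)) + b_k
        out.append(act(z))
    return out


def lstm_step(params: Dict[str, GateParams], h_prev: Vector, c_prev: Vector,
              x: Vector) -> Tuple[Vector, Vector]:
    f = gate(*params["f"], h_prev, x, sigmoid)    # forget gate, formula (9)
    i = gate(*params["i"], h_prev, x, sigmoid)    # input gate, formula (10)
    g = gate(*params["c"], h_prev, x, math.tanh)  # candidate values (assumed form)
    o = gate(*params["o"], h_prev, x, sigmoid)    # output gate, formula (12)
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]  # standard LSTM output
    return h, c


def run_lstm(params: Dict[str, GateParams], xs: Sequence[Vector],
             hidden_size: int) -> Vector:
    """Run one direction over the sequence and return the final hidden state."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for x in xs:
        h, c = lstm_step(params, h, c, x)
    return h


def bilstm(fwd: Dict[str, GateParams], bwd: Dict[str, GateParams],
           xs: Sequence[Vector], hidden_size: int) -> Vector:
    """Concatenate the final states of the forward and reversed passes."""
    return run_lstm(fwd, xs, hidden_size) + run_lstm(bwd, list(reversed(xs)), hidden_size)
```

The backward direction uses its own parameter set and simply reads the window in reverse order, which is what lets the combined representation draw on context from both ends of the input window.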

The predicted value output by the BiLSTM is passed through softmax to give the final result. The Adam optimization algorithm is used in the entire model, and the network weights are updated by iterating over the training data. To counter the overfitting that easily arises when training neural networks, the present invention applies dropout, and training continues until model performance meets expectations. Once training is finished, all connection weights and parameters are fixed.

Step S24: verify the generalization ability of the trained prediction model using the validation set and test set data, specifically comprising:

Step S241: input the test set data into the model to obtain the prediction results;

Step S242: calculate the correlation coefficient:

ρ(O, P) = Cov(O, P) / (√D(O) · √D(P))

Step S243: calculate the root-mean-square error:

RMSE = √((1/n) Σ_{i=1}^{n} (O_i − P_i)²)

If the error between the predicted values and the observed values is within the preset threshold and is better than the prediction of a conventional prediction method under the same conditions, the model meets expectations and can be used to predict the PM2.5 concentration of the city over a certain future period.

The performance of the trained prediction model is evaluated. Experimental analysis and comparison show that, under the same conditions, the model of the present invention produces more accurate results than other existing methods and can make full use of massive pollutant and meteorological data.

In summary, the prediction model constructed by the present invention, based on the fused AutoEncoder and BiLSTM neural network, builds on research into two existing deep neural networks and uses the characteristics and advantages of both to establish a model that can predict the PM2.5 concentration of a target city over a certain future period. The loss functions used also come from earlier research and have been shown to measure the accuracy of results well. The present invention therefore addresses the shortcomings of earlier pollutant concentration prediction methods, makes full use of existing research results, and proposes a prediction model that fuses two deep neural networks. The model uses the AutoEncoder as the bottom layer to extract important features from the input data; compressing the data improves training efficiency. The output serves as the input to the upper BiLSTM network, which extracts the time series features of the pollutant and fully accounts for its temporal correlation. Because the BiLSTM makes full use of both past and future information, more accurate predictions can be obtained, and the method therefore has substantial application prospects.

Claims

1. An air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network, characterized by comprising:

Step S1: based on environmental monitoring data on PM2.5 pollutant concentration and meteorological factors, taking PM2.5 as the target pollutant to be predicted and constructing a PM2.5 concentration prediction model for the target city;

Step S2: selecting training and test data from the environmental monitoring data, and completing the initialization and training of the prediction model;

Step S3: predicting the air PM2.5 concentration using the trained model.

2. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 1, characterized in that the model uses an AutoEncoder as the bottom layer and a BiLSTM as the upper layer, the bottom layer compresses the input data and extracts its features, and its result serves as the input to the upper layer, which extracts time series features and generates the final prediction.

3. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 2, characterized in that step S2 specifically comprises:

Step S21: normalizing the data used for modeling, where the data used for modeling comprises raw pollutant data and raw meteorological data;

Step S22: dividing the data set into a training set, a validation set, and a test set in the proportions 80%, 10%, and 10%;

Step S23: training the model using the training set data;

Step S24: verifying the generalization ability of the trained prediction model using the validation set and test set data.

4. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 3, characterized in that in step S21 normalization is performed using the min-max method:

NewVariable = (variable − min(variable)) / (max(variable) − min(variable))

where NewVariable is the normalized value, max(variable) is the maximum of the data, min(variable) is the minimum of the data, and variable is the original value.

5. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 3, characterized in that step S23 specifically comprises:

Step S231: converting the raw data into a two-dimensional matrix X;

Step S232: recombining the two-dimensional matrix to obtain the required combined data:

X' = X * θ(β₁, β₂, …, β_n)

where X' is the data obtained by recombining the two-dimensional matrix X, θ is a two-dimensional matrix composed of 0s and 1s, and each β_i is either a zero vector or a ones vector;

Step S233: compressing the combined data and extracting its features, then reconstructing the combined data:

X'' = σ'(W'Y + b')

Y = σ(WX' + b)

where σ is the ReLU activation function, W is the weight matrix, and b is the bias vector of the encoding stage; X''(i, j) is the reconstructed combined data, σ' is the ReLU activation function, W' is the weight matrix of the decoding stage, Y is the encoded result, and b' is the bias vector of the decoding stage;

Step S234: training the AutoEncoder of the model using the recombined data until the error is less than a set threshold, then executing step S235;

Step S235: training the entire model.

6. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 5, characterized in that the reconstruction error is:

ε(X', X'') = ||X' − σ'(W'(σ(WX' + b)) + b')||²

where ε(X', X'') is the reconstruction error and ||·|| is the norm.

7. The air PM2.5 concentration prediction method based on a fused AutoEncoder and BiLSTM neural network according to claim 3, characterized in that step S24 specifically comprises:

Step S241: inputting the test set data into the model to obtain the prediction results;

Step S242: calculating the correlation coefficient:

ρ(O, P) = Cov(O, P) / (√D(O) · √D(P))

where ρ(O, P) is the correlation coefficient, Cov(O, P) is the covariance of the observed and predicted values, D(O) is the variance of the observed values, and D(P) is the variance of the predicted values;

Step S243: calculating the root-mean-square error:

RMSE = √((1/n) Σ_{i=1}^{n} (O_i − P_i)²)

where RMSE is the root-mean-square error, O_i is the observed value, P_i is the value predicted by the model, i is the time index, and n is the total duration of the prediction.