CN105184012B

CN105184012B - Regional air PM2.5 concentration prediction method

Info

Publication number: CN105184012B
Application number: CN201510626776.9A
Authority: CN
Inventors: 史旭华; 俞杰; 童楚东; 傅晓钦; 汪伟峰; 蓝艇; 杨忠; 李海琴; 陈煜琛
Original assignee: ENVIRONMENT MONITORING CENTER OF NINGBO; Ningbo University
Current assignee: ENVIRONMENT MONITORING CENTER OF NINGBO; Ningbo University
Priority date: 2015-09-28
Filing date: 2015-09-28
Publication date: 2017-12-22
Anticipated expiration: 2035-09-28
Also published as: CN105184012A

Abstract

The invention discloses a regional air PM2.5 concentration prediction method. The method first constructs training sample data for a support vector machine regression model from historical data, then trains the model on that data and uses the trained support vector machine regression model as the PM2.5 concentration prediction model. A particle swarm optimization algorithm is then combined with the PM2.5 concentration prediction model: during the iterative search, the positions of the particles are repeatedly used to reconstruct the input parameters of the prediction model until the iterations finish and the final global extremum of the particle swarm is obtained. The position of the particle corresponding to the final global extremum is used to reconstruct the input parameters, and the output obtained by feeding these input parameters into the PM2.5 concentration prediction model is the predicted PM2.5 concentration. The advantage is that the dimension of the input parameters of the PM2.5 concentration prediction model can be reduced and the accuracy of PM2.5 concentration prediction improved.

Description

Regional air PM2.5 concentration prediction method

Technical Field

The invention relates to a PM2.5 concentration prediction method, in particular to a regional air PM2.5 concentration prediction method.

Background

With the rapid advance of industrialization, atmospheric haze has become increasingly serious. PM2.5 is one of the main culprits of haze: its particles are small, can remain suspended in the air for a long time and spread, and can carry toxic and harmful substances into the respiratory tract and lungs, posing a direct threat to human health. Predicting the PM2.5 concentration from historical atmospheric monitoring data is therefore of great significance. However, the factors influencing the PM2.5 concentration are complex and varied, for example real-time industrial emissions, atmospheric pollutant particle concentrations, meteorological conditions, season, and solar radiation, which makes concentration prediction challenging.

The earliest methods applied to air pollutant concentration prediction were multiple regression models. With the development of artificial neural networks, many scholars studied neural network models for predicting air pollutant concentrations, and the reported results were superior to those of multiple regression models. However, neural networks commonly overfit because training data on atmospheric pollutant concentrations are insufficient. The support vector machine (SVM) is a learning machine proposed in the 1990s; because it generalizes well on small-sample and nonlinear problems, it has been widely applied in classification, time series prediction, function estimation, and other fields. Sanchez et al., Xue et al., and Nieto et al. proposed methods for predicting air pollutant concentrations using a support vector regression (SVR) model f(X) = <ω, X> + b, where ω and b are parameters to be identified, X is the input of the model, and < , > denotes the dot product operation. Whether an SVR model or an artificial neural network model is used, predicting pollutant concentrations requires the time-series characteristics of the variables to be taken into account, so time-lag factors are included in the model input parameters. As a result, the dimension of the input parameters is high, the training sample data are insufficient, and the prediction accuracy of the model is low.

To address this, researchers have proposed a genetic algorithm-neural network (GA-ANN) model, which uses an artificial neural network (ANN) model to model the atmospheric pollutant concentration and a genetic algorithm (GA) to make a 0 or 1 selection of each input parameter according to a prediction performance index. Although this reduces the dimension of the input parameters and avoids overfitting to some extent, it also discards useful information carried by some input parameters, so the prediction accuracy remains low.

Disclosure of Invention

The invention aims to provide a regional air PM2.5 concentration prediction method which can improve PM2.5 concentration prediction accuracy on the basis of reducing input parameter dimension of a prediction model.

The technical scheme adopted by the invention for solving the technical problems is as follows: a method for predicting PM2.5 concentration of regional air comprises the following steps:

(1) constructing training sample data of a regression model of a support vector machine to be trained:

(1)-1 Denote the current time as time t; time t-n denotes the n-th time step before the current time, n = 1, 2, 3, 4, …;

the atmospheric visibility of the regional air measured at time t-1 is recorded as x_N(t-1), the haze as x_Z(t-1), the air temperature as x_T(t-1), the air pressure as x_p(t-1), and the wind speed as x_w(t-1);

the SO2 concentration in the regional air measured at time t-t_d is recorded as x_so2(t-t_d), the NO2 concentration as x_no2(t-t_d), the PM10 concentration as x_pm10(t-t_d), the PM2.5 concentration as x_pm25(t-t_d), the O3 concentration as x_o3(t-t_d), and the CO concentration as x_co(t-t_d); t_d is the time-lag factor, t_d = 1, 2, …, d, where d is an integer greater than or equal to 2; the PM2.5 concentration predicted for time t is recorded as y_pm25(t);

(1)-2 The input parameter of the support vector machine regression model to be trained, f(X) = <ω, X> + b, is recorded as X_in(T), and the output of the model is recorded as Y_out(T); X_in(T) and Y_out(T) form a training parameter pair; let

Y_out(T) = x_pm25(T+1) (2)

where T = t-2, t-3, t-4, …, t-501; [ ] is the matrix symbol; X_in(T) is a 1 × (6 × d + 5) matrix; and × is the multiplication sign;

(1)-3 Acquire historical measurement data of the SO2 concentration, NO2 concentration, PM10 concentration, PM2.5 concentration, O3 concentration, CO concentration, atmospheric visibility, haze, air temperature, air pressure, and wind speed of the regional air before time t to obtain the 500 training parameter pairs {X_in(t-2), Y_out(t-2)}, {X_in(t-3), Y_out(t-3)}, …, {X_in(t-501), Y_out(t-501)}, and use these 500 pairs as the training sample data of the support vector machine regression model to be trained;

(2) Train the support vector machine regression model f(X) = <ω, X> + b on the training sample data to obtain the model parameters ω and b; substituting ω and b into f(X) = <ω, X> + b gives the trained support vector machine regression model, which is used as the PM2.5 concentration prediction model;

(3) Initialize the particle swarm parameters: randomly generate, within the range [0, 1], a particle swarm of population size N, N = 100, in which each particle has a position attribute and a velocity attribute; randomly initialize the velocity and position of each particle; record the current position of the i-th particle after initialization as W_i(0) and its current velocity as V_i(0), i = 1, 2, …, N, where W_i(0) and V_i(0) are both 1 × m matrices, W_i(0) = [w_1i(0), w_2i(0), …, w_mi(0)] with w_1i(0), w_2i(0), …, w_mi(0) ∈ [0, 1], V_i(0) = [v_1i(0), v_2i(0), …, v_mi(0)] with v_1i(0), v_2i(0), …, v_mi(0) ∈ [0, 1], and m = 6 × d + 5; record the current position of the i-th particle after the k-th iteration as W_i(k) = [w_1i(k), w_2i(k), …, w_mi(k)] and its current velocity after the k-th iteration as V_i(k) = [v_1i(k), v_2i(k), …, v_mi(k)], k = 1, 2, …, k_max, where k_max is the total number of iterations of the particle swarm;

(4) Define the current position of each particle in the swarm as a weight coefficient matrix for the input parameters of the PM2.5 concentration prediction model; use the current positions of the N particles to reconstruct the input parameter X_in(t-1) of the PM2.5 concentration prediction model, multiplying each element of X_in(t-1) by the corresponding element of W_i(0), to obtain N reconstructed input parameters in the initial state, one for each particle i;

(5) Record the initial value of the individual extremum of the i-th particle in the swarm as F_i(0) = |x_pm25(t-1) - y_outi(t, 0)|, where | | is the absolute value symbol and y_outi(t, 0) is the output obtained by feeding the i-th reconstructed input parameter in the initial state into the PM2.5 concentration prediction model; record the initial value of the global extremum of the particle swarm as the minimum of the N initial individual extrema;

(6) After each iteration of the particle swarm, use the updated positions to update the reconstructed input parameters of the PM2.5 concentration prediction model: the positions after the k-th iteration give the k-th update, yielding the i-th reconstructed input parameter after the k-th update. Feed this input parameter into the PM2.5 concentration prediction model and record the model output as y_outi(t, k); substitute y_outi(t, k) into F_i(k) = |x_pm25(t-1) - y_outi(t, k)| to calculate the fitness value F_i(k) of the i-th particle after the k-th update, where | | is the absolute value symbol. The individual extremum of the i-th particle after the k-th iteration is the smallest fitness value that particle has attained so far, and the corresponding particle position is recorded as W_i(a), where a is an integer from 0 to k inclusive. The global extremum of the particle swarm after the k-th iteration is the minimum of the individual extrema over all particles (min denotes taking the minimum value), and the corresponding particle position is recorded as W_h(g), where h is an integer from 1 to N and g is an integer from 0 to k;

(7) Starting from k = 1, iteratively update the particle swarm to obtain the updated velocity and position of each particle, and update the individual extremum of each particle and the global extremum of the swarm according to step (6), until k = k_max, at which point the iteration ends and the final global extremum of the particle swarm is obtained;

(8) Use the particle position corresponding to the final global extremum of the particle swarm to update the reconstructed input parameters of the PM2.5 concentration prediction model, and feed the updated input parameters into the model; the output of the PM2.5 concentration prediction model is the predicted PM2.5 concentration.
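Step (1) can be illustrated with a minimal Python sketch of how the training pairs {X_in(T), Y_out(T)} might be assembled from hourly series. The series names, the ordering of features inside X_in(T) (d lagged readings per pollutant followed by the five meteorological readings), and the helper names `build_input` and `build_training_pairs` are assumptions made for illustration; the text fixes only the vector length 6 × d + 5 and the target Y_out(T) = x_pm25(T+1).

```python
from typing import Dict, List, Tuple

# Hypothetical series names; the feature ordering below is an assumption.
POLLUTANTS = ["so2", "no2", "pm10", "pm25", "o3", "co"]
WEATHER = ["visibility", "haze", "temperature", "pressure", "wind_speed"]

Series = Dict[str, List[float]]


def build_input(series: Series, T: int, d: int) -> List[float]:
    """Assemble X_in(T): d lagged readings per pollutant, then 5 weather readings."""
    features: List[float] = []
    for name in POLLUTANTS:
        # Readings at T, T-1, ..., T-d+1, so X_in(t-1) covers t-1, ..., t-d.
        features.extend(series[name][T - lag] for lag in range(d))
    features.extend(series[name][T] for name in WEATHER)
    return features


def build_training_pairs(
    series: Series, t: int, d: int, n_pairs: int = 500
) -> List[Tuple[List[float], float]]:
    """Return pairs (X_in(T), Y_out(T)) for T = t-2, t-3, ..., t-(n_pairs+1)."""
    oldest_index = t - (n_pairs + 1) - (d - 1)
    if oldest_index < 0:
        raise ValueError("not enough history before time t")
    pairs = []
    for T in range(t - 2, t - 2 - n_pairs, -1):
        # Equation (2): the target is the PM2.5 reading one step after T.
        pairs.append((build_input(series, T, d), series["pm25"][T + 1]))
    return pairs
```

With d = 5 each input vector has 6 × 5 + 5 = 35 entries, and the default `n_pairs=500` reproduces the range T = t-2, …, t-501 given in step (1)-3.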

The update iteration of the particle swarm in step (7) proceeds as follows: k is the current iteration number of the particle swarm; the current position of the i-th particle after the k-th iteration is W_i(k) = [w_1i(k), w_2i(k), …, w_mi(k)] and its current velocity after the k-th iteration is V_i(k) = [v_1i(k), v_2i(k), …, v_mi(k)]; each particle updates its velocity and position according to equations (4) and (5):

W_i(k+1) = W_i(k) + V_i(k) (5)

where the inertia weight is taken as 0.6; i = 1, 2, …, N; k is the current iteration number, with maximum value k_max = 1000; V_i(k) is the velocity of the particle; c_1 and c_2 are non-negative constants called acceleration factors, taken as c_1 = c_2 = 2; r_1 and r_2 are random numbers distributed in [0, 1]; V_i(k) ∈ [-V_max, V_max] and W_i(k) ∈ [-X_max, X_max]; v_min and v_max denote the minimum and maximum velocity of the i-th particle; X_max is taken as 10 and V_max as 5.
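A single update step can be sketched in Python as follows. The position update follows equation (5) as printed. The velocity update is an assumption: it uses the canonical particle swarm form built from the inertia weight, c_1, c_2, r_1, r_2, the particle's best position W_i(a), and the swarm's best position W_h(g), which is what equation (4) is presumed to express. Drawing r_1 and r_2 separately for every dimension, and the function name `pso_step`, are also illustrative choices.

```python
import random
from typing import List, Optional, Sequence, Tuple


def clamp(value: float, limit: float) -> float:
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def pso_step(
    position: Sequence[float],       # W_i(k)
    velocity: Sequence[float],       # V_i(k)
    personal_best: Sequence[float],  # W_i(a), best position of this particle
    global_best: Sequence[float],    # W_h(g), best position of the swarm
    inertia: float = 0.6,
    c1: float = 2.0,
    c2: float = 2.0,
    v_max: float = 5.0,
    x_max: float = 10.0,
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Return (W_i(k+1), V_i(k+1)) for one particle."""
    if rng is None:
        rng = random.Random()
    new_position, new_velocity = [], []
    for w, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # Equation (5) as printed: the position moves by the current velocity V_i(k).
        new_position.append(clamp(w + v, x_max))
        # Assumed canonical form of equation (4).
        candidate = inertia * v + c1 * r1 * (p - w) + c2 * r2 * (g - w)
        new_velocity.append(clamp(candidate, v_max))
    return new_position, new_velocity
```

Clamping to [-V_max, V_max] and [-X_max, X_max] enforces the bounds stated above; with the default values a velocity of 100 would be cut back to 5.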

Compared with the prior art, the invention has the following advantages. Training sample data for the support vector machine regression model are first constructed from historical data, the model is then trained on these data, and the trained support vector machine regression model is used as the PM2.5 concentration prediction model. The particle swarm optimization algorithm is then combined with the PM2.5 concentration prediction model: during the iterative search, the positions of the particles are repeatedly used to reconstruct the input parameters of the prediction model until the iterations finish and the final global extremum of the particle swarm is obtained; the position of the particle corresponding to the final global extremum is used to reconstruct the input parameters, and feeding these into the PM2.5 concentration prediction model gives the predicted PM2.5 concentration. By using the support vector machine regression model as the prediction model and the particle swarm optimization algorithm to assign unequal weights between 0 and 1 to the model's input variables, the method can reduce the dimension of the input parameters of the PM2.5 concentration prediction model, makes it unnecessary, to a certain extent, to consider the influence of the number of time-lag factors, and can improve the accuracy of PM2.5 concentration prediction.
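The weighting idea can be made concrete with a short Python sketch: a particle position is applied element by element to the input vector, the reweighted vector is fed to the trained model, and the fitness is the absolute difference from the last measured PM2.5 value x_pm25(t-1). The linear stand-in model, its coefficients, and the function names are hypothetical and serve only to make the example runnable; in the method itself the model is the trained support vector machine regression model.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def reconstruct_input(x_in: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Multiply each input element by the matching particle coordinate."""
    if len(x_in) != len(weights):
        raise ValueError("x_in and weights must have the same length")
    return [x * w for x, w in zip(x_in, weights)]


def fitness(
    weights: Sequence[float],
    x_in: Sequence[float],
    reference_pm25: float,
    model: Model,
) -> float:
    """F_i = |x_pm25(t-1) - y_outi| for one particle position."""
    return abs(reference_pm25 - model(reconstruct_input(x_in, weights)))


def make_linear_model(omega: Sequence[float], b: float) -> Model:
    """Stand-in for the trained model f(X) = <omega, X> + b."""
    return lambda x: sum(o * xi for o, xi in zip(omega, x)) + b


# Illustrative numbers only: a weight of 0 removes an input entirely,
# while a weight of 0.5 keeps half of its contribution.
model = make_linear_model([1.0, 2.0, 0.5], b=1.0)
score = fitness([1.0, 0.5, 0.0], [10.0, 20.0, 30.0], reference_pm25=30.0, model=model)
```

Unlike a 0 or 1 selection, an intermediate weight keeps part of the information carried by an input, which is the contrast the text draws with the GA-ANN approach.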

Drawings

Fig. 1 is a distribution diagram of the measured and predicted PM2.5 concentration values when d = 2 and the existing support vector machine regression model is used for prediction;

Fig. 2 is a distribution diagram of the measured and predicted PM2.5 concentration values when d = 2 and the prediction method of the present invention is used;

Fig. 3 is a distribution diagram of the measured and predicted PM2.5 concentration values when d = 5 and the existing support vector machine regression model is used for prediction;

Fig. 4 is a distribution diagram of the measured and predicted PM2.5 concentration values when d = 5 and the prediction method of the present invention is used.

Detailed Description

The invention is described in further detail below with reference to the accompanying examples.

The first embodiment is as follows: a regional air PM2.5 concentration prediction method comprises the following steps:

the atmospheric visibility of the regional air measured at time t-1 is recorded as x_N(t-1), the haze as x_Z(t-1), the air temperature as x_T(t-1), the air pressure as x_p(t-1), and the wind speed as x_w(t-1);

the SO2 concentration in the regional air measured at time t-t_d is recorded as x_so2(t-t_d), the NO2 concentration as x_no2(t-t_d), the PM10 concentration as x_pm10(t-t_d), the PM2.5 concentration as x_pm25(t-t_d), the O3 concentration as x_o3(t-t_d), and the CO concentration as x_co(t-t_d); t_d is the time-lag factor, t_d = 1, 2, …, d, where d equals 5; the PM2.5 concentration predicted for time t is recorded as y_pm25(t);

(1)-2 The input parameter of the support vector machine regression model to be trained, f(X) = <ω, X> + b, is recorded as X_in(T), and the output of the model is recorded as Y_out(T); X_in(T) and Y_out(T) form a training parameter pair; let

Y_out(T) = x_pm25(T+1) (2)

where T = t-2, t-3, t-4, …, t-501; [ ] is the matrix symbol; X_in(T) is a 1 × (6 × d + 5) matrix; and × is the multiplication sign;

(1)-3 Acquire historical measurement data of the SO2 concentration, NO2 concentration, PM10 concentration, PM2.5 concentration, O3 concentration, CO concentration, atmospheric visibility, haze, air temperature, air pressure, and wind speed of the regional air before time t to obtain the 500 training parameter pairs {X_in(t-2), Y_out(t-2)}, {X_in(t-3), Y_out(t-3)}, …, {X_in(t-501), Y_out(t-501)}, and use these 500 pairs as the training sample data of the support vector machine regression model to be trained;

(2) Train the support vector machine regression model f(X) = <ω, X> + b on the training sample data to obtain the model parameters ω and b; substituting ω and b into f(X) = <ω, X> + b gives the trained support vector machine regression model, which is used as the PM2.5 concentration prediction model;

(3) Initialize the particle swarm parameters: randomly generate, within the range [0, 1], a particle swarm of population size N, N = 100, in which each particle has a position attribute and a velocity attribute; randomly initialize the velocity and position of each particle; record the current position of the i-th particle after initialization as W_i(0) and its current velocity as V_i(0), i = 1, 2, …, N, where W_i(0) and V_i(0) are both 1 × m matrices, W_i(0) = [w_1i(0), w_2i(0), …, w_mi(0)] with w_1i(0), w_2i(0), …, w_mi(0) ∈ [0, 1], V_i(0) = [v_1i(0), v_2i(0), …, v_mi(0)] with v_1i(0), v_2i(0), …, v_mi(0) ∈ [0, 1], and m = 6 × d + 5; record the current position of the i-th particle after the k-th iteration as W_i(k) = [w_1i(k), w_2i(k), …, w_mi(k)] and its current velocity after the k-th iteration as V_i(k) = [v_1i(k), v_2i(k), …, v_mi(k)], k = 1, 2, …, k_max, where k_max is the total number of iterations of the particle swarm;

(5) Record the initial value of the individual extremum of the i-th particle in the swarm as F_i(0) = |x_pm25(t-1) - y_outi(t, 0)|, where | | is the absolute value symbol and y_outi(t, 0) is the output obtained by feeding the i-th reconstructed input parameter in the initial state into the PM2.5 concentration prediction model; record the initial value of the global extremum of the particle swarm as the minimum of the N initial individual extrema;

(6) After each iteration of the particle swarm, use the updated positions to update the reconstructed input parameters of the PM2.5 concentration prediction model: the positions after the k-th iteration give the k-th update, yielding the i-th reconstructed input parameter after the k-th update. Feed this input parameter into the PM2.5 concentration prediction model and record the model output as y_outi(t, k); substitute y_outi(t, k) into F_i(k) = |x_pm25(t-1) - y_outi(t, k)| to calculate the fitness value F_i(k) of the i-th particle after the k-th update, where | | is the absolute value symbol. The individual extremum of the i-th particle after the k-th iteration is the smallest fitness value that particle has attained so far, and the corresponding particle position is recorded as W_i(a), where a is an integer from 0 to k inclusive. The global extremum of the particle swarm after the k-th iteration is the minimum of the individual extrema over all particles (min denotes taking the minimum value), and the corresponding particle position is recorded as W_h(g), where h is an integer from 1 to N and g is an integer from 0 to k;

(8) Use the particle position corresponding to the final global extremum of the particle swarm to update the reconstructed input parameters of the PM2.5 concentration prediction model, and feed the updated input parameters into the model; the output of the PM2.5 concentration prediction model is the predicted PM2.5 concentration.
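The steps of this embodiment can be put together in one compact Python sketch. It is written under several assumptions: the model is passed in as any callable (a trained support vector regression model in practice, a hypothetical linear function in the usage below); the velocity update uses the canonical particle swarm form, since equation (4) is not reproduced in the text; random factors are drawn per dimension; and the function name `predict_with_pso` and the seed argument are illustrative.

```python
import random
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def predict_with_pso(
    x_in: Sequence[float],   # X_in(t-1), length m = 6*d + 5
    reference_pm25: float,   # x_pm25(t-1), used in the fitness F_i(k)
    model: Model,
    n_particles: int = 100,
    k_max: int = 300,
    inertia: float = 0.6,
    c1: float = 2.0,
    c2: float = 2.0,
    v_max: float = 5.0,
    x_max: float = 10.0,
    seed: int = 0,
) -> Tuple[List[float], float, float]:
    """Return (best weights, final global extremum, predicted PM2.5)."""
    rng = random.Random(seed)
    m = len(x_in)

    def clamp(value: float, limit: float) -> float:
        return max(-limit, min(limit, value))

    def fitness(weights: Sequence[float]) -> float:
        reweighted = [x * w for x, w in zip(x_in, weights)]
        return abs(reference_pm25 - model(reweighted))

    # Step (3): positions and velocities start in [0, 1].
    positions = [[rng.random() for _ in range(m)] for _ in range(n_particles)]
    velocities = [[rng.random() for _ in range(m)] for _ in range(n_particles)]

    # Steps (4)-(5): initial individual extrema and global extremum.
    best_positions = [p[:] for p in positions]
    best_fitness = [fitness(p) for p in positions]
    h = min(range(n_particles), key=lambda i: best_fitness[i])
    global_position, global_fitness = best_positions[h][:], best_fitness[h]

    # Steps (6)-(7): iterate, then refresh individual and global extrema.
    for _ in range(k_max):
        for i in range(n_particles):
            for j in range(m):
                w, v = positions[i][j], velocities[i][j]
                r1, r2 = rng.random(), rng.random()
                positions[i][j] = clamp(w + v, x_max)  # equation (5)
                velocities[i][j] = clamp(              # assumed equation (4)
                    inertia * v
                    + c1 * r1 * (best_positions[i][j] - w)
                    + c2 * r2 * (global_position[j] - w),
                    v_max,
                )
            f = fitness(positions[i])
            if f < best_fitness[i]:
                best_fitness[i], best_positions[i] = f, positions[i][:]
        h = min(range(n_particles), key=lambda i: best_fitness[i])
        if best_fitness[h] < global_fitness:
            global_position, global_fitness = best_positions[h][:], best_fitness[h]

    # Step (8): feed the input reweighted by the best position into the model.
    prediction = model([x * w for x, w in zip(x_in, global_position)])
    return global_position, global_fitness, prediction
```

A usage example with a made-up three-element input and linear model: `predict_with_pso([10.0, 20.0, 30.0], 12.0, lambda x: 0.4 * x[0] + 0.3 * x[1] + 0.2 * x[2] + 1.0, n_particles=10, k_max=20)`. The returned global extremum always equals the absolute difference between the reference value and the returned prediction.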

The second embodiment: a regional air PM2.5 concentration prediction method comprises the following steps:

the SO2 concentration in the regional air measured at time t-t_d is recorded as x_so2(t-t_d), the NO2 concentration as x_no2(t-t_d), the PM10 concentration as x_pm10(t-t_d), the PM2.5 concentration as x_pm25(t-t_d), the O3 concentration as x_o3(t-t_d), and the CO concentration as x_co(t-t_d); t_d is the time-lag factor, t_d = 1, 2, …, d, where d equals 5; the PM2.5 concentration predicted for time t is recorded as y_pm25(t);

Y_out(T) = x_pm25(T+1) (2)

where T = t-2, t-3, t-4, …, t-501; [ ] is the matrix symbol; X_in(T) is a 1 × (6 × d + 5) matrix; and × is the multiplication sign;

(1)-3 Acquire historical measurement data of the SO2 concentration, NO2 concentration, PM10 concentration, PM2.5 concentration, O3 concentration, CO concentration, atmospheric visibility, haze, air temperature, air pressure, and wind speed of the regional air before time t to obtain the 500 training parameter pairs {X_in(t-2), Y_out(t-2)}, {X_in(t-3), Y_out(t-3)}, …, {X_in(t-501), Y_out(t-501)}, and use these 500 pairs as the training sample data of the support vector machine regression model to be trained;

(5) Record the initial value of the individual extremum of the i-th particle in the swarm as F_i(0) = |x_pm25(t-1) - y_outi(t, 0)|, where | | is the absolute value symbol and y_outi(t, 0) is the output obtained by feeding the i-th reconstructed input parameter in the initial state into the PM2.5 concentration prediction model; record the initial value of the global extremum of the particle swarm as the minimum of the N initial individual extrema;

(6) After each iteration of the particle swarm, use the updated positions to update the reconstructed input parameters of the PM2.5 concentration prediction model: the positions after the k-th iteration give the k-th update, yielding the i-th reconstructed input parameter after the k-th update. Feed this input parameter into the PM2.5 concentration prediction model and record the model output as y_outi(t, k); substitute y_outi(t, k) into F_i(k) = |x_pm25(t-1) - y_outi(t, k)| to calculate the fitness value F_i(k) of the i-th particle after the k-th update, where | | is the absolute value symbol. The individual extremum of the i-th particle after the k-th iteration is the smallest fitness value that particle has attained so far, and the corresponding particle position is recorded as W_i(a), where a is an integer from 0 to k inclusive. The global extremum of the particle swarm after the k-th iteration is the minimum of the individual extrema over all particles (min denotes taking the minimum value), and the corresponding particle position is recorded as W_h(g), where h is an integer from 1 to N and g is an integer from 0 to k;

(8) Use the particle position corresponding to the final global extremum of the particle swarm to update the reconstructed input parameters of the PM2.5 concentration prediction model, and feed the updated input parameters into the model; the output of the PM2.5 concentration prediction model is the predicted PM2.5 concentration.

In this embodiment, the update iteration of the particle swarm in step (7) proceeds as follows: k is the current iteration number of the particle swarm; the current position of the i-th particle after the k-th iteration is W_i(k) = [w_1i(k), w_2i(k), …, w_mi(k)] and its current velocity after the k-th iteration is V_i(k) = [v_1i(k), v_2i(k), …, v_mi(k)]; each particle updates its velocity and position according to equations (4) and (5):

W_i(k+1) = W_i(k) + V_i(k) (5)

where the inertia weight is taken as 0.6; i = 1, 2, …, N; k is the current iteration number, with maximum value k_max = 1000; V_i(k) is the velocity of the particle; c_1 and c_2 are non-negative constants called acceleration factors, taken as c_1 = c_2 = 2; r_1 and r_2 are random numbers distributed in [0, 1]; V_i(k) ∈ [-V_max, V_max] and W_i(k) ∈ [-X_max, X_max]; v_min and v_max denote the minimum and maximum velocity of the i-th particle; X_max is taken as 10 and V_max as 5.

To verify the accuracy of the method, regional air monitoring data from April to May 2013 from the Ningbo Environment Monitoring Center were used, with time t measured in hours. The choice of parameters in the particle swarm optimization algorithm depends on the type of problem, and in theory no set of parameter values suits all problems; here the acceleration factors are c_1 = 1.5 and c_2 = 1.1, the random numbers r_1 and r_2 are set to 0.6 and 0.8 respectively, and the number of iterations is k_max = 300.

When d = 2, the PM2.5 concentration was predicted using the existing support vector machine regression model and the prediction method of the present invention, respectively. Fig. 1 is a distribution diagram of the measured and predicted PM2.5 concentration values for the existing support vector machine regression model, and Fig. 2 is the corresponding diagram for the prediction method of the present invention. In Figs. 1 and 2, the abscissa is the measured PM2.5 value, the ordinate is the predicted value, the dashed line is the reference line on which the measured value equals the predicted value, and the scatter of the star-shaped points reflects the degree of deviation between the model's predicted values and the measured values. To describe this deviation accurately, the sum of the absolute values of the residuals is defined, where n_2 denotes the number of predicted values, y_i is the i-th measured value, and y_outi is the i-th predicted value. The sum of the absolute values of the residuals is 1.7887 for the existing support vector machine regression model and 1.5994 for the prediction method of the present invention. As Figs. 1 and 2 show, the deviation between predicted and measured values obtained with the method of the present invention is significantly smaller than that obtained with the existing support vector machine regression model.
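The defining formula for this deviation measure is not reproduced in the text. As a rough illustration only, the Python sketch below assumes the plain form, the sum over i = 1, …, n_2 of |y_i - y_outi|; the original definition may include a normalization that this sketch omits, so its output should not be expected to reproduce the reported values. The function name is hypothetical.

```python
from typing import Sequence


def sum_abs_residuals(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of |y_i - y_outi| over all predictions (assumed, unnormalized form)."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return sum(abs(y - y_out) for y, y_out in zip(measured, predicted))


# Illustrative numbers only, not data from the verification experiment.
example = sum_abs_residuals([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

A smaller value means the points in the distribution diagram lie closer to the reference line on which measured and predicted values are equal.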

When d = 5, the PM2.5 concentration was predicted using the existing support vector machine regression model and the prediction method of the present invention, respectively. Fig. 3 is a distribution diagram of the measured and predicted PM2.5 concentration values for the existing support vector machine regression model, and Fig. 4 is the corresponding diagram for the prediction method of the present invention. In Figs. 3 and 4, the abscissa is the measured PM2.5 value, the ordinate is the predicted value, the dashed line is the reference line on which the measured value equals the predicted value, and the scatter of the star-shaped points reflects the degree of deviation between the model's predicted values and the measured values. The sum of the absolute values of the residuals is 2.1140 for the existing support vector machine regression model and 1.7035 for the prediction method of the present invention. After particle swarm optimization with the method of the present invention, the dimension of the input parameters of the support vector machine regression model is reduced from 50 to 27.

As the results for d = 2 and d = 5 show, compared with the existing method of predicting with a support vector machine regression model, the prediction method of the present invention reduces the dimension of the input parameters, and the deviation between the predicted and measured values is significantly smaller. The relative errors of the prediction method of the present invention and of the existing support vector machine regression method at d = 2 and d = 5 are shown in Table 1:

TABLE 1 relative error Rate (%)

Claims

1. A regional air PM2.5 concentration prediction method is characterized by comprising the following steps:

the atmospheric visibility of the regional air measured at time t-1 is recorded as x_N(t-1), the haze as x_Z(t-1), the air temperature as x_T(t-1), the air pressure as x_p(t-1), and the wind speed as x_w(t-1);

the SO2 concentration in the regional air measured at time t-t_d is recorded as x_so2(t-t_d), the NO2 concentration as x_no2(t-t_d), the PM10 concentration as x_pm10(t-t_d), the PM2.5 concentration as x_pm25(t-t_d), the O3 concentration as x_o3(t-t_d), and the CO concentration as x_co(t-t_d); t_d is the time-lag factor, t_d = 1, 2, …, d, where d is an integer greater than or equal to 2; the PM2.5 concentration predicted for time t is recorded as y_pm25(t);

(1)-2 the input parameter of the support vector machine regression model to be trained, $f(X) = \langle \omega, X \rangle + b$, is recorded as $X_{in}(T)$, and the output of the support vector machine regression model to be trained is recorded as $Y_{out}(T)$; $X_{in}(T)$ and $Y_{out}(T)$ form a training parameter pair; let

$$Y_{out}(T) = x_{pm25}(T+1) \tag{2}$$

(1)-3 acquiring historical measurement data of the SO2 concentration, NO2 concentration, PM10 concentration, PM2.5 concentration, O3 concentration, CO concentration, atmospheric visibility, haze, air temperature, air pressure and wind speed of the regional air before time $t$, to obtain $\{X_{in}(t-2), Y_{out}(t-2)\}$, $\{X_{in}(t-3), Y_{out}(t-3)\}$, …, $\{X_{in}(t-501), Y_{out}(t-501)\}$, 500 training parameter pairs in total, and taking these 500 training parameter pairs as the training sample data of the support vector machine regression model to be trained;
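A minimal sketch of how such training parameter pairs might be assembled is given below. The layout of the input vector is an assumption: it is taken to hold $d$ consecutive values of each of the six pollutants ending at time $T$, followed by the five meteorological values at time $T$, so that it has $6 \times d + 5$ components. The names `POLLUTANTS`, `WEATHER`, `build_input`, `build_training_pairs` and the `series` dictionary (lists indexed by time) are hypothetical.

```python
POLLUTANTS = ("so2", "no2", "pm10", "pm25", "o3", "co")
WEATHER = ("N", "Z", "T", "P", "w")  # visibility, haze, temperature, pressure, wind speed


def build_input(series, T, d):
    """Assumed layout of X_in(T): d lags of each pollutant, then weather at T."""
    x = []
    for name in POLLUTANTS:
        x.extend(series[name][T - lag] for lag in range(d))
    for name in WEATHER:
        x.append(series[name][T])
    return x


def build_training_pairs(series, t, d, n_pairs=500):
    """Return [(X_in(T), Y_out(T))] for T = t-2, t-3, ..., t-(n_pairs+1)."""
    oldest = t - n_pairs - d  # earliest time index that will be read
    if oldest < 0 or t - 1 >= len(series["pm25"]):
        raise ValueError("not enough history for the requested pairs")
    pairs = []
    for T in range(t - 2, t - 2 - n_pairs, -1):
        # Equation (2): the target is the PM2.5 concentration one step ahead.
        pairs.append((build_input(series, T, d), series["pm25"][T + 1]))
    return pairs
```

The resulting list could then be passed to any support vector regression implementation for training; which implementation is used is not specified here.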

(3) initializing the particle swarm parameters: randomly generating in $[0,1]$ a particle swarm with population size $N$, where $N = 100$; each particle has a position attribute and a velocity attribute; randomly initializing the velocity and position of each particle in the particle swarm, and recording the current position of the initialized $i$-th particle as $W_i(0)$ and its current velocity as $V_i(0)$, $i = 1, 2, \ldots, N$, where $W_i(0)$ and $V_i(0)$ are both $1 \times m$ matrices, $W_i(0) = [w_{1i}(0), w_{2i}(0), \ldots, w_{mi}(0)]$ with $w_{1i}(0), w_{2i}(0), \ldots, w_{mi}(0) \in [0,1]$, $V_i(0) = [v_{1i}(0), v_{2i}(0), \ldots, v_{mi}(0)]$ with $v_{1i}(0), v_{2i}(0), \ldots, v_{mi}(0) \in [0,1]$, and $m = 6 \times d + 5$; recording the current position of the $i$-th particle after the $k$-th iteration as $W_i(k) = [w_{1i}(k), w_{2i}(k), \ldots, w_{mi}(k)]$ and its current velocity after the $k$-th iteration as $V_i(k) = [v_{1i}(k), v_{2i}(k), \ldots, v_{mi}(k)]$, $k = 1, 2, \ldots, k_{max}$, where $k_{max}$ is the total number of iterations of the particle swarm;
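The initialization in step (3) can be sketched as follows, assuming uniform sampling. The function name `init_swarm` and the `seed` parameter are hypothetical; note that Python's `random()` samples the half-open interval [0, 1).

```python
import random


def init_swarm(d, n_particles=100, seed=None):
    """Randomly initialize positions W_i(0) and velocities V_i(0).

    Each particle has m = 6*d + 5 components, one per input component.
    """
    rng = random.Random(seed)
    m = 6 * d + 5
    positions = [[rng.random() for _ in range(m)] for _ in range(n_particles)]
    velocities = [[rng.random() for _ in range(m)] for _ in range(n_particles)]
    return positions, velocities


W0, V0 = init_swarm(d=2, seed=0)
print(len(W0), len(W0[0]))  # number of particles, components per particle
```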

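$$
\begin{aligned}
\big[\, & x_{so2}(t-1) \times w_{1i}(0),\ x_{so2}(t-2) \times w_{2i}(0),\ \ldots,\ x_{so2}(t-d) \times w_{di}(0), \\
& x_{no2}(t-1) \times w_{(d+1)i}(0),\ x_{no2}(t-2) \times w_{(d+2)i}(0),\ \ldots,\ x_{no2}(t-d) \times w_{(2d)i}(0), \\
& x_{pm10}(t-1) \times w_{(2d+1)i}(0),\ x_{pm10}(t-2) \times w_{(2d+2)i}(0),\ \ldots,\ x_{pm10}(t-d) \times w_{(3d)i}(0), \\
& x_{pm25}(t-1) \times w_{(3d+1)i}(0),\ x_{pm25}(t-2) \times w_{(3d+2)i}(0),\ \ldots,\ x_{pm25}(t-d) \times w_{(4d)i}(0), \\
& x_{o3}(t-1) \times w_{(4d+1)i}(0),\ x_{o3}(t-2) \times w_{(4d+2)i}(0),\ \ldots,\ x_{o3}(t-d) \times w_{(5d)i}(0), \\
& x_{co}(t-1) \times w_{(5d+1)i}(0),\ x_{co}(t-2) \times w_{(5d+2)i}(0),\ \ldots,\ x_{co}(t-d) \times w_{(6d)i}(0), \\
& x_N(t-1) \times w_{(6d+1)i}(0),\ x_Z(t-1) \times w_{(6d+2)i}(0),\ x_T(t-1) \times w_{(6d+3)i}(0), \\
& x_P(t-1) \times w_{(6d+4)i}(0),\ x_w(t-1) \times w_{(6d+5)i}(0) \,\big]
\end{aligned}
$$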
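The reconstruction above multiplies each component of the raw input by the matching component of the particle position. A minimal sketch, with the hypothetical helper name `reconstruct_input` and made-up numbers, is:

```python
def reconstruct_input(raw_input, position):
    """Element-wise product of the raw input vector and a particle position."""
    if len(raw_input) != len(position):
        raise ValueError("input and particle position must have the same length")
    return [x * w for x, w in zip(raw_input, position)]


# Hypothetical three-component example; a real input has 6*d + 5 components.
print(reconstruct_input([2.0, 4.0, 6.0], [0.5, 0.0, 1.0]))
```

A weight of 0 removes the contribution of the corresponding component, while a weight of 1 passes it through unchanged.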

(5) recording the initial value of the individual extreme value of the $i$-th particle in the particle swarm, with $F_i(0) = |x_{pm25}(t-1) - y_{outi}(t,0)|$, where $|\cdot|$ denotes the absolute value and $y_{outi}(t,0)$ is the output of the PM2.5 concentration prediction model obtained when the input parameter reconstructed with the position $W_i(0)$ is input into the PM2.5 concentration prediction model; recording the initial value of the global extreme value of the particle swarm;

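$$
\begin{aligned}
\big[\, & x_{so2}(t-1) \times w_{1i}(k),\ x_{so2}(t-2) \times w_{2i}(k),\ \ldots,\ x_{so2}(t-d) \times w_{di}(k), \\
& x_{no2}(t-1) \times w_{(d+1)i}(k),\ x_{no2}(t-2) \times w_{(d+2)i}(k),\ \ldots,\ x_{no2}(t-d) \times w_{(2d)i}(k), \\
& x_{pm10}(t-1) \times w_{(2d+1)i}(k),\ x_{pm10}(t-2) \times w_{(2d+2)i}(k),\ \ldots,\ x_{pm10}(t-d) \times w_{(3d)i}(k), \\
& x_{pm25}(t-1) \times w_{(3d+1)i}(k),\ x_{pm25}(t-2) \times w_{(3d+2)i}(k),\ \ldots,\ x_{pm25}(t-d) \times w_{(4d)i}(k), \\
& x_{o3}(t-1) \times w_{(4d+1)i}(k),\ x_{o3}(t-2) \times w_{(4d+2)i}(k),\ \ldots,\ x_{o3}(t-d) \times w_{(5d)i}(k), \\
& x_{co}(t-1) \times w_{(5d+1)i}(k),\ x_{co}(t-2) \times w_{(5d+2)i}(k),\ \ldots,\ x_{co}(t-d) \times w_{(6d)i}(k), \\
& x_N(t-1) \times w_{(6d+1)i}(k),\ x_Z(t-1) \times w_{(6d+2)i}(k),\ x_T(t-1) \times w_{(6d+3)i}(k), \\
& x_P(t-1) \times w_{(6d+4)i}(k),\ x_w(t-1) \times w_{(6d+5)i}(k) \,\big]
\end{aligned}
$$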

inputting the reconstructed input parameter into the PM2.5 concentration prediction model to obtain the output of the PM2.5 concentration prediction model, recorded as $y_{outi}(t,k)$; substituting $y_{outi}(t,k)$ into the formula $F_i(k) = |x_{pm25}(t-1) - y_{outi}(t,k)|$ to calculate the fitness value $F_i(k)$ corresponding to the $i$-th particle after the $k$-th update, where $|\cdot|$ denotes the absolute value; recording the individual extreme value of the $i$-th particle in the particle swarm after the $k$-th iteration, the particle position corresponding to this individual extreme value being denoted $W_i(a)$, where $a$ is an integer greater than or equal to 0 and less than or equal to $k$; recording the global extreme value of the particle swarm after the $k$-th iteration, where min denotes taking the minimum value; the particle position corresponding to the global extreme value is denoted $W_h(g)$, where $h$ is an integer from 1 to $N$ and $g$ is an integer from 0 to $k$;
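The fitness evaluation and the bookkeeping of the extreme values can be sketched as below. This assumes that the individual extreme value is the smallest fitness a particle has reached so far and that the global extreme value is the smallest individual extreme value in the swarm. The names `fitness` and `update_extremes` are hypothetical, and `model` stands in for the trained PM2.5 concentration prediction model as any callable that maps an input list to a number.

```python
def fitness(model, reconstructed_input, x_pm25_prev):
    """F_i(k) = |x_pm25(t-1) - y_outi(t, k)|."""
    return abs(x_pm25_prev - model(reconstructed_input))


def update_extremes(fitness_values, positions, pbest_val, pbest_pos,
                    gbest_val, gbest_pos):
    """Update per-particle bests in place; return the new global best."""
    for i, (f, w) in enumerate(zip(fitness_values, positions)):
        if f < pbest_val[i]:  # smaller fitness means a better particle
            pbest_val[i] = f
            pbest_pos[i] = list(w)
    h = min(range(len(pbest_val)), key=pbest_val.__getitem__)
    if pbest_val[h] < gbest_val:
        gbest_val, gbest_pos = pbest_val[h], list(pbest_pos[h])
    return gbest_val, gbest_pos
```

Storing copies of the positions (`list(w)`) keeps the recorded bests from changing when the particles move in later iterations.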

2. The method according to claim 1, wherein the update iteration of the particle swarm in step (7) comprises: $k$ is the current iteration number of the particle swarm; the current position of the $i$-th particle after the $k$-th iteration is recorded as $W_i(k) = [w_{1i}(k), w_{2i}(k), \ldots, w_{mi}(k)]$, and the current velocity of the $i$-th particle after the $k$-th iteration is recorded as $V_i(k) = [v_{1i}(k), v_{2i}(k), \ldots, v_{mi}(k)]$; the particles update their own velocity and position according to equations (4) and (5):

$$W_i(k+1) = W_i(k) + V_i(k) \tag{5}$$

In the formulas, the inertia weight is taken as 0.6; $i = 1, 2, \ldots, N$; $k$ is the current iteration number, whose maximum is $k_{max}$, with $k_{max} = 1000$; $V_i(k)$ is the velocity of the particle; $c_1$ and $c_2$ are non-negative constants called acceleration factors, with $c_1 = c_2 = 2$; $r_1$ and $r_2$ are random numbers distributed in $[0,1]$; $V_i(k) \in [-V_{max}, V_{max}]$ and $W_i(k) \in [-X_{max}, X_{max}]$; $v_{min}$ represents the minimum velocity of the $i$-th particle and $v_{max}$ represents the maximum velocity of the $i$-th particle; $X_{max}$ is taken as 10 and $V_{max}$ is taken as 5.
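One update of a single particle might look like the sketch below. Because the velocity formula is not reproduced in this text, the standard particle swarm velocity rule (inertia term plus attraction toward the individual-best and global-best positions) is assumed in its place. The position update adds the current velocity $V_i(k)$, as equation (5) is written. The names `clamp` and `pso_step` are hypothetical, and `r1` and `r2` are drawn once per particle per step, which is one common variant.

```python
import random


def clamp(value, bound):
    """Restrict value to the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def pso_step(W, V, pbest, gbest, rng, inertia=0.6, c1=2.0, c2=2.0,
             v_max=5.0, x_max=10.0):
    """Return (W_next, V_next) for one particle.

    Velocity: assumed standard PSO rule, clamped to [-v_max, v_max].
    Position: W(k+1) = W(k) + V(k), clamped to [-x_max, x_max].
    """
    r1, r2 = rng.random(), rng.random()
    V_next = [
        clamp(inertia * v + c1 * r1 * (p - w) + c2 * r2 * (g - w), v_max)
        for w, v, p, g in zip(W, V, pbest, gbest)
    ]
    W_next = [clamp(w + v, x_max) for w, v in zip(W, V)]
    return W_next, V_next


rng = random.Random(0)
print(pso_step([0.5, 0.5], [1.0, -1.0], [0.2, 0.9], [0.4, 0.6], rng))
```

Repeating this step for every particle, re-evaluating the fitness, and stopping after $k_{max}$ iterations gives the overall search loop.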