CN113011455B

CN113011455B - Air quality prediction SVM model construction method

Info

Publication number: CN113011455B
Application number: CN202110140388.5A
Authority: CN
Inventors: 宋国君; 刘帅; 何伟; 张波; 宋天一
Original assignee: Beijing Shuhuitong Information Technology Co ltd
Current assignee: Beijing Shuhuitong Information Technology Co ltd
Priority date: 2021-02-02
Filing date: 2021-02-02
Publication date: 2024-01-05
Anticipated expiration: 2041-02-02
Also published as: CN113011455A

Abstract

The invention discloses a method for constructing an SVM model for air quality prediction. The method comprises: collecting air quality data, meteorological data and continuous pollution-source emission data; processing the collected data by sumif function calculation; constructing model variables, in which the air quality data are converted into conventional air quality variables through the ln function, the meteorological data are converted into conventional meteorological variables by calculation, and the pollution-source emission data are converted into pollution-source emission variables by weighting; and establishing a prediction model by the SVM method and, after modeling is completed, performing a trial run of the model. By making full use of existing Internet of Things and Internet big data and information, innovating big-data statistical analysis thinking and tools, and serving urban air quality management decisions, the invention makes high-level innovations in scientific research, cultivates high-level researchers, doctoral students and grassroots professionals, and provides statistical prediction and diagnosis support for air quality management in heavily polluted areas.

Description

Air quality prediction SVM model construction method

Technical Field

The invention relates to the technical field of air quality prediction, in particular to an air quality prediction SVM model construction method.

Background

Most existing air quality prediction studies build their models with neural network methods. In the choice of explanatory variables, most studies consider only meteorological factors, and the influence of other factors on monitoring-point concentrations is not studied. Zhou Shuhua (2017) used stepwise regression analysis to establish statistical prediction models of daily PM2.5 concentration in Yibin city for different seasons, comprehensively analyzed the relation between the PM2.5 concentration and the preceding concentrations of six pollutants, and also explained the relation between PM2.5 and same-day meteorological elements such as air pressure, air temperature, temperature difference, rainfall, average wind speed and sunshine duration; the relative simulation error was 28.5%. Mo Xianlie (2003) used an artificial neural network with six meteorological factors (wind speed, wind direction, relative humidity, cloud cover, average air temperature and maximum air temperature) as inputs; from two years of daily average O₃ data for Dalian city, 365 groups were selected as the training set and 61 groups as the test set to predict the daily average O₃ concentration, and the average relative error between measured and predicted concentrations was 21.49%.
Liu Jie (2014) combined a support vector machine with a fuzzy-granulation time series: according to the periodic daily variation of PM2.5 in different seasons, features were extracted from the data samples with a triangular membership function and used as input to the support vector machine, and a time series prediction model of PM2.5 mass concentration was established with the hourly PM2.5 mass concentrations of monitoring points as sample data; the fitted R² reached 0.94, and the absolute simulation error ranged from 0.2 μg/m³ to 46.85 μg/m³. Sun Baolei (2017) used a BP neural network combined with the mean impact value (MIV) variable screening method to establish an air quality prediction model for Kunming city from monitoring data of six pollutants (SO₂, NOx, O₃, CO, PM10 and PM2.5) at five environmental monitoring points in the Kunming urban area. A total of 694 groups of data from two years were selected as the training set and 350 groups from one year as the test set; the ratio of the standard deviation of the predicted values to that of the measured values was 0.6. However, none of the above studies attempted to incorporate pollution sources into the model, so they offer no guidance for pollution-source control and planning.

Disclosure of Invention

Aiming at the technical problems in the related art, the invention provides an air quality prediction SVM model construction method, which can overcome the defects of the prior art method.

In order to achieve the technical purpose, the technical scheme of the invention is realized as follows:

The air quality prediction SVM model construction method comprises collecting air quality data, meteorological data and continuous pollution-source emission data. Collecting air quality data includes collecting PM2.5, NOx, SO₂, CO and O₃ concentration data at each monitoring point; collecting meteorological data includes collecting air pressure, humidity, wind speed, wind direction and rainfall data from the urban meteorological station; collecting pollution-source data includes collecting particulate matter emission and SO₂ emission data.

The PM2.5, NOx, SO₂, CO and O₃ data of each monitoring point are processed into 24-hour averages in Excel with the sumif function, which completes the processing of the air quality data; the air pressure, humidity and wind speed values are likewise processed into 24-hour averages in Excel with the sumif function, which completes the processing of the meteorological data; and the pollution-source emission data are processed by the pollution-source emission data calculation method.

Model variables are constructed. The air quality data are converted into air quality variables by taking their logarithm with the ln function in Excel. For the meteorological variables, the mean and standard deviation of air pressure, humidity and wind speed are first calculated with the average and std functions; in Excel, each value then has the mean subtracted and is divided by the standard deviation, which standardizes the data into the air pressure, humidity and wind speed variables. Each pollution source is weighted, the weighted average of the pollution sources is calculated, and the processed pollution-source emission data are brought into the model as the pollution-source variable.

Modeling is performed with the SVM method by calling the libsvm toolkit in matlab. During modeling, a training set and a test set are selected. The svmtrain function is first called to train on the training set, and the resulting SVM model is stored in model_test. The test set is then tested with model_test by calling the prediction function; the test result is stored in accuracy, in which the "relative error MSE" used to evaluate the test effect can be found.

A planning model is constructed: the background concentration value under zero pollution-source emission is incorporated into the model, some samples are temporarily removed from the test set, the weighted average of the pollution sources is calculated, and the weighted-average pollution-source emission is placed into the test set.

In the model trial run, the model structure parameters, the pollution-source weights and one test set sample are provided to the developer; the software automatically runs the model once and is adjusted by comparing the test result with the software output.

The beneficial effects of the invention are as follows: by making full use of existing Internet of Things and Internet big data and information, innovating big-data statistical analysis thinking and tools, and serving urban air quality management decisions, the invention makes high-level innovations in scientific research, cultivates high-level researchers, doctoral students and grassroots professionals, and provides statistical prediction and diagnosis support for air quality management in heavily polluted areas.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic diagram of an overall technical flow of a statistical predictive diagnosis model study of urban air quality big data according to an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.

The air quality prediction SVM model construction method comprises collecting air quality data, meteorological data and pollution-source data. The air quality data include PM2.5, NOx, SO₂, CO and O₃ concentration data from the city's monitoring points, with hourly raw data. The meteorological data include air pressure, humidity, wind speed, wind direction and rainfall data from the urban meteorological station, also with hourly raw data; the wind direction should be recorded in degrees for later use. The pollution-source data are continuous emission data from the city's nationally controlled point sources, including particulate matter, NOx and SO₂ emission data, again with hourly raw data.

The collected data are processed in Excel. For the air quality data, the PM2.5, NOx, SO₂, CO and O₃ values of each monitoring point are processed into 24-hour averages with the sumif function. For the meteorological data, the air pressure, humidity and wind speed values are processed into 24-hour averages with the sumif function. The wind direction values are processed into a daily dominant wind direction, represented by the wind direction at the hour of maximum wind speed; if the wind direction data are in degrees, they are converted into four discrete values: east, south, west and north. The rainfall values are processed into a 24-hour rainfall indicator: a day with rain is marked as rainfall and a day without rain as non-rainfall. For the pollution-source emission data, the daily average emission flow is calculated from the hourly emission flow, the daily average emission concentration is calculated from the hourly emission concentration, and the daily average emission amount is obtained by multiplying the daily average emission flow by the daily average emission concentration.
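
The daily processing described above can be illustrated outside Excel. The following Python sketch is only an illustration under stated assumptions and is not the spreadsheet procedure of the embodiment: all function names are hypothetical, wind direction is assumed to be given in degrees clockwise from north, and the 90-degree sectors used to map degrees to east, south, west and north are an assumed binning that the description does not specify.

```python
from collections import defaultdict

def daily_mean(records):
    """records: iterable of (date_string, hourly_value); returns {date: mean of that day's values}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for day, value in records:
        sums[day] += value
        counts[day] += 1
    return {day: sums[day] / counts[day] for day in sums}

def degrees_to_sector(degrees):
    """Assumed binning: 90-degree sectors centred on north, east, south and west."""
    sectors = ["north", "east", "south", "west"]
    return sectors[int(((degrees % 360) + 45) // 90) % 4]

def dominant_wind_direction(hourly):
    """hourly: list of (wind_speed, wind_direction_degrees) for one day.
    The direction at the hour of maximum wind speed represents the day."""
    _, degrees = max(hourly, key=lambda h: h[0])
    return degrees_to_sector(degrees)

def rained(hourly_rainfall):
    """1 if any rain was recorded during the day, otherwise 0."""
    return 1 if any(r > 0 for r in hourly_rainfall) else 0

def daily_emission(hourly_flow, hourly_concentration):
    """Daily average emission = daily average flow x daily average concentration."""
    mean_flow = sum(hourly_flow) / len(hourly_flow)
    mean_conc = sum(hourly_concentration) / len(hourly_concentration)
    return mean_flow * mean_conc
```

A day's 24 hourly readings would be passed to each function in turn; the units of flow and concentration are whatever the monitoring data use.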

A small amount of concentration, meteorological or pollution-source data may be missing at the monitoring points; these gaps are filled by the interpolation filling method, that is, with the average of the data before and after the missing sample. This can also be implemented programmatically in matlab. When a large block of data is missing, please contact the company to discuss a specific solution. If necessary, some samples are deleted; deletion should be done after the next step, model variable construction, is completed, so that useful lag variables are not deleted.
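
As a rough illustration of the filling rule, the Python sketch below replaces each missing value with the mean of its neighbours. The function name is hypothetical, missing values are assumed to be represented as None, and two choices are assumptions not stated in the description: runs of consecutive gaps use the nearest known values on each side, and gaps at the very start or end of the series are left unfilled.

```python
def fill_missing(values):
    """Replace each None with the mean of the nearest known values before and after it.
    Gaps with no known value on one side are left as None (assumption)."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Search backwards and forwards in the ORIGINAL series for known values.
        before = next((values[j] for j in range(i - 1, -1, -1) if values[j] is not None), None)
        after = next((values[j] for j in range(i + 1, len(values)) if values[j] is not None), None)
        if before is not None and after is not None:
            filled[i] = (before + after) / 2
    return filled
```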

Model variables are then constructed. The air quality data are converted into the air quality variables required by the model by taking their logarithm, implemented with the ln function in Excel.

The mean and standard deviation of air pressure, humidity and wind speed are calculated with the average and std functions; in Excel, each value then has the mean subtracted and is divided by the standard deviation. This standardizes the air pressure, humidity and wind speed values into the air pressure, humidity and wind speed variables.
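
The two transformations, the natural logarithm of the concentrations and the standardization of the meteorological values, can be sketched in Python as follows. The function names are hypothetical, and the use of the sample standard deviation (divisor n - 1) is an assumption; the description names only a std function.

```python
import math
import statistics

def log_transform(concentrations):
    """Natural logarithm of each concentration; values must be positive."""
    return [math.log(c) for c in concentrations]

def standardize(values):
    """Subtract the mean and divide by the standard deviation (z-score).
    statistics.stdev is the sample standard deviation (assumption)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values]
```

After standardization each meteorological variable has mean 0 and standard deviation 1, so variables measured in different units enter the model on a comparable scale.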

The wind direction is represented in the model by dummy (virtual) variables. Four dummy variables are set, representing the four dominant wind directions east, south, west and north; in Excel they occupy four columns, one per wind direction, each taking the value 0 or 1. For rainfall, one dummy variable is set that distinguishes whether rain occurred: 0 represents non-rainfall and 1 represents rainfall.
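
A minimal Python sketch of this 0-1 encoding is given below. The names and the column order are hypothetical choices made for illustration only.

```python
# Assumed column order of the four wind-direction dummy variables.
DIRECTIONS = ("east", "south", "west", "north")

def wind_dummies(direction):
    """Return four 0-1 values, one per dominant wind direction."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown wind direction: {direction!r}")
    return [1 if direction == d else 0 for d in DIRECTIONS]

def rain_dummy(is_rainy):
    """1 represents rainfall, 0 represents non-rainfall."""
    return 1 if is_rainy else 0
```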

The pollution-source data are processed into pollution-source variables. Because a prediction often involves hundreds of pollution sources or discharge outlets, the emission data must be processed further before entering the model: each pollution source is weighted and the weighted average of the pollution sources is calculated. Different monitoring points use different weightings; the specific weighting method is described in detail later. The weighted average is then standardized to form the pollution-source variable.

In addition to the variables above, other auxiliary variables are added for prediction. Trend variable: constructed with the year as the variable, expressed either as the numbers 1, 2, 3 and so on or as the year itself. Periodic variables: constructed with the sin and cos functions in Excel, forming two columns of variables, where T in the functions is the month. Lag variables: constructed for some of the variables; for example, the one-period lag of the monitoring-point concentration is formed by lagging the monitoring-point concentration by one period. Other dummy variables include a workday variable, a 0-1 dummy for "whether it is a workday" (0 represents a non-workday, 1 represents a workday), and a heating-period variable, a 0-1 dummy for "whether heating is on" (0 represents no heating, 1 represents heating). Note that the explained variable and the explanatory variables must correspond reasonably: when the explained variable is PM2.5, the pollution-source variable among the explanatory variables is the particulate matter emission; when it is NOx, the NOx emission; when it is SO₂, the SO₂ emission; and when it is O₃, the O₃ emission.
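
The auxiliary variables can be sketched in Python as follows. The function names are hypothetical. The description states only that sin and cos are applied with T as the month; the specific form sin(2πT/12) and cos(2πT/12), with a 12-month period, is an assumption made here for illustration.

```python
import math

def trend_variable(year, first_year):
    """Trend expressed as 1, 2, 3, ... counted from the first year of data."""
    return year - first_year + 1

def periodic_variables(month, period=12):
    """Two periodic columns from the month T; the 2*pi*T/12 form is an assumption."""
    angle = 2 * math.pi * month / period
    return math.sin(angle), math.cos(angle)

def lag_one(values):
    """One-period lag: each row receives the previous row's value.
    The first row has no predecessor and is marked None."""
    return [None] + list(values[:-1])
```

The first row of a lagged series has no value, which is one reason samples are deleted only after the variables have been built.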

After the model variables are constructed, the prediction model is established. The model is built with the SVM method; the whole modeling process can be completed in matlab by calling the libsvm toolkit.

For modeling, a training set is selected first, for example the 2016-2017 data set, together with a test set, for example the 2018 data set.

For the training set, a train_X variable and a train_Y variable are prepared. train_X holds the explanatory variables, including the meteorological variables, the pollution-source variable and the other variables; train_Y holds the explained variable, namely the monitoring-point concentration variable. train_Y must be a single column of data, train_X may have multiple columns (one per explanatory variable), and the two must have the same number of rows. Training calls the svmtrain function with the statement: model_test = svmtrain(train_Y, train_X, '-s 4 -t 2 -c 1 -g 0.5'). The g parameter in the statement can be adjusted to the actual situation, for example to the reciprocal of the number of explanatory variables. After training is completed, the established SVM model is stored in model_test. The test set is then tested with model_test by calling the prediction function with the statement: [prediction_y, accuracy, precision_values] = svmpredict(testy, testx, model_test). After the test is completed, the test result is stored in accuracy; double-clicking accuracy opens it, and its third entry is the "relative error MSE" used to evaluate the test effect.
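
The data-preparation rules in this step (one explained column, equal row counts, g taken as the reciprocal of the number of explanatory variables) and the error measure can be checked with a short Python sketch. It does not call libsvm and does not train a model; the function names are hypothetical, and the mean squared error is computed directly from actual and predicted values as a plain illustration of the metric.

```python
def check_shapes(train_x, train_y):
    """train_x: list of rows (one value per explanatory variable);
    train_y: list of explained values. Returns (rows, explanatory_variables)."""
    if len(train_x) != len(train_y):
        raise ValueError("train_X and train_Y must have the same number of rows")
    widths = {len(row) for row in train_x}
    if len(widths) != 1:
        raise ValueError("every row of train_X must have the same number of columns")
    return len(train_x), widths.pop()

def default_gamma(n_explanatory_variables):
    """g as the reciprocal of the number of explanatory variables."""
    return 1.0 / n_explanatory_variables

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```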

The parameters to be saved from the model are the support vectors, the support vector coefficients, the model b value and the modeling coefficient. The support vectors are stored in the SVs array in matlab. The support vector coefficients are stored in the sv_coef matrix in matlab, an n×1 matrix where n is the number of support vectors. The model b value is the negative of the rho value output by matlab. The modeling coefficient is the g value passed to the svmtrain function during training; if the default value was used in modeling, g is the reciprocal of the number of explanatory variables. This value must be communicated to the software programmer.
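
To show how these saved parameters suffice to reproduce a prediction outside matlab, the following Python sketch evaluates the decision function of a model trained with the radial basis kernel (-t 2), whose kernel is exp(-g·||u - v||²). The function names and the numeric values in the usage lines are hypothetical and chosen only for illustration; they are not parameters of any trained model.

```python
import math

def rbf_kernel(u, v, g):
    """Radial basis kernel: exp(-g * squared Euclidean distance)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(x, support_vectors, sv_coef, rho, g):
    """Prediction for one sample x from the saved model parameters.
    b is the negative of rho, as stated in the description."""
    b = -rho
    return sum(c * rbf_kernel(sv, x, g) for sv, c in zip(support_vectors, sv_coef)) + b

# Hypothetical parameters, for illustration only.
svs = [[0.0, 0.0], [1.0, 0.0]]
coef = [0.5, -0.25]
prediction = svr_predict([0.0, 0.0], svs, coef, rho=-0.1, g=0.5)
```

If the explained variable was log-transformed before training, the prediction is on the logarithmic scale and must be exponentiated to recover a concentration.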

Next, regional atmospheric pollution is identified. For PM2.5 and ozone, the regional transport of atmospheric pollution must be considered. The current approach is to take out separately the samples in which regional atmospheric pollution is present, which requires identifying when regional atmospheric pollution is present. For PM2.5, a sample is recorded as regional atmospheric pollution when the following conditions are satisfied:

1. the total particulate matter emission of the pollution sources is not higher than at other times;

2. the PM2.5 concentration is much higher than at other times;

3. the PM2.5/PM10 ratio is much higher than at other times;

4. the PM2.5 at the city's monitoring points is significantly correlated with that of the surrounding cities;

5. the PM10 at the city's monitoring points is not significantly correlated with that of the surrounding cities.

Samples with regional pollution are then flagged (for example, with a dummy variable).
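
The five conditions can be combined into a single 0-1 flag, as in the Python sketch below. Everything quantitative here is an assumption: the description says only "not higher", "much higher" and "significantly correlated", so the thresholds, the field names and the idea of expressing each quantity relative to its typical level are hypothetical stand-ins that would need to be set for a given city.

```python
from dataclasses import dataclass

@dataclass
class DaySample:
    emission_ratio: float      # total particulate emission / its typical level
    pm25_ratio: float          # PM2.5 concentration / its typical level
    pm_fraction_ratio: float   # (PM2.5/PM10) / its typical value
    corr_pm25: float           # PM2.5 correlation with surrounding cities
    corr_pm10: float           # PM10 correlation with surrounding cities

def regional_pollution_flag(s, much_higher=1.5, fraction_higher=1.2, corr_cut=0.5):
    """Return 1 when all five conditions hold, otherwise 0.
    All three thresholds are hypothetical."""
    conditions = (
        s.emission_ratio <= 1.0,                 # 1. emission not higher than at other times
        s.pm25_ratio >= much_higher,             # 2. PM2.5 much higher
        s.pm_fraction_ratio >= fraction_higher,  # 3. PM2.5/PM10 much higher
        s.corr_pm25 >= corr_cut,                 # 4. PM2.5 correlated with surrounding cities
        s.corr_pm10 < corr_cut,                  # 5. PM10 not correlated
    )
    return 1 if all(conditions) else 0
```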

After modeling, the planning model is constructed. In this model, the relation between pollution-source emission and monitoring-point concentration must be identified, demonstrating that the monitoring-point concentration increases as pollution-source emission increases. The background concentration under zero pollution-source emission is brought into the model: an urban background concentration monitoring point is found, and all of its data are taken as samples and placed into the original training set, because the full sample contains few zero-emission samples and these must be supplemented for the model to learn. Part of the samples are temporarily removed from the test set. These include samples with regional pollution, in which the total pollution-source emission is not high but the PM2.5 concentration at the monitoring point is very high, the PM2.5/PM10 ratio rises, the correlation between the PM2.5 concentration at the monitoring point and that at monitoring points in surrounding cities rises markedly, and the corresponding correlation for PM10 is not high. Error samples: when the total pollution-source emission is high and the meteorological conditions are unfavorable for diffusion, a day on which the monitoring-point concentration is at an extremely low level is regarded as an error sample, and the sample is not removed from the test set.

The weighted average of the pollution sources is calculated in the following steps. First, extract the position information of the pollution sources (the coordinate point of each pollution source). Second, extract the position information of the monitoring points (the coordinate point of each monitoring point). Third, calculate the distance between each pollution source and the monitoring point (Euclidean distance). Fourth, calculate the azimuth between each pollution source and the monitoring point, expressed in degrees. Fifth, calculate the effective distance between each pollution source and the monitoring point for the day by combining the day's wind direction in degrees with the azimuth (the effective distance is calculated with a Gaussian diffusion model).
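
The geometric part of these steps can be sketched in Python as below. The sketch covers only the distance, the azimuth, the angular difference between wind direction and azimuth, and a generic weighted average; it deliberately stops short of the effective distance, because the Gaussian diffusion formula is not given in the description. The function names are hypothetical, and planar coordinates with x pointing east and y pointing north, with angles measured clockwise from north, are assumptions.

```python
import math

def distance(source, monitor):
    """Euclidean distance between two (x, y) coordinate points."""
    return math.hypot(monitor[0] - source[0], monitor[1] - source[1])

def azimuth_degrees(source, monitor):
    """Bearing from source to monitor in degrees clockwise from north
    (assumes x points east and y points north)."""
    dx = monitor[0] - source[0]
    dy = monitor[1] - source[1]
    return math.degrees(math.atan2(dx, dy)) % 360

def angle_between(a_degrees, b_degrees):
    """Smallest angle between two directions, in the range 0 to 180 degrees."""
    diff = abs(a_degrees - b_degrees) % 360
    return min(diff, 360 - diff)

def weighted_average(emissions, weights):
    """Weighted average of the pollution-source emissions; how the weights
    are derived from the effective distance is not specified here."""
    return sum(e * w for e, w in zip(emissions, weights)) / sum(weights)
```

The angular difference is the quantity that changes from day to day with the wind, which is why the weighting of a given pollution source differs between days and between monitoring points.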

The weighted-average pollution-source emission is then placed into the test set. Before doing so, a correlation test is preferably performed between the weighted-average emission and the corresponding monitoring-point concentration. If a significant positive correlation can be demonstrated, the weighting is effective and the variable is incorporated into the model; if not, the weighting method needs some adjustment. The specific adjustment is to be discussed separately.
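
A correlation check of this kind might look like the Python sketch below, which computes the Pearson correlation coefficient between the weighted-average emission and the monitoring-point concentration. The function names are hypothetical, and the fixed cut-off min_r is only a stand-in for a formal significance test, which the description does not specify.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def weighting_is_effective(emission, concentration, min_r=0.3):
    """True when the correlation is positive and at least min_r (hypothetical cut-off)."""
    return pearson_r(emission, concentration) >= min_r
```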

Finally, the model is given a trial run. The modeler provides the software developer with the model structure parameters (support vectors, support vector coefficients, rho value and g value), the pollution-source weights and one test set sample. The software must run the model once automatically; by comparing the test result with the software's output, possible errors in the software are corrected.

As shown in FIG. 1, the data quality assessment predictions include weather predictions, fixed source emissions predictions, moving source emissions predictions.

The fixed-source emission prediction model is based on continuous monitoring data of fixed-source emissions and other information. Statistical analysis tools are applied to analyze how the fixed-source emission rate of a specific industry changes, an outlier diagnosis model for key pollution-source monitoring data is studied, and an analysis method and technical specification for fixed-source emission monitoring data are proposed. Research is carried out on technical specifications for compiling fixed-source emission control schemes, covering a method for analyzing the statistical characteristics of existing emission data, a cost-effectiveness analysis method for fixed-source emission reduction, a technical method for comparative analysis of emission data, and the technical specification for compiling emission control schemes. Based on the air quality management target and the fixed-source emission prediction model research, a fixed-source emission control scheme is designed for a demonstration city, providing information support for air quality prediction and diagnosis.

The mobile-source emission prediction comprises road motor vehicle emission accounting, a road motor vehicle emission estimation tool, and prediction and diagnosis; a dynamic emission accounting tool and a prediction and evaluation technology for road motor vehicles are developed based on traffic big data. The emission accounting includes accounting at the single-vehicle level and at the road-network level. Single-vehicle emission accounting is based mainly on driving-condition information acquired in real time from Internet of Vehicles big data; it accounts for the emissions of individual vehicles and their spatiotemporal distribution, providing information support for road-network emission accounting. Development of the road-network emission accounting tool proceeds as follows: first, multi-source big data in the road traffic field are studied and collected, the quality of data from different sources is evaluated, and a road-network emission accounting model is constructed; second, a road-section flow allocation algorithm based on cross-section traffic flow detection data is studied, and the flow expansion results are calibrated and checked; finally, road-network motor vehicle emission simulation technology is studied, and a technical specification for compiling a big-data-based dynamic emission inventory of road-network motor vehicles is proposed. In addition, the spatiotemporal distribution characteristics and patterns of traffic flow are studied and, combined with knowledge mining of future scenarios such as traffic demand prediction models, motor vehicle emission control schemes and Internet big data analysis, motor vehicle emission prediction models and techniques are developed, the emission reduction effects of emission control schemes are evaluated, and information support is provided for air quality prediction and diagnosis.

The air quality management Internet big data analysis model involves in-depth research on key technologies for acquiring, integrating, analyzing and mining air quality management Internet big data, and on policy evaluation methods. First, in order to estimate more accurately the air quality in areas not covered by air quality monitoring stations, and to capture the public's evaluations of air quality and of the management of fixed sources, mobile sources and the like, social perception data are used to acquire domain knowledge from Internet big data covering related microblogs, forums, online media and the like; this knowledge is integrated with Internet of Things monitoring data on air quality, fixed sources, mobile sources and the like to form an integration and linkage standard for Internet of Things and Internet big data. Second, for the structured, semi-structured and unstructured spatiotemporal big data of air quality management, modeling theory and methods for multi-source heterogeneous, multi-granularity and multi-dimensional data oriented to real-time mining and analysis are studied; a topic mining model (LDA) for air quality management, methods for detecting abnormal and emergency events, and the like are developed to improve the monitoring and evaluation of air quality management; and big data analysis technology for air quality management, including distributed and streaming computing models, is developed to improve the efficiency of big data analysis for air quality management.
Finally, in order to evaluate and predict the effectiveness of air quality management measures more accurately, an air quality management policy evaluation method based on social perception data is proposed, providing comprehensive, multi-dimensional Internet monitoring support for public perception of air quality, the management effects on fixed sources, mobile sources and the like, future social activity events, emission scenario analysis, policy evaluation and the like.

The urban air quality prediction and diagnosis model is based on environmental protection Internet of Things big data (weather, pollution sources and public opinion) and adopts a support vector machine, an artificial neural network and a multivariate spatiotemporal model to identify the factors influencing air quality. The support vector machine model separates training samples from test samples, repeatedly tests the fitting effect of the model, selects a suitable kernel function type and parameters, outputs the support vectors and their coefficients required to establish the prediction model, and gives the corresponding model structure. The artificial neural network adopts an SVM algorithm to form a network structure suited to air quality prediction, and the selection of the number of input nodes, the number of hidden layers, the number of hidden-layer neurons, the number of output nodes, the transfer functions and the like is studied. The spatiotemporal model considers spatial and temporal effects together and studies the selection of the spatial weight matrix, collinearity, endogeneity and related problems. On this basis, the applicable conditions of each model are evaluated, and the prediction results obtained by the different methods are weighted and averaged with a nonlinear combination prediction method. For the weight allocation, the sum of the absolute values of the models' prediction errors is taken as the criterion to obtain the optimal combined prediction result.
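
The weight-allocation criterion can be illustrated with a deliberately simplified Python sketch: two sets of predictions are combined by a single weight, and the weight that minimizes the sum of absolute prediction errors is found by a grid search. This is a linear two-model simplification chosen for illustration, not the nonlinear combination method mentioned above; the function names and the grid resolution are hypothetical.

```python
def sum_abs_error(actual, predicted):
    """Sum of the absolute values of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted))

def best_combination_weight(actual, pred_a, pred_b, steps=100):
    """Search w in [0, 1] so that w*pred_a + (1-w)*pred_b minimizes
    the sum of absolute errors. Returns (best_weight, best_error)."""
    best_w, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        combined = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]
        err = sum_abs_error(actual, combined)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```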

In summary, by means of the above technical scheme, the invention makes full use of existing Internet of Things and Internet big data and information, innovates big-data statistical analysis thinking and tools, and serves urban air quality management decisions; it thereby makes high-level innovations in scientific research, cultivates high-level researchers, doctoral students and grassroots professionals, and provides statistical prediction and diagnosis support for air quality management in heavily polluted areas.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. The air quality prediction SVM model construction method is characterized by comprising the following steps:

s1, collecting air quality data, city meteorological data and pollution source continuous emission data of city national control points of each monitoring point of a city;

S2, processing the collected air quality data and meteorological data through sumif function calculation, and processing the pollution source emission data through a pollution source emission data calculation method;

S3, constructing model variables: processing the air quality data into the air quality variables needed in the model through the ln function; subtracting the average value of each meteorological variable from the processed meteorological variable and dividing the result by the standard deviation of that meteorological variable, so as to obtain standardized meteorological variables; for the pollution source variable, assigning a weight to each pollution source, calculating the weighted average value over the pollution sources, and standardizing it to form the pollution source variable;

S4, building a prediction model: building the model by adopting an SVM method, and calling the libsvm toolkit in MATLAB to perform the modeling;

S5, during modeling, selecting a training set and a test set; first, calling the svmtrain function to train on the training set and storing the built SVM model in model_test; then, using the built SVM model model_test, testing the test set by calling the prediction function and storing the test effect in accuracy, in which the error measure MSE for evaluating the test effect can be found;

S6, constructing a planning model: incorporating the background concentration value under zero emission of the pollution sources into the model, temporarily taking part of the samples out of the test set, calculating the weighted average value of the pollutant emission amounts of the pollution sources, and putting the weighted-average pollutant emission amounts of the pollution sources into the test set;

S7, model test operation: providing the model structure parameters, the pollution source pollutant emission weights and the test set samples to a developer; the software automatically runs the preliminary model, and adjustments are made according to a comparison between the test result and the software output result.
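
By way of non-limiting illustration, the weighted average of pollutant emission amounts referred to in steps S3 and S6 can be sketched in Python as follows. The function name `weighted_emission`, the example weights and the emission figures are hypothetical; the claim does not specify how the weight of each pollution source is chosen, so the weights are simply taken as given inputs.

```python
def weighted_emission(emissions, weights):
    """Weighted average of the daily emission amounts of several pollution sources.

    emissions: one daily emission amount per pollution source.
    weights:   one weight per pollution source (assumed to be supplied by the user).
    """
    if len(emissions) != len(weights):
        raise ValueError("one weight is required per pollution source")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * e for w, e in zip(weights, emissions)) / total_weight


# Hypothetical usage: two sources on one day, the second weighted three times as heavily.
daily_value = weighted_emission([10.0, 30.0], [1.0, 3.0])
```

Applying the function to each day in turn yields the series that is subsequently standardized to form the pollution source variable.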

2. The air quality prediction SVM model construction method according to claim 1, wherein in step S1, collecting air quality data includes collecting PM2.5, NOx, SO₂, CO and O₃ concentration data for each monitoring point; collecting meteorological data includes collecting air pressure, humidity, wind speed, wind direction and rainfall data of an urban meteorological station; and collecting pollution source data includes collecting particulate matter emission data and SO₂ emission data.

3. The air quality prediction SVM model construction method according to claim 1, wherein in step S2, the PM2.5, NOx, SO₂, CO and O₃ air quality data of each monitoring point are processed into 24-hour averages through the sumif function in Excel; and the meteorological data are processed through the sumif function in Excel by processing the air pressure, humidity and wind data values into 24-hour averages.

4. The air quality prediction SVM model construction method according to claim 1, wherein the processing of the pollution source emission data in step S2 specifically includes processing the pollution source emission data into daily emission data: calculating the daily average emission flow of the pollution source from its hourly emission flow, calculating the daily average emission concentration of the pollution source from its hourly emission concentration, and multiplying the daily average emission flow by the daily average emission concentration to obtain the daily average emission amount of the pollution source.
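
By way of non-limiting illustration, the daily emission calculation of this claim can be sketched in Python as follows. The function name `daily_emission` and the example figures are hypothetical, and units are left to the caller.

```python
from statistics import mean


def daily_emission(hourly_flow, hourly_concentration):
    """Daily average emission amount of one pollution source.

    Follows the claim literally: the daily average emission flow is multiplied
    by the daily average emission concentration.
    """
    return mean(hourly_flow) * mean(hourly_concentration)


# Hypothetical usage with two hourly readings instead of a full 24-hour record.
amount = daily_emission([100.0, 200.0], [2.0, 4.0])
```

The product of the two daily averages is in general not equal to the average of the hourly flow-concentration products; the sketch uses the former because that is what the claim recites.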

5. The air quality prediction SVM model construction method according to claim 1, wherein in step S3, the air quality variables are obtained by taking the logarithm of the air quality data through the ln function in Excel; the meteorological variables are processed in Excel by first calculating the average value and standard deviation of the air pressure, humidity and wind speed using the average function and the std function, then subtracting the daily average value of the air pressure, humidity and wind speed from the sample values of the air pressure, humidity and wind speed, and dividing the result by the corresponding standard deviation, thereby standardizing the data values to form the air pressure, humidity and wind speed variables.
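
By way of non-limiting illustration, the two transformations of this claim can be sketched in Python as follows. The function names `log_transform` and `standardize` are hypothetical. The sketch assumes the sample standard deviation (n - 1 denominator); the claim does not state which form is intended.

```python
import math
from statistics import mean, stdev


def log_transform(concentrations):
    """Natural logarithm of each air quality concentration (values must be positive)."""
    return [math.log(c) for c in concentrations]


def standardize(values):
    """Subtract the mean and divide by the standard deviation.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]
```

The same `standardize` helper could be applied separately to the air pressure, humidity and wind speed series, so that each variable is scaled by its own mean and standard deviation.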

6. The air quality prediction SVM model construction method according to claim 1, wherein in step S5, in the training set, the train_X variable represents the explanatory variables, including the meteorological variables and the pollution source variable; and the train_y variable represents the explained variable, i.e., the concentration variable of the monitoring point.
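
By way of non-limiting illustration, the assembly of the training and test sets and the MSE evaluation can be sketched in Python as follows. The SVM training call itself is deliberately omitted. The record layout (keys `weather`, `source`, `concentration`), the chronological split and the function names `build_sets` and `mse` are assumptions made for this sketch only.

```python
def build_sets(records, n_train):
    """Split records into training and test sets.

    Each record is a dict with hypothetical keys:
      "weather":       list of standardized meteorological variables
      "source":        list holding the standardized pollution source variable
      "concentration": log-transformed concentration at the monitoring point
    The first n_train records form the training set (assumed chronological order).
    """
    X = [r["weather"] + r["source"] for r in records]  # explanatory variables
    y = [r["concentration"] for r in records]          # explained variable
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


def mse(predicted, observed):
    """Mean squared error between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
```

Here `train_X` and `train_y` correspond to the first two return values; the predictions passed to `mse` would come from whatever model was trained on them.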