CN112732691A - Atmospheric environment prediction method based on multiple model comparison - Google Patents

Publication number
CN112732691A
Authority
CN
China
Prior art keywords
model
concentration
linear regression
neural network
meteorological data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110017749.7A
Other languages
Chinese (zh)
Inventor
曹敏
刘娇龙
赵娜
张叶
刘斯扬
聂永杰
尹春林
杨政
肖华根
廖斌
胡昌斌
韩彤
魏龄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Yunnan Power Grid Co Ltd filed Critical Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority to CN202110017749.7A priority Critical patent/CN112732691A/en
Publication of CN112732691A publication Critical patent/CN112732691A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01W METEOROLOGY
    • G01W1/00 Meteorology
    • G01W1/10 Devices for predicting weather conditions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Analysis (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Optimization (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Ecology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Atmospheric Sciences (AREA)
  • Operations Research (AREA)
  • Environmental Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides an atmospheric environment prediction method based on multiple model comparison, comprising: acquiring the conventional pollutant concentrations and meteorological data of a target city and constructing a database; preprocessing the database; constructing, with the meteorological data as input factors, a multiple linear regression model that predicts the PM10 concentration Y1; adjusting the input factors of the multiple linear regression model and constructing, by stepwise regression, an optimal linear regression model that predicts the PM10 concentration Y2; training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentrations and meteorological data; optimizing the thresholds and weights of the BP neural network model with a genetic algorithm to obtain an optimal BP neural network model; and comparing the four models and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error, and the goodness of fit. Optimal BP neural network parameters are obtained through iterative evolution by selection, crossover, and mutation operations, yielding accurate prediction results; the method is better suited to medium- and long-term prediction of atmospheric pollutants.

Description

Atmospheric environment prediction method based on multiple model comparison
Technical Field
The application relates to the field of environmental monitoring and early warning data mining and analysis, in particular to an atmospheric environment prediction method based on multiple model comparison.
Background
With worsening environmental pollution and growing public environmental awareness, increasing effort has been devoted to atmospheric monitoring in order to reduce the occurrence of atmospheric pollution events; a large number of atmospheric monitoring systems have been built, and a large volume of historical monitoring data has accumulated.
Existing historical monitoring data are used only to generate real-time monitoring displays and daily, monthly, and annual reports. The conventional atmospheric pollutant data comprise PM2.5, PM10, SO2, CO, O3, and NO2, and the meteorological data comprise humidity, air temperature, wind speed, wind direction, and air pressure. The value of these data is not limited to statistics such as daily, monthly, and annual reports; with the development of research on air pollution prevention and control, prediction of the atmospheric environment has also become important.
Disclosure of Invention
The application provides an atmospheric environment prediction method based on multiple model comparison, aiming to solve the technical problem that existing environmental monitoring and early-warning data lack mining and analysis.
To achieve the above purpose, the embodiments of the present application adopt the following technical solutions:
An atmospheric environment prediction method based on multiple model comparison is provided, comprising the following steps:
acquiring the conventional pollutant concentrations and meteorological data of a target city, and constructing a database of the conventional pollutant concentrations and meteorological data;
preprocessing the conventional pollutant concentrations and meteorological data in the database;
constructing, with the meteorological data as input factors, a multiple linear regression model for predicting the PM10 concentration Y1; the input factors comprise air pressure, humidity, wind speed, wind direction, and air temperature;
adjusting the input factors of the multiple linear regression model and constructing, by stepwise regression, an optimal linear regression model for predicting the PM10 concentration Y2; the adjusted input factors comprise PM2.5, air temperature, O3, wind speed, air pressure, humidity, and season;
training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentrations and meteorological data;
optimizing the thresholds and weights of the BP neural network model with a genetic algorithm to obtain an optimal BP neural network model;
comparing the four models and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error, and the goodness of fit;
wherein the conventional pollutant concentrations comprise PM2.5, PM10, SO2, CO, O3, and NO2, and the meteorological data comprise humidity, air temperature, wind speed, wind direction, and air pressure.
In one possible implementation, the preprocessing includes:
performing a consistency check on the conventional pollutant concentrations and meteorological data;
handling invalid data and missing data by estimation, deletion, filling with a global constant, or random imputation;
rescaling out-of-range data to the [0,1] interval by normalization.
In one possible implementation, the normalization is obtained by the following formula:
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_i'$ is the normalized data, $x_i$ is the raw data, $x_{\max}$ is the maximum of the raw data, and $x_{\min}$ is the minimum of the raw data.
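The normalization step can be illustrated with a minimal Python sketch. It assumes NumPy is available; the function name `min_max_normalize`, the sample values, and the handling of a constant column are illustrative assumptions, not part of the application.

```python
import numpy as np

def min_max_normalize(x):
    """Rescale a 1-D array to [0, 1] using its own minimum and maximum."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = float(x.min()), float(x.max())
    if x_max == x_min:
        # Assumption: a constant series carries no spread, so map it to zeros.
        return np.zeros_like(x), x_min, x_max
    return (x - x_min) / (x_max - x_min), x_min, x_max

# Hypothetical PM10 readings, used only to exercise the function.
pm10_raw = [35.0, 80.0, 125.0, 50.0]
pm10_scaled, pm10_min, pm10_max = min_max_normalize(pm10_raw)
print(pm10_scaled)
```

Returning the minimum and maximum alongside the scaled values keeps what is needed to map model outputs back to real concentrations later.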
In one possible implementation, the multiple linear regression model for predicting the PM10 concentration, constructed with the meteorological data as input factors, is:
$$Y_1 = -3.81X_1 + 2.213X_2 - 55.100X_3 - 0.212X_4 - 1.302X_5 + 398.112$$
where $Y_1$ is the PM10 concentration predicted by the multiple linear regression model; $X_1$ is the air pressure; $X_2$ is the humidity; $X_3$ is the wind speed; $X_4$ is the wind direction; and $X_5$ is the air temperature.
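As a worked illustration, the regression above can be evaluated directly in Python. The coefficients and intercept are copied from the formula; the dictionary keys, the function name `predict_y1`, and the sample input are hypothetical, and the units or scaling of each input factor are not stated in the application.

```python
# Coefficients copied from the Y1 formula; key names are illustrative.
Y1_COEFFS = {
    "air_pressure": -3.81,
    "humidity": 2.213,
    "wind_speed": -55.100,
    "wind_direction": -0.212,
    "air_temperature": -1.302,
}
Y1_INTERCEPT = 398.112

def predict_y1(sample):
    """Evaluate the multiple linear regression model for one observation."""
    return Y1_INTERCEPT + sum(Y1_COEFFS[name] * sample[name] for name in Y1_COEFFS)

# Hypothetical input with every factor set to 1.0, purely to exercise the formula.
unit_sample = {name: 1.0 for name in Y1_COEFFS}
print(predict_y1(unit_sample))
```

With every factor at zero the function returns the intercept, which is a quick way to check that the coefficients were transcribed correctly.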
In one possible implementation, an F-test is performed on the multiple linear regression model;
when the significance of the F-test is greater than or equal to 0.00 and less than 0.01, the multiple linear regression model is considered statistically significant and therefore valid.
In one possible implementation, the optimal linear regression model is:
$$Y_2 = 30.231X'_1 + 19.629X'_2 - 0.312X'_3 + 8.531X'_4 + 0.891X'_5 + 5.121X'_6 + 10.031X'_7 - 90.132$$
where $Y_2$ is the PM10 concentration predicted by the optimal linear regression model; $X'_1$ is PM2.5; $X'_2$ is the air temperature; $X'_3$ is O3; $X'_4$ is the wind speed; $X'_5$ is the air pressure; $X'_6$ is the humidity; and $X'_7$ is the season.
In one possible implementation, constructing the BP neural network model includes:
taking the preprocessed conventional pollutant concentrations and meteorological data as input;
determining a network structure; the network structure comprises the number of network layers, the number of input-layer nodes, the number of output-layer nodes, the activation function, the training method, and the training parameters.
In one possible implementation, the crossover probability in the genetic algorithm decreases as the fitness function increases, and the mutation probability increases as the fitness function increases.
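The application does not give the exact schedule for the adaptive probabilities, so the following Python sketch is only one possible reading: it interpolates linearly between assumed bounds, so that the crossover probability falls and the mutation probability rises as the fitness value increases. The function name, the bounds `pc_range` and `pm_range`, and the linear form are all assumptions.

```python
def adaptive_probabilities(fitness, f_min, f_max,
                           pc_range=(0.1, 0.2), pm_range=(0.05, 0.1)):
    """Return (crossover_probability, mutation_probability) for one individual.

    The bounds and the linear interpolation are illustrative assumptions.
    """
    if f_max == f_min:
        t = 0.0  # All individuals have the same fitness: use the starting values.
    else:
        t = (fitness - f_min) / (f_max - f_min)
    pc = pc_range[1] - t * (pc_range[1] - pc_range[0])  # decreases with fitness
    pm = pm_range[0] + t * (pm_range[1] - pm_range[0])  # increases with fitness
    return pc, pm

pc_low, pm_low = adaptive_probabilities(0.0, 0.0, 10.0)
pc_high, pm_high = adaptive_probabilities(10.0, 0.0, 10.0)
print(pc_low, pm_low, pc_high, pm_high)
```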
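In one possible implementation, the PM10 mean square error MSE, the PM10 mean absolute error MAE, and the goodness of fit $R^2$ are given by the following formulas:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right)^2$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right)^2}{\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{mean}}\right)^2}$$
where $Y_{\mathrm{act}}$ is the actual value, $Y_{\mathrm{pre}}$ is the predicted value, $Y_{\mathrm{mean}}$ is the mean of the actual values, and $n$ is the number of samples.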
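The three evaluation metrics can be computed with a short Python sketch. It assumes NumPy and the standard textbook definitions of MSE, MAE, and the coefficient of determination; the function names and the sample values are illustrative.

```python
import numpy as np

def mse(y_act, y_pre):
    """Mean square error between actual and predicted values."""
    y_act, y_pre = np.asarray(y_act, float), np.asarray(y_pre, float)
    return float(np.mean((y_act - y_pre) ** 2))

def mae(y_act, y_pre):
    """Mean absolute error between actual and predicted values."""
    y_act, y_pre = np.asarray(y_act, float), np.asarray(y_pre, float)
    return float(np.mean(np.abs(y_act - y_pre)))

def r2(y_act, y_pre):
    """Goodness of fit: 1 minus residual sum of squares over total sum of squares."""
    y_act, y_pre = np.asarray(y_act, float), np.asarray(y_pre, float)
    ss_res = np.sum((y_act - y_pre) ** 2)
    ss_tot = np.sum((y_act - y_act.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical values, used only to exercise the functions.
y_actual = [1.0, 2.0, 3.0, 4.0]
y_predicted = [1.0, 2.0, 3.0, 5.0]
print(mse(y_actual, y_predicted), mae(y_actual, y_predicted), r2(y_actual, y_predicted))
```

Lower MSE and MAE and a higher goodness of fit indicate a better model, which is how the four models are ranked against each other.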
The application provides an atmospheric environment prediction method based on multiple model comparison, comprising: acquiring the conventional pollutant concentrations and meteorological data of a target city, and constructing a database of the conventional pollutant concentrations and meteorological data; preprocessing the conventional pollutant concentrations and meteorological data in the database; constructing, with the meteorological data as input factors, a multiple linear regression model for predicting the PM10 concentration Y1, the input factors comprising air pressure, humidity, wind speed, wind direction, and air temperature; adjusting the input factors of the multiple linear regression model and constructing, by stepwise regression, an optimal linear regression model for predicting the PM10 concentration Y2, the adjusted input factors comprising PM2.5, air temperature, O3, wind speed, air pressure, humidity, and season; training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentrations and meteorological data; optimizing the thresholds and weights of the BP neural network model with a genetic algorithm to obtain an optimal BP neural network model; and comparing the four models and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error, and the goodness of fit. Optimal BP neural network parameters are obtained through selection, crossover, and mutation, and prediction with the improved neural network yields accurate results; the method is better suited to medium- and long-term prediction of atmospheric pollutants.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flowchart of an atmospheric environment prediction method based on multiple model comparison according to an embodiment of the present application;
FIG. 2 is a flowchart of constructing the multiple linear regression model in the atmospheric environment prediction method based on multiple model comparison according to an embodiment of the present application;
FIG. 3 is a flowchart of constructing the optimal linear regression model in the atmospheric environment prediction method based on multiple model comparison according to an embodiment of the present application;
FIG. 4 is a flowchart of training the BP neural network model in the atmospheric environment prediction method based on multiple model comparison according to an embodiment of the present application;
FIG. 5 is a flowchart of obtaining the optimal BP neural network model in the atmospheric environment prediction method based on multiple model comparison according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The present application is described in further detail below with reference to the attached drawing figures:
The embodiment of the application provides an atmospheric environment prediction method based on multiple model comparison. As shown in FIG. 1, the method includes the following steps:
S101, acquiring the conventional pollutant concentrations and meteorological data of a target city, and constructing a database of the conventional pollutant concentrations and meteorological data; the conventional pollutant concentrations comprise PM2.5, PM10, SO2, CO, O3, and NO2, and the meteorological data comprise humidity, air temperature, wind speed, wind direction, and air pressure.
S102, preprocessing the conventional pollutant concentrations and meteorological data in the database. The preprocessing comprises: performing a consistency check on the conventional pollutant concentrations and meteorological data; handling invalid data and missing data by estimation, deletion, filling with a global constant, or random imputation; and rescaling out-of-range data to the [0,1] interval by normalization. The normalization is obtained by the following formula:
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_i'$ is the normalized data, $x_i$ is the raw data, $x_{\max}$ is the maximum of the raw data, and $x_{\min}$ is the minimum of the raw data. After normalization, the input and output values fall within the [0,1] interval. Finally, the formula
$$x_i = x_i'\left(x_{\max} - x_{\min}\right) + x_{\min}$$
is used to convert the outputs back to true output values.
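Mapping a normalized network output back to a real concentration is the inverse of min-max scaling. The Python sketch below assumes NumPy and assumes that the minimum and maximum of the original series were stored during preprocessing; the function name and the sample values are illustrative.

```python
import numpy as np

def denormalize(x_scaled, x_min, x_max):
    """Invert min-max scaling: map values in [0, 1] back to the original range."""
    return np.asarray(x_scaled, dtype=float) * (x_max - x_min) + x_min

# Hypothetical scaled outputs and stored PM10 bounds, for illustration only.
restored = denormalize([0.0, 0.5, 1.0], x_min=35.0, x_max=125.0)
print(restored)
```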
S103, constructing, with the meteorological data as input factors, a multiple linear regression model for predicting the PM10 concentration Y1; the input factors comprise air pressure, humidity, wind speed, wind direction, and air temperature. An F-test is performed on the multiple linear regression model; as shown in Table 1, when the significance of the F-test is greater than or equal to 0.00 and less than 0.01, the multiple linear regression model is statistically significant.
TABLE 1
Figure BDA0002887290800000051
Air pressure, air temperature, wind speed, and wind direction have a negative effect on the PM10 concentration Y1, and humidity has a positive effect. The multiple linear regression model for predicting the PM10 concentration, constructed with the meteorological data as input factors, is:
$$Y_1 = -3.81X_1 + 2.213X_2 - 55.100X_3 - 0.212X_4 - 1.302X_5 + 398.112$$
where $Y_1$ is the PM10 concentration predicted by the multiple linear regression model; $X_1$ is the air pressure; $X_2$ is the humidity; $X_3$ is the wind speed; $X_4$ is the wind direction; and $X_5$ is the air temperature. The actual values in the database are compared with the predicted PM10 concentrations; the goodness of fit of the multiple linear regression model is not high.
TABLE 2
Figure BDA0002887290800000052
S104, adjusting the input factors of the multiple linear regression model and constructing, by stepwise regression, an optimal linear regression model for predicting the PM10 concentration Y2; the adjusted input factors comprise PM2.5, air temperature, O3, wind speed, air pressure, humidity, and season, as listed in Table 3. The optimal linear regression model is:
$$Y_2 = 30.231X'_1 + 19.629X'_2 - 0.312X'_3 + 8.531X'_4 + 0.891X'_5 + 5.121X'_6 + 10.031X'_7 - 90.132$$
where $Y_2$ is the PM10 concentration predicted by the optimal linear regression model; $X'_1$ is PM2.5; $X'_2$ is the air temperature; $X'_3$ is O3; $X'_4$ is the wind speed; $X'_5$ is the air pressure; $X'_6$ is the humidity; and $X'_7$ is the season.
TABLE 3
Figure BDA0002887290800000061
The results show that the model that introduces other pollutant concentrations and seasonal factors in addition to meteorological factors has a markedly better goodness of fit than the model that uses meteorological factors only. The goodness of fit of the optimized multiple linear regression model reaches 0.821.
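The application does not state which entry or removal criterion its stepwise procedure uses. The Python sketch below therefore shows only one common variant, forward selection driven by the gain in goodness of fit, on synthetic data. It assumes NumPy; the threshold `min_gain`, the feature names, and the data are all hypothetical.

```python
import numpy as np

def fit_r2(X, y):
    """Fit ordinary least squares with an intercept and return the goodness of fit."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return float(1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2))

def forward_stepwise(X, y, names, min_gain=0.01):
    """Add one factor at a time while the goodness of fit improves by min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score every candidate factor when added to the current selection.
        r2_new, j = max((fit_r2(X[:, selected + [j]], y), j) for j in remaining)
        if r2_new - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2_new
    return [names[j] for j in selected], best_r2

# Synthetic data: the target depends on f0 and f2 only; f1 is irrelevant.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + 0.1 * rng.normal(size=200)
chosen, chosen_r2 = forward_stepwise(X_demo, y_demo, ["f0", "f1", "f2"])
print(chosen, round(chosen_r2, 3))
```

In this toy case the irrelevant factor is left out because adding it barely changes the fit, which mirrors the idea of keeping only factors that improve the model.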
S105, training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentrations and meteorological data. Constructing the BP neural network model comprises: taking the preprocessed conventional pollutant concentrations and meteorological data as input; and determining a network structure, which comprises the number of network layers, the number of input-layer nodes, the number of output-layer nodes, the activation function, the training method, and the training parameters. The model uses a three-layer neural network with one hidden layer to predict the atmospheric environment. The number of input nodes is 11 and the number of output nodes is 1; the hidden layer uses a Sigmoid activation function, and the output layer uses a linear activation function. Before the neural network is trained, parameters such as the initial weights and the learning rate are determined, as shown in Table 4.
TABLE 4
Figure BDA0002887290800000062
When the BP neural network structure is 11-6-1, the training function is traingdx, the number of training iterations is 5000, the training goal is 0.005, the training step length is 25, and the learning rate is 0.01, the network can predict the trend of future atmospheric pollutant concentrations and predict those concentrations relatively accurately. The goodness of fit of the conventional BP neural network is 0.03 higher than that of the improved multiple linear regression model, and overall the BP neural network predicts better than the improved multiple linear regression model. However, the BP neural network has insufficient global search capability, easily falls into local optima, and trains slowly.
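To make the 11-6-1 structure concrete, the following NumPy sketch trains a network with a Sigmoid hidden layer and a linear output layer. It is a simplified illustration under stated assumptions: it uses plain full-batch gradient descent rather than the traingdx function named above (which also adapts the learning rate and uses momentum), and the synthetic data, weight initialization, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in=11, n_hidden=6, n_out=1):
    """Random initial weights and zero thresholds (initialization is an assumption)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(p, X):
    """Sigmoid hidden layer followed by a linear output layer."""
    H = sigmoid(X @ p["W1"] + p["b1"])
    return H, H @ p["W2"] + p["b2"]

def mse_loss(p, X, y):
    _, out = forward(p, X)
    return float(np.mean((out - y.reshape(-1, 1)) ** 2))

def train(p, X, y, lr=0.01, epochs=5000, goal=0.005):
    """Full-batch gradient descent with backpropagation; stops early at the goal."""
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        H, out = forward(p, X)
        err = out - y
        if float(np.mean(err ** 2)) <= goal:
            break
        g_out = 2.0 * err / len(X)                    # gradient at the output layer
        g_hid = (g_out @ p["W2"].T) * H * (1.0 - H)   # backpropagated to hidden layer
        p["W2"] -= lr * (H.T @ g_out)
        p["b2"] -= lr * g_out.sum(axis=0)
        p["W1"] -= lr * (X.T @ g_hid)
        p["b1"] -= lr * g_hid.sum(axis=0)
    return p

# Synthetic, already-normalized data standing in for the 11 input factors.
X_demo = rng.random((200, 11))
y_demo = X_demo.mean(axis=1)
params = init_params()
initial_loss = mse_loss(params, X_demo, y_demo)
train(params, X_demo, y_demo)
final_loss = mse_loss(params, X_demo, y_demo)
print(initial_loss, final_loss)
```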
S106, optimizing the thresholds and weights of the BP neural network model with a genetic algorithm to obtain an optimal BP neural network model. To keep the genetic-algorithm-based BP neural network from being trapped prematurely in a local optimum, the crossover probability and the mutation probability in the genetic algorithm are changed from fixed values to adaptive ones: the crossover probability gradually decreases as the fitness function increases, and the mutation probability gradually increases as the fitness function increases.
In the neural network, the input layer has 11 nodes and the output layer has 1 node, and the prediction performance of the network is best when the number of hidden nodes is 10. From this network structure, the optimized neural network has 120 weights (11×10 + 10×1) and 11 thresholds (10 + 1), so the code length of an individual in the genetic algorithm is 131. The population size in the genetic algorithm is set to 20, the number of generations to 50, the crossover probability to 0.2, and the mutation probability to 0.1, and the absolute value of the error between the actual and expected network outputs is used as the fitness value of an individual, as shown in Table 6.
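The code length of 131 follows directly from the layer sizes, as this small Python check shows. The function name is illustrative; the arithmetic counts one weight per connection and one threshold per hidden or output node.

```python
def chromosome_length(n_in, n_hidden, n_out):
    """Count the weights and thresholds of a single-hidden-layer network."""
    n_weights = n_in * n_hidden + n_hidden * n_out
    n_thresholds = n_hidden + n_out
    return n_weights, n_thresholds, n_weights + n_thresholds

weights, thresholds, code_length = chromosome_length(11, 10, 1)
print(weights, thresholds, code_length)
```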
TABLE 6
Figure BDA0002887290800000071
The process is as follows: randomly initialize a population; calculate the population fitness and find the best individual; perform the selection, crossover, and mutation operations; check whether evolution has finished and, if not, return to the second step; finally, assign the best individual found to the BP neural network and use the network for prediction. The goodness of fit of the resulting optimal BP neural network model is 0.886.
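The loop described above can be sketched in Python with NumPy. The population size (20), number of generations (50), crossover probability (0.2), mutation probability (0.1), and code length (131) come from the description; everything else is an assumption made for illustration, including roulette-wheel selection, single-point crossover, uniform mutation in [-1, 1], elitism, fixed rather than adaptive probabilities, and the synthetic data.

```python
import numpy as np

rng = np.random.default_rng(1)
N_IN, N_HID, N_OUT = 11, 10, 1
CODE_LEN = N_IN * N_HID + N_HID * N_OUT + N_HID + N_OUT  # 120 weights + 11 thresholds

def decode(ch):
    """Split a flat chromosome into network weights and thresholds."""
    i = N_IN * N_HID
    j = i + N_HID * N_OUT
    k = j + N_HID
    return ch[:i].reshape(N_IN, N_HID), ch[j:k], ch[i:j].reshape(N_HID, N_OUT), ch[k:]

def error(ch, X, y):
    """Fitness value: summed absolute error between network output and target."""
    W1, b1, W2, b2 = decode(ch)
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return float(np.sum(np.abs((H @ W2 + b2).ravel() - y)))

def evolve(X, y, pop_size=20, generations=50, pc=0.2, pm=0.1):
    pop = rng.uniform(-1.0, 1.0, (pop_size, CODE_LEN))  # random initial population
    best, best_err = None, np.inf
    for _ in range(generations):
        errs = np.array([error(ch, X, y) for ch in pop])
        k = int(errs.argmin())
        if errs[k] < best_err:                           # track the best individual
            best, best_err = pop[k].copy(), float(errs[k])
        w = 1.0 / (errs + 1e-12)                         # selection: lower error, higher chance
        pop = pop[rng.choice(pop_size, size=pop_size, p=w / w.sum())].copy()
        for a in range(0, pop_size - 1, 2):              # single-point crossover on pairs
            if rng.random() < pc:
                cut = int(rng.integers(1, CODE_LEN))
                tail = pop[a, cut:].copy()
                pop[a, cut:] = pop[a + 1, cut:]
                pop[a + 1, cut:] = tail
        mask = rng.random(pop.shape) < pm                # mutation: resample chosen genes
        pop[mask] = rng.uniform(-1.0, 1.0, int(mask.sum()))
        pop[0] = best                                    # elitism (an assumption)
    return best, best_err

# Synthetic, already-normalized data standing in for the real inputs.
X_demo = rng.random((50, 11))
y_demo = X_demo.mean(axis=1)
best_ch, best_error = evolve(X_demo, y_demo)
print(CODE_LEN, best_error)
```

In the method described, the decoded best individual would then serve as the initial weights and thresholds of the BP neural network before ordinary training and prediction.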
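S107, comparing the four models and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error, and the goodness of fit. The PM10 mean square error MSE, the PM10 mean absolute error MAE, and the goodness of fit $R^2$ are given by the following formulas:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right)^2$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{pre},i}\right)^2}{\sum_{i=1}^{n}\left(Y_{\mathrm{act},i} - Y_{\mathrm{mean}}\right)^2}$$
where $Y_{\mathrm{act}}$ is the actual value, $Y_{\mathrm{pre}}$ is the predicted value, $Y_{\mathrm{mean}}$ is the mean of the actual values, and $n$ is the number of samples. The prediction performance of the four models is compared and analyzed in Table 7.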
TABLE 7
Figure BDA0002887290800000075
Of the four prediction models, the optimal linear regression model, the BP neural network, and the optimal BP neural network model can predict future atmospheric pollutant concentrations relatively accurately, whereas the multiple linear regression model can only roughly predict the general trend and cannot accurately predict future atmospheric pollutant concentrations. The optimal BP neural network model is used for medium- and long-term prediction and can accurately predict trend turning points within the medium- and long-term range. As Table 7 shows, the optimal BP neural network model gives the best prediction performance. Its PM10 mean square error, mean absolute error, and goodness of fit improve on those of the BP neural network by 0.062, 3.42, and 0.031, respectively; on those of the optimal linear regression model by 0.065, 6.8, and 0.042; and on those of the multiple linear regression model by 0.376, 24.91, and 0.114.
The application provides an atmospheric environment prediction method based on multiple model comparison, comprising: acquiring the conventional pollutant concentrations and meteorological data of a target city, and constructing a database of the conventional pollutant concentrations and meteorological data; preprocessing the conventional pollutant concentrations and meteorological data in the database; constructing, with the meteorological data as input factors, a multiple linear regression model for predicting the PM10 concentration Y1, the input factors comprising air pressure, humidity, wind speed, wind direction, and air temperature; adjusting the input factors of the multiple linear regression model and constructing, by stepwise regression, an optimal linear regression model for predicting the PM10 concentration Y2, the adjusted input factors comprising PM2.5, air temperature, O3, wind speed, air pressure, humidity, and season; training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentrations and meteorological data; optimizing the thresholds and weights of the BP neural network model with a genetic algorithm to obtain an optimal BP neural network model; and comparing the four models and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error, and the goodness of fit. Optimal BP neural network parameters are obtained through selection, crossover, and mutation, and prediction with the improved neural network yields accurate results; the method is better suited to medium- and long-term prediction of atmospheric pollutants.
The above-mentioned contents are only for explaining the technical idea of the present application, and the protection scope of the present application is not limited thereby, and any modification made on the basis of the technical idea presented in the present application falls within the protection scope of the claims of the present application.
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments have been discussed in the foregoing disclosure by way of example, it should be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having less than all of the features of a single embodiment disclosed above.
The entire contents of each patent, patent application, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, and documents, are hereby incorporated by reference into this application, except for any application history documents that are inconsistent with or in conflict with the content of this application, and except for any documents (currently or later attached to this application) that limit the broadest scope of the claims of this application. If the descriptions, definitions, and/or use of terms in the material attached to this application are inconsistent with or contrary to those in this application, the descriptions, definitions, and/or use of terms in this application shall control.

Claims (9)

1. An atmospheric environment prediction method based on multiple model comparisons is characterized by comprising the following steps:
acquiring the conventional pollutant concentration and meteorological data of a target city, and constructing a database of the conventional pollutant concentration and meteorological data;
preprocessing the conventional pollutant concentration and meteorological data in the database;
constructing a prediction PM for an input factor based on the meteorological data10Concentration Y1The multivariate linear regression model of (1); the input factors comprise air pressure, humidity, wind speed, wind direction and air temperature;
adjusting the input factors of the multiple linear regression model, and constructing a predicted PM through a stepwise recursion mode10Concentration Y2The optimal linear regression model of (1); the adjusted input factor comprises PM2.5Temperature, O3Wind speed, air pressure, humidity, season;
training a BP neural network model according to a network structure and the preprocessed conventional pollutant concentration and meteorological data;
optimizing the threshold and the weight of the BP neural network model based on a genetic algorithm to obtain an optimal BP neural network model;
comparing the four models, and determining the finally selected model by the PM10 mean square error, the PM10 mean absolute error and the goodness of fit;
wherein the conventional pollutant concentration comprises PM2.5, PM10, SO2, CO, O3 and NO2; and the meteorological data comprises humidity, air temperature, wind speed, wind direction and air pressure.
2. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, characterized in that the preprocessing comprises:
carrying out consistency check on the conventional pollutant concentration and meteorological data;
handling invalid data and missing data by estimation, deletion, global variable filling or random imputation;
for out-of-range data, normalizing the data to the [0,1] interval.
3. The atmospheric environment prediction method based on multiple model comparisons according to claim 2, characterized in that the normalization is obtained by the following formula:
$$x_i^{*} = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
in the formula, $x_i^{*}$ is the normalized data, $x_i$ is the raw data, $x_{\max}$ is the maximum value of the raw data, and $x_{\min}$ is the minimum value of the raw data.
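The following is a minimal sketch of the normalization in claim 3, assuming it is ordinary min-max scaling, as the symbols for the maximum and minimum of the raw data and the [0,1] target interval suggest. The function name, the handling of a constant series and the sample readings are assumptions and do not come from the application.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the [0, 1] interval by min-max scaling."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # All values are identical, so the formula is undefined.
        # Returning zeros is an assumption made for this sketch.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical PM10 readings, not taken from the application.
pm10 = [35.0, 80.0, 125.0, 50.0]
normalized = min_max_normalize(pm10)
```

The smallest reading maps to 0 and the largest to 1; every other reading lands proportionally in between.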
4. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, wherein the multiple linear regression model that predicts the PM10 concentration with the meteorological data as input factors is:
$$Y_1 = -3.81X_1 + 2.213X_2 - 55.100X_3 - 0.212X_4 - 1.302X_5 + 398.112$$
in the formula, $Y_1$ is the PM10 concentration predicted by the multiple linear regression model; $X_1$ is the air pressure; $X_2$ is the humidity; $X_3$ is the wind speed; $X_4$ is the wind direction; $X_5$ is the air temperature.
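As an illustration only, the regression of claim 4 can be evaluated as sketched below. The coefficients and intercept are those recited in the claim; the function name and argument order are assumptions, and the claim does not state the units of the inputs or whether they are raw or normalized values.

```python
# Coefficients of X1..X5 and the intercept, as recited in claim 4.
COEFFS = (-3.81, 2.213, -55.100, -0.212, -1.302)
INTERCEPT = 398.112


def predict_pm10_mlr(pressure, humidity, wind_speed, wind_direction, temperature):
    """Evaluate Y1 for one observation (hypothetical helper, units unspecified)."""
    inputs = (pressure, humidity, wind_speed, wind_direction, temperature)
    return sum(c * x for c, x in zip(COEFFS, inputs)) + INTERCEPT


# With every input at zero, the prediction equals the intercept.
baseline = predict_pm10_mlr(0.0, 0.0, 0.0, 0.0, 0.0)
```

Because the wind speed coefficient is by far the largest in magnitude, a unit change in wind speed moves the prediction more than a unit change in any other input, although comparing coefficients this way is only meaningful if the inputs share a common scale.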
5. The atmospheric environment prediction method based on multiple model comparisons according to claim 4, characterized in that the multiple linear regression model is subjected to an F test;
when the significance level (p-value) of the F test is greater than or equal to 0.00 and less than 0.01, the multiple linear regression model is considered valid.
6. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, characterized in that the optimal linear regression model is:
$$Y_2 = 30.231X'_1 + 19.629X'_2 - 0.312X'_3 + 8.531X'_4 + 0.891X'_5 + 5.121X'_6 + 10.031X'_7 - 90.132$$
in the formula, $Y_2$ is the PM10 concentration predicted by the optimal linear regression model; $X'_1$ is PM2.5; $X'_2$ is the air temperature; $X'_3$ is O3; $X'_4$ is the wind speed; $X'_5$ is the air pressure; $X'_6$ is the humidity; $X'_7$ is the season.
7. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, wherein constructing the BP neural network model comprises:
taking the preprocessed conventional pollutant concentration and meteorological data as input;
determining a network structure; the network structure comprises the number of network layers, the number of input layer nodes, the number of output layer nodes, an activation function, a training method and training parameters.
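The sketch below shows one possible reading of the BP neural network of claim 7: a single sigmoid hidden layer, one linear output node, and training by backpropagation with per-sample gradient descent. The layer sizes, learning rate, number of epochs, class name and sample data are all assumptions; the claim lists the structural choices to be made but does not fix their values.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNetwork:
    """One sigmoid hidden layer and one linear output node (illustrative only)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Weights and thresholds (biases) start as small random values.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = rng.uniform(-0.5, 0.5)

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w_hidden, self.b_hidden)]
        output = sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out
        return hidden, output

    def train_step(self, x, target, lr):
        hidden, output = self.forward(x)
        error = output - target  # gradient of 0.5 * (output - target) ** 2
        for j, h in enumerate(hidden):
            # Propagate the error back through the sigmoid before w_out[j] changes.
            delta_h = error * self.w_out[j] * h * (1.0 - h)
            self.w_out[j] -= lr * error * h
            for i, xi in enumerate(x):
                self.w_hidden[j][i] -= lr * delta_h * xi
            self.b_hidden[j] -= lr * delta_h
        self.b_out -= lr * error

    def mse(self, samples):
        return sum((self.forward(x)[1] - y) ** 2 for x, y in samples) / len(samples)


# Hypothetical normalized samples (inputs, target), not from the application.
samples = [([0.0, 0.2], 0.1), ([0.5, 0.4], 0.45),
           ([1.0, 0.9], 0.95), ([0.3, 0.8], 0.55)]
net = TinyBPNetwork(n_inputs=2, n_hidden=3)
mse_before = net.mse(samples)
for _ in range(2000):
    for x, y in samples:
        net.train_step(x, y, lr=0.1)
mse_after = net.mse(samples)
```

In the method of claim 1, the initial weights and thresholds that are drawn at random here would instead be supplied by the genetic algorithm.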
8. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, wherein, in the genetic algorithm, the crossover probability decreases as the fitness function increases, and the mutation probability increases as the fitness function increases.
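A minimal sketch of the adaptive probabilities in claim 8 follows. The claim fixes only the direction of change; the linear interpolation, the probability bounds and the function name used here are assumptions chosen for illustration.

```python
def adaptive_probabilities(fitness, f_min, f_max,
                           pc_range=(0.6, 0.9), pm_range=(0.01, 0.1)):
    """Return (crossover_prob, mutation_prob) for one individual.

    Linear interpolation between the bounds is an assumption of this sketch.
    """
    if f_max == f_min:
        t = 0.0  # degenerate population: treat every individual as least fit
    else:
        t = (fitness - f_min) / (f_max - f_min)  # 0 = least fit, 1 = fittest
    pc_low, pc_high = pc_range
    pm_low, pm_high = pm_range
    crossover_prob = pc_high - t * (pc_high - pc_low)  # falls as fitness rises
    mutation_prob = pm_low + t * (pm_high - pm_low)    # rises as fitness rises
    return crossover_prob, mutation_prob


# Hypothetical fitness values for the least fit and the fittest individual.
least_fit = adaptive_probabilities(1.0, f_min=1.0, f_max=5.0)
fittest = adaptive_probabilities(5.0, f_min=1.0, f_max=5.0)
```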
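9. The atmospheric environment prediction method based on multiple model comparisons according to claim 1, wherein the PM10 mean square error MSE, the PM10 mean absolute error MAE and the goodness of fit $R^2$ are obtained by the following formulas:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{act,i} - Y_{pre,i}\right)^2$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{act,i} - Y_{pre,i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_{act,i} - Y_{pre,i}\right)^2}{\sum_{i=1}^{n}\left(Y_{act,i} - Y_{mean}\right)^2}$$
in the formula, $Y_{act}$ is the actual value, $Y_{pre}$ is the predicted value, $Y_{mean}$ is the average of the actual values, and $n$ is the number of samples.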
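The comparison step can be sketched as follows, assuming the conventional definitions of MSE, MAE and the coefficient of determination. The selection rule (lowest MSE, ties broken by lower MAE and then higher goodness of fit), the model names and all numbers are hypothetical; the claims name the three metrics but do not say how they are combined.

```python
def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def select_best_model(actual, predictions_by_model):
    """Pick the model with the lowest MSE, then lowest MAE, then highest R^2.

    This ranking rule is an assumption made for the sketch.
    """
    def ranking_key(name):
        pred = predictions_by_model[name]
        return (mse(actual, pred), mae(actual, pred), -r_squared(actual, pred))
    return min(predictions_by_model, key=ranking_key)


# Hypothetical PM10 observations and predictions from the four models.
actual = [50.0, 60.0, 70.0, 80.0]
predictions = {
    "multiple_linear_regression": [55.0, 58.0, 75.0, 70.0],
    "optimal_linear_regression": [52.0, 61.0, 69.0, 77.0],
    "bp_network": [49.0, 63.0, 72.0, 79.0],
    "ga_bp_network": [50.0, 61.0, 70.0, 79.0],
}
best = select_best_model(actual, predictions)
```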
CN202110017749.7A 2021-01-07 2021-01-07 Atmospheric environment prediction method based on multiple model comparison Withdrawn CN112732691A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110017749.7A CN112732691A (en) 2021-01-07 2021-01-07 Atmospheric environment prediction method based on multiple model comparison

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110017749.7A CN112732691A (en) 2021-01-07 2021-01-07 Atmospheric environment prediction method based on multiple model comparison

Publications (1)

Publication Number Publication Date
CN112732691A (en) 2021-04-30

Family

ID=75591053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110017749.7A Withdrawn CN112732691A (en) 2021-01-07 2021-01-07 Atmospheric environment prediction method based on multiple model comparison

Country Status (1)

Country Link
CN (1) CN112732691A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537515A (en) * 2021-07-27 2021-10-22 江苏蓝创智能科技股份有限公司 PM2.5 prediction method, system, device and storage medium
CN113688506A (en) * 2021-07-29 2021-11-23 北京化工大学 Potential atmospheric pollution source identification method based on multidimensional data such as micro-station
CN113688506B (en) * 2021-07-29 2024-04-12 北京首创大气环境科技股份有限公司 Potential atmospheric pollution source identification method based on multi-dimensional data such as micro-station and the like
CN113804595A (en) * 2021-09-15 2021-12-17 汉威科技集团股份有限公司 Multi-parameter air quality monitoring system
CN113804595B (en) * 2021-09-15 2024-04-05 汉威科技集团股份有限公司 Multi-parameter air quality monitoring system
CN114814092A (en) * 2022-04-12 2022-07-29 上海应用技术大学 IP index measuring method based on BP neural network
CN114563834A (en) * 2022-04-27 2022-05-31 知一航宇(北京)科技有限公司 Numerical forecast product interpretation application method and system
CN114563834B (en) * 2022-04-27 2022-07-26 知一航宇(北京)科技有限公司 Numerical forecast product interpretation application method and system
CN115508511A (en) * 2022-09-19 2022-12-23 中节能天融科技有限公司 Sensor self-adaptive calibration method based on gridding equipment full-parameter feature analysis
CN116976146A (en) * 2023-09-22 2023-10-31 中国石油大学(华东) Fracturing well yield prediction method and system coupled with physical driving and data driving
CN116976146B (en) * 2023-09-22 2024-01-05 中国石油大学(华东) Fracturing well yield prediction method and system coupled with physical driving and data driving

Similar Documents

Publication Publication Date Title
CN112732691A (en) Atmospheric environment prediction method based on multiple model comparison
CN110363347B (en) Method for predicting air quality based on neural network of decision tree index
CN112529240B (en) Atmospheric environment data prediction method, system, device and storage medium
CN111144286A (en) Urban PM2.5 concentration prediction method fusing EMD and LSTM
CN106650767B (en) Flood forecasting method based on cluster analysis and real-time correction
CN112465243B (en) Air quality forecasting method and system
CN111832222B (en) Pollutant concentration prediction model training method, pollutant concentration prediction method and pollutant concentration prediction device
CN111626518A (en) Urban daily water demand online prediction method based on deep learning neural network
CN111898820B (en) PM 2.5-hour concentration combination prediction method and system based on trend clustering and integrated tree
CN111489015A (en) Atmosphere O based on multiple model comparison and optimization3Concentration prediction method
CN115860286B (en) Air quality prediction method and system based on time sequence gate mechanism
CN117977568A (en) Power load prediction method based on nested LSTM and quantile calculation
CN113836808A (en) PM2.5 deep learning prediction method based on heavy pollution feature constraint
CN113537515A (en) PM2.5 prediction method, system, device and storage medium
CN115237896A (en) Data preprocessing method and system for forecasting air quality based on deep learning
CN113985496B (en) Storm surge intelligent forecasting method based on LSTM-GM neural network model
CN116013426A (en) Site ozone concentration prediction method with high space-time resolution
CN113052353B (en) Air quality prediction and prediction model training method and device and storage medium
CN117275238A (en) Short-time traffic flow prediction method for dynamic graph structure attention mechanism
CN117370813A (en) Atmospheric pollution deep learning prediction method based on K line pattern matching algorithm
CN116565850A (en) Wind power ultra-short-term prediction method based on QR-BLSTM
CN116626780A (en) Weather forecast post-processing method and system for ground weather station set forecast
Bárdossy et al. Circulation patterns identified by spatial rainfall and ocean wave fields in Southern Africa
CN114254828B (en) Power load prediction method based on mixed convolution feature extractor and GRU
CN113850233A (en) Element-based thunderstorm storm identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210430