CN115860214A - Early warning method and device for PM2.5 emission concentration - Google Patents

Early warning method and device for PM2.5 emission concentration


Publication number
CN115860214A
Authority
CN
China
Prior art keywords
data
emission concentration
set data
training
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211517121.4A
Other languages
Chinese (zh)
Inventor
董晓峣
张磊
朱宴恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ceristar Electric Co ltd
Beijing Jingcheng Jiayu Environment Technology Co ltd
MCC Capital Engineering and Research Incorporation Ltd
Original Assignee
Ceristar Electric Co ltd
Beijing Jingcheng Jiayu Environment Technology Co ltd
MCC Capital Engineering and Research Incorporation Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ceristar Electric Co ltd, Beijing Jingcheng Jiayu Environment Technology Co ltd, MCC Capital Engineering and Research Incorporation Ltd
Priority to CN202211517121.4A
Publication of CN115860214A

Abstract

The invention provides an early warning method and device for PM2.5 emission concentration. The method comprises: acquiring time-by-time historical basic data from an air quality automatic monitoring standard station and dividing it into training set data and test set data; constructing a PM2.5 emission concentration prediction model using the LightGBM algorithm and the training set data; verifying the model with the test set data; uploading the verified prediction model to an intelligent environment-friendly control platform and displaying the PM2.5 emission concentration prediction result in real time according to real-time air quality data; and issuing an early warning for areas whose PM2.5 emission concentration exceeds the standard according to the prediction result. The method enables real-time prediction of the pollutant emission concentration in the key areas of concern of an iron and steel enterprise; the model built with the LightGBM algorithm produces predictions in real time and, combined with the trend of the PM2.5 emission concentration, supports a prejudgment of whether the future PM2.5 emission concentration in a key area of concern risks exceeding the standard.

Description

Early warning method and device for PM2.5 emission concentration
Technical Field
The invention relates to the technical field of air pollution prevention and control, in particular to a PM2.5 emission concentration early warning method and device.
Background
Fine particulate matter (PM2.5) in the atmosphere has a very important influence on air quality, visibility and human health, and is also an important factor in causing dust-haze weather. With the gradual improvement of public environmental awareness, the emission of fine particulate matter (PM2.5) into ambient air has attracted extensive attention, and ambient air quality has become a major focus of public concern.
As a high-pollution industry, the steel sector has in recent years been pushed strongly toward ultra-low emission upgrades against a background of increasingly strict environmental requirements. Notably, iron and steel enterprises in key control areas have broadly begun building intelligent environment-friendly management and control platforms for ultra-low emission, and building such platform systems has become a green development trend of the iron and steel industry. An enterprise's unorganized (fugitive) PM2.5 emission is closely related to the ambient air quality in the plant. To treat PM2.5 pollutant emission effectively and to provide a basis for prejudging whether the future PM2.5 emission concentration in an enterprise's key areas of concern will exceed the standard, predicting the PM2.5 emission concentration in advance is particularly important.
Prediction methods for the PM2.5 concentration in the atmosphere have achieved many results, but they generally target large areas and combine open-source data from meteorological departments with monitoring station data, so the stability, timeliness and pertinence of the data source are relatively weak. The few steel enterprises that currently predict unorganized PM2.5 emission concentration use simple regression models and similar methods; the accuracy of the prediction results is low and they have little reference value, so enterprises cannot effectively predict the PM2.5 emission concentration.
For example, the invention patent application with publication number CN112418560A provides a PM2.5 concentration prediction method and system, in which a historical feature data set composed of the average PM2.5 concentration, average PM10 concentration, average SO2 concentration, average NO2 concentration, average CO concentration, average O3 concentration, average weather conditions and the like is divided according to a set time interval to obtain multiple groups of historical feature data sets. These are input into a long short-term memory model, and the model is optimized from a multi-dimensional single-input sequence structure to a three-dimensional multi-input sequence structure so as to improve the speed and accuracy of PM2.5 concentration prediction.
However, that scheme predicts the PM2.5 concentration with a long short-term memory network, a model suited to time series with long intervals and delays; it is poorly suited to the continuous time series that a monitoring station records in real time every minute or every hour. Moreover, that scheme uses average values of the PM2.5, PM10, SO2, NO2, CO and O3 concentrations as the training set, which gives poor timeliness and poor applicability for predicting the PM2.5 concentration at the next moment in real time.
In addition, the invention patent application with the publication number of CN108805253A discloses a PM2.5 concentration prediction method, PM2.5 is predicted by a method of optimizing the weight and the threshold of a three-layer BP neural network by adopting a gray wolf optimization algorithm, and the concentration value of PM2.5 on the next day is predicted according to the concentration value of PM2.5 on the previous day, meteorological data and pollutant concentration data. A given training starting point is optimized through an algorithm, and the network is prevented from being trapped in local optimization, so that the PM2.5 concentration prediction accuracy is improved.
However, that scheme predicts the PM2.5 concentration with a neural network algorithm, which is better suited to large samples. For the small area and small sample size typical of an iron and steel enterprise, a neural network algorithm incurs a higher computational cost and occupies more memory.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for warning PM2.5 emission concentration, so as to solve at least one of the above-mentioned problems.
In order to achieve the purpose, the invention adopts the following scheme:
According to a first aspect of the present invention, there is provided an early warning method for PM2.5 emission concentration, the method comprising: acquiring time-by-time historical basic data from an air quality automatic monitoring standard station; dividing the historical basic data into training set data and test set data; constructing a PM2.5 emission concentration prediction model using a LightGBM machine ensemble learning algorithm and the training set data; verifying the PM2.5 emission concentration prediction model with the test set data; uploading the verified PM2.5 emission concentration prediction model to an intelligent environment-friendly control platform, and displaying a PM2.5 emission concentration prediction result in real time according to real-time data of the air quality automatic monitoring standard station; and issuing an early warning for areas whose PM2.5 emission concentration exceeds the standard according to the prediction result.
Preferably, in the early warning method of the embodiment of the present invention, the historical basic data includes time-by-time concentration data of PM2.5, PM10, SO2, NO2, CO and O3 and meteorological data, wherein the concentration data of PM10, SO2, NO2, CO and O3 and the meteorological data serve as feature variables of the LightGBM machine ensemble learning algorithm, and the concentration data of PM2.5 serves as the target variable.
Preferably, after the time-by-time historical basic data of the air quality automatic monitoring standard station is acquired, the method further comprises: eliminating samples with incomplete data and samples with singular values from the historical basic data.
Preferably, when constructing the PM2.5 emission concentration prediction model with the LightGBM machine ensemble learning algorithm and the training set data, the early warning method of the embodiment of the present invention further includes: selecting the boosting model type, and selecting two loss functions, the mean absolute error MAE and the mean squared error MSE, as evaluation indexes of the prediction model, wherein:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$
in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples; in each training run, whether the PM2.5 emission concentration prediction model is usable is judged from the MAE and MSE results.
Preferably, dividing the historical basic data into training set data and test set data in the early warning method of the embodiment of the present invention includes: randomly dividing the historical basic data into 10 subsamples, of which 9 serve as training set data and the remaining 1 serves as test set data. When the PM2.5 emission concentration prediction model is trained and verified with the training set data and test set data, the model is trained on 9 subsamples and then verified on the remaining subsample, and each subsample serves once as the test set data, so that ten rounds of training and verification are completed; the 10 prediction results are averaged to obtain the final predicted value.
Preferably, the early warning method of the embodiment of the present invention further includes: further checking the final predicted value by separately calculating the root mean square error RMSE and the goodness of fit $R^2$, wherein:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y} - y_i\right)^2}$$
in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ denotes the number of samples.
Preferably, the early warning of the area with the excessive PM2.5 emission concentration according to the prediction result in the early warning method of the embodiment of the present invention includes: and judging whether the prediction result exceeds an alarm threshold value, if so, acquiring the position information of the prediction area exceeding the alarm threshold value, and early warning the area corresponding to the position information.
According to a second aspect of the present invention, there is provided an early warning apparatus for PM2.5 emission concentration, the apparatus comprising: a basic data acquisition unit, configured to acquire time-by-time historical basic data from an air quality automatic monitoring standard station; a data dividing unit, configured to divide the historical basic data into training set data and test set data; a training unit, configured to construct a PM2.5 emission concentration prediction model using a LightGBM machine ensemble learning algorithm and the training set data; a verification unit, configured to verify the PM2.5 emission concentration prediction model with the test set data; a prediction unit, configured to display the PM2.5 emission concentration prediction result in real time using the PM2.5 emission concentration prediction model and real-time data of the air quality automatic monitoring standard station; and an early warning unit, configured to issue an early warning for areas whose PM2.5 emission concentration exceeds the standard according to the prediction result.
Preferably, the historical basic data in the early warning apparatus of the embodiment of the present invention includes time-by-time concentration data of PM2.5, PM10, SO2, NO2, CO and O3 and meteorological data, wherein the concentration data of PM10, SO2, NO2, CO and O3 and the meteorological data serve as feature variables of the LightGBM machine ensemble learning algorithm, and the concentration data of PM2.5 serves as the target variable.
Preferably, the early warning apparatus of the embodiment of the present invention further includes a data preprocessing unit, configured to eliminate samples with incomplete data and samples with singular values from the historical basic data after the time-by-time historical basic data of the air quality automatic monitoring standard station is acquired.
Preferably, the early warning apparatus of the embodiment of the present invention further includes: an evaluation index selection unit, configured to select the boosting model type when the PM2.5 emission concentration prediction model is constructed with the LightGBM machine ensemble learning algorithm and the training set data, and to select two loss functions, the mean absolute error MAE and the mean squared error MSE, as evaluation indexes of the prediction model, where:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$
in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples;
and a model evaluation unit, configured to judge, in each training run, whether the PM2.5 emission concentration prediction model is usable from the MAE and MSE results.
Preferably, the data dividing unit in the early warning apparatus of the embodiment of the present invention is specifically configured to: randomly divide the historical basic data into 10 subsamples, of which 9 serve as training set data and the remaining 1 serves as test set data. When training and verifying the PM2.5 emission concentration prediction model with the training set data and test set data, the training unit and the verification unit train the model on 9 subsamples and then verify it on the remaining subsample, and each subsample serves once as the test set data, so that ten rounds of training and verification are completed; the 10 prediction results are averaged to obtain the final predicted value.
Preferably, the early warning apparatus of the embodiment of the present invention further includes: a secondary verification unit, configured to further check the final predicted value by separately calculating the root mean square error RMSE and the goodness of fit $R^2$, wherein:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y} - y_i\right)^2}$$
in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ denotes the number of samples.
Preferably, the early warning unit in the early warning apparatus according to the embodiment of the present invention is specifically configured to: and judging whether the prediction result exceeds an alarm threshold, if so, acquiring the position information of the prediction area exceeding the alarm threshold, and early warning the area corresponding to the position information.
According to a third aspect of the invention, there is provided an electronic device comprising a memory, a processor and a computer program stored on said memory and executable on said processor, the processor implementing the steps of the above method when executing said computer program.
According to a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
According to a fifth aspect of the invention, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the above method.
The technical solution above achieves the following effects: using data from the air quality automatic monitoring standard station and a model built with the LightGBM machine ensemble learning algorithm, the pollutant emission concentration in the key areas of concern of an iron and steel enterprise can be predicted in real time; the model produces predictions in real time and, combined with the trend of the PM2.5 emission concentration, supports a prejudgment of whether the future PM2.5 emission concentration in a key area of concern risks exceeding the standard; the PM2.5 emission concentration prediction result is transmitted promptly to the intelligent environment-friendly control platform, which activates a response plan in time according to the predicted PM2.5 emission situation; and the production rhythm in the key area of concern can be adjusted in advance, effectively reducing or avoiding excessive PM2.5 emission and poor air quality in a local area.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts. In the drawings:
fig. 1 is a schematic flowchart of an early warning method for PM2.5 emission concentration according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of a method for warning of PM2.5 emission concentration according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of an early warning apparatus for PM2.5 emission concentration according to an embodiment of the present application;
fig. 4 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
Fig. 1 is a schematic flow chart of an early warning method for PM2.5 emission concentration according to an embodiment of the present application, where the method includes the following steps:
Step S101: acquiring time-by-time historical basic data from the air quality automatic monitoring standard station.
In this embodiment, basic data in the form of a continuous time series is acquired from the air quality automatic monitoring standard station. Time-by-time statistics in this embodiment may be minute-by-minute or hour-by-hour statistics, so the statistical interval is short. Compared with acquiring open-source data from the national meteorological center, this data source is relatively stable and represents the pollution and meteorological characteristics of a specific area, so the stability, timeliness and pertinence of the data are good.
Preferably, the historical basic data in this embodiment may include concentration data of PM2.5, PM10, SO2, NO2, CO and O3 and meteorological data, wherein the concentration data of PM10, SO2, NO2, CO and O3 and the meteorological data serve as feature variables of the LightGBM machine ensemble learning algorithm, and the concentration data of PM2.5 serves as the target variable.
Step S102: the historical base data is partitioned into training set data and test set data.
The historical basic data acquired in step S101 may be uploaded and stored in a database, and the data in the database may be updated in real time according to the data acquired in step S101.
Step S103: constructing a PM2.5 emission concentration prediction model using the LightGBM machine ensemble learning algorithm and the training set data.
The embodiment of the invention proposes using the LightGBM machine ensemble learning algorithm to predict the PM2.5 emission concentration. For small-sample data that does not involve high-dimensional data such as images or speech, the LightGBM machine ensemble learning algorithm is more effective than a neural network algorithm; it also offers fast training, support for efficient parallel training, low memory use, high accuracy, and support for fast distributed data processing.
The LightGBM machine ensemble learning algorithm is based on a histogram decision tree algorithm. The continuous floating-point feature variables that are read in (PM10, SO2, NO2, CO, O3 and meteorological data) are discretized into integers: the value range of each feature variable is divided into a number of intervals, one per histogram bin, and each sample value is replaced by the index of the bin it falls into, forming a histogram. In one pass over the data the histogram accumulates the required statistics, and the optimal split point is then searched over the discrete bins. Histogram subtraction (difference acceleration) is used in this process, so not every discretized feature value has to be traversed, which speeds up data processing and improves computational efficiency.
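The binning and histogram-subtraction idea described above can be illustrated with a minimal pure-Python sketch. It is not LightGBM's implementation: the equal-width bin edges, the function names, and the sample PM10 readings and gradients are all assumptions made for illustration.

```python
from bisect import bisect_right

def make_bin_edges(values, n_bins):
    """Equal-width interior bin edges over the observed range (illustrative choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]

def to_bin_index(value, edges):
    """Map a floating-point value to an integer bin index in [0, len(edges)]."""
    return bisect_right(edges, value)

def build_histogram(bin_indices, gradients, n_bins):
    """Accumulate per-bin gradient sums and sample counts in one pass."""
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for b, g in zip(bin_indices, gradients):
        grad_sum[b] += g
        count[b] += 1
    return grad_sum, count

def subtract_histogram(parent, child):
    """Histogram subtraction: sibling = parent - child, bin by bin."""
    return [p - c for p, c in zip(parent, child)]

# Hypothetical PM10 readings and per-sample gradients (not from the patent).
pm10 = [12.0, 35.5, 80.2, 41.0, 150.0, 96.3]
grads = [0.5, -0.2, 1.1, 0.3, -0.7, 0.4]

edges = make_bin_edges(pm10, n_bins=4)
bins = [to_bin_index(v, edges) for v in pm10]
parent_grad, parent_cnt = build_histogram(bins, grads, n_bins=4)

# Suppose the first three samples fall into the left child of a split.
left_grad, left_cnt = build_histogram(bins[:3], grads[:3], n_bins=4)
# The right child's counts follow by subtraction, without a second pass.
right_cnt = subtract_histogram(parent_cnt, left_cnt)
```

Only the smaller child needs a pass over its samples; the sibling's histogram is derived from the parent, which is where the speed-up comes from.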
Furthermore, the LightGBM machine ensemble learning algorithm uses a leaf-wise growth algorithm with a depth limit in place of the level-wise decision tree growth strategy used by common GBDT algorithms. Compared with the leaf-wise algorithm, the level-wise strategy treats all leaves in the same layer of the decision tree without distinction, so leaves with low split gain also take part in splitting and searching, which wastes computation.
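As a rough illustration of the difference, the sketch below picks the single leaf to split next under a leaf-wise strategy with a depth limit. The leaf records, gain values and the function name are hypothetical; a level-wise strategy would instead split every leaf of the current layer, including the low-gain one.

```python
def pick_leaf_to_split(leaves, max_depth):
    """Leaf-wise choice: among leaves below the depth limit, take the largest gain."""
    candidates = [
        leaf for leaf in leaves
        if leaf["depth"] < max_depth and leaf["gain"] > 0
    ]
    if not candidates:
        return None  # nothing left to split within the depth limit
    return max(candidates, key=lambda leaf: leaf["gain"])

# Hypothetical leaves with their best achievable split gain.
leaves = [
    {"id": "A", "depth": 2, "gain": 0.8},
    {"id": "B", "depth": 2, "gain": 0.1},   # low gain: skipped by leaf-wise growth
    {"id": "C", "depth": 3, "gain": 1.5},   # already at the depth limit
]

chosen = pick_leaf_to_split(leaves, max_depth=3)
```

Here leaf A is chosen: C has the larger gain but has reached the depth limit, and B is left unsplit rather than consuming computation.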
Meanwhile, the LightGBM machine ensemble learning algorithm of this embodiment adopts both the GOSS sampling algorithm and the EFB algorithm. GOSS (gradient-based one-side sampling) greatly reduces the number of samples taking part in the calculation compared with other algorithms. The samples are sorted by gradient value; in each sampling, according to a set gradient threshold, the samples with large gradients, which have a larger influence on the information gain, are all retained. The samples with small gradients are not discarded outright: a portion of them is drawn and multiplied by a constant greater than 1 to compensate, to a certain extent, for their reduced share of the overall sample, so that fewer samples are computed while the overall distribution of the data is largely preserved. EFB (exclusive feature bundling) constructs a weighted undirected graph over the features, in which each weight reflects the conflict between two features. The features are sorted by conflict, and each feature in turn is either assigned to an existing feature bundle or placed in a new bundle so that the overall conflict is minimized; a constant offset is assigned to the feature values during bundling. The algorithm also allows features that are not completely mutually exclusive to be bundled, which increases the number of features that can be bundled, and it balances accuracy and efficiency through a maximum conflict ratio. In addition, optimal splitting is performed in a many-vs-many manner.
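The GOSS step described above can be sketched as follows. This is an illustrative reading, not LightGBM's code: the parameter names `top_rate` and `other_rate`, the amplification factor (1 - top_rate) / other_rate, and the gradient values are assumptions chosen to match the usual description of the technique.

```python
import random

def goss_sample(gradients, top_rate, other_rate, seed=0):
    """Return (indices, weights) for gradient-based one-side sampling."""
    n = len(gradients)
    # Sort sample indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top = order[:n_top]            # large-gradient samples: always kept
    rest = order[n_top:]           # small-gradient samples: subsampled
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))
    # Up-weight the sampled small-gradient rows (constant > 1) so that their
    # total contribution approximates that of all small-gradient rows.
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return top + sampled, weights

# Hypothetical per-sample gradients.
grads = [0.9, -0.05, 0.02, -1.2, 0.1, 0.03, -0.6, 0.01, 0.04, -0.02]
idx, w = goss_sample(grads, top_rate=0.2, other_rate=0.3)
```

With these settings only 5 of the 10 samples are used for the split search, and the three sampled small-gradient rows each carry a weight of about 2.67.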
In summary, the GOSS algorithm focuses computation on samples that are not yet well trained while greatly increasing computation speed, and the EFB exclusive feature bundling algorithm reduces the number of features and the scale of the data, which increases the training speed of the model.
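The bundling idea behind EFB can be sketched with a simple greedy procedure. This is a simplified illustration under assumptions: conflicts are counted as rows where two features are simultaneously non-zero, the budget is an absolute count rather than a ratio, and the feature names and values are hypothetical.

```python
def count_conflicts(col_a, col_b):
    """Number of rows where both features are non-zero at the same time."""
    return sum(1 for a, b in zip(col_a, col_b) if a != 0 and b != 0)

def greedy_bundle(columns, max_conflicts):
    """Greedily group features into bundles within a per-bundle conflict budget."""
    names = list(columns)
    # Total conflicts per feature, used as the ordering key (most conflicted first).
    degree = {
        a: sum(count_conflicts(columns[a], columns[b]) for b in names if b != a)
        for a in names
    }
    bundles = []     # list of lists of feature names
    conflicts = []   # running conflict count of each bundle
    for name in sorted(names, key=lambda f: degree[f], reverse=True):
        for k, bundle in enumerate(bundles):
            added = sum(count_conflicts(columns[name], columns[m]) for m in bundle)
            if conflicts[k] + added <= max_conflicts:
                bundle.append(name)
                conflicts[k] += added
                break
        else:
            # No existing bundle can take this feature: start a new one.
            bundles.append([name])
            conflicts.append(0)
    return bundles

# Hypothetical sparse meteorological indicator features.
columns = {
    "wind_N": [1, 0, 0, 0, 1, 0],
    "wind_E": [0, 1, 0, 0, 0, 0],
    "wind_S": [0, 0, 1, 1, 0, 0],
    "rain":   [1, 0, 1, 0, 0, 1],
}
bundles = greedy_bundle(columns, max_conflicts=0)
```

With a zero conflict budget the four features collapse into two bundles; raising `max_conflicts` trades some accuracy for fewer bundles, which is the balance the text attributes to the maximum conflict ratio.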
In addition, the LightGBM machine ensemble learning algorithm uses optimized feature-parallel and data-parallel methods to accelerate computation, and can adopt a voting-parallel strategy when the data volume is very large. With these optimizations it is suitable not only for data sets with a small data volume but also for large data volumes.
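For orientation, the features discussed above map roughly onto LightGBM training parameters as in the fragment below. The parameter names are taken from the LightGBM parameter documentation, but every value here is an illustrative assumption; the patent does not disclose its settings.

```python
# Illustrative LightGBM parameter dictionary; values are assumptions,
# not settings disclosed in the patent.
params = {
    "objective": "regression",   # predict a continuous PM2.5 concentration
    "metric": ["l1", "l2"],      # l1 = MAE, l2 = MSE
    "boosting": "gbdt",          # boosting model type
    "num_leaves": 31,            # leaf-wise growth: cap on leaves per tree
    "max_depth": 6,              # depth limit applied to leaf-wise growth
    "max_bin": 255,              # number of histogram bins per feature
    "enable_bundle": True,       # exclusive feature bundling (EFB)
    "tree_learner": "serial",    # or "feature", "data", "voting" for parallel training
}
```

Such a dictionary would typically be passed to the library's training entry point together with the training set; how GOSS is switched on differs between LightGBM versions, so it is left out here.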
Step S104: verifying the PM2.5 emission concentration prediction model with the test set data.
Step S105: uploading the verified PM2.5 emission concentration prediction model to the intelligent environment-friendly management and control platform, and displaying the PM2.5 emission concentration prediction result in real time according to real-time data of the air quality automatic monitoring standard station.
Step S106: issuing an early warning for areas whose PM2.5 emission concentration exceeds the standard according to the prediction result.
The early warning method for PM2.5 emission concentration provided by the invention, as described in the technical solution above, has the following effects: the pollutant emission concentration in the key areas of concern of an iron and steel enterprise can be predicted in real time; the model built with the LightGBM machine ensemble learning algorithm produces predictions in real time and, combined with the trend of the PM2.5 emission concentration, supports a prejudgment of whether the future PM2.5 emission concentration in a key area of concern risks exceeding the standard; the PM2.5 emission concentration prediction result is transmitted promptly to the intelligent environment-friendly control platform, which activates a response plan in time according to the predicted PM2.5 emission situation; and the production rhythm in the key area of concern can be adjusted in advance, effectively reducing or avoiding excessive PM2.5 emission and poor air quality in a local area.
Fig. 2 is a schematic flow chart of a method for warning of PM2.5 emission concentration according to another embodiment of the present application, where the method includes the following steps:
Step S201: acquiring time-by-time historical basic data from the air quality automatic monitoring standard station.
Step S202: eliminating samples with incomplete data and samples with singular values from the historical basic data.
Through this step, the independent-variable features that best reflect the dependent variable can be identified, the sample data can be sorted, and a sample data set can be established, on the basis of which the training data set and the test data set are divided.
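A minimal sketch of the cleaning in step S202 follows. The field names and the plausibility limits are hypothetical, and screening singular values by fixed ranges is only one possible rule; a statistical rule such as an interquartile-range filter could be substituted.

```python
# Hypothetical field names and plausibility limits; neither comes from the patent.
VALID_RANGE = {
    "pm25": (0.0, 1000.0),
    "pm10": (0.0, 2000.0),
    "so2": (0.0, 1000.0),
    "no2": (0.0, 1000.0),
    "co": (0.0, 100.0),
    "o3": (0.0, 1000.0),
}

def is_complete(record):
    """True when every expected field is present and not None."""
    return all(record.get(name) is not None for name in VALID_RANGE)

def is_plausible(record):
    """True when every field lies inside its plausibility range."""
    return all(lo <= record[name] <= hi for name, (lo, hi) in VALID_RANGE.items())

def clean(records):
    """Drop incomplete samples first, then samples with singular values."""
    return [r for r in records if is_complete(r) and is_plausible(r)]

raw = [
    {"pm25": 35.0, "pm10": 70.0, "so2": 12.0, "no2": 28.0, "co": 0.9, "o3": 55.0},
    {"pm25": None, "pm10": 66.0, "so2": 11.0, "no2": 25.0, "co": 0.8, "o3": 52.0},
    {"pm25": 9999.0, "pm10": 72.0, "so2": 13.0, "no2": 27.0, "co": 1.0, "o3": 58.0},
    {"pm25": 42.0, "pm10": 81.0, "so2": 15.0, "no2": -3.0, "co": 1.1, "o3": 60.0},
    {"pm25": 38.0, "pm10": 75.0, "so2": 14.0, "no2": 26.0, "co": 0.9},  # o3 missing
]
cleaned = clean(raw)
```

Of the five hypothetical records, only the first survives: the others are incomplete or contain an implausible value.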
Step S203: the historical base data is partitioned into training set data and test set data.
Step S204: training and verifying the PM2.5 emission concentration prediction model using the LightGBM machine ensemble learning algorithm, the training set data and the test set data.
Preferably, for the PM2.5 emission concentration prediction model in this step, the boosting model type may be selected, and two loss functions, the mean absolute error MAE and the mean squared error MSE, may be selected as evaluation indexes of the prediction model. MAE and MSE are calculated as shown in formulas (1) and (2) below:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{1}$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2 \tag{2}$$
In the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples.
In each training run, whether the PM2.5 emission concentration prediction model is usable is judged from the MAE and MSE results.
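The two evaluation indexes can be computed directly from their standard definitions, as in this small sketch. The function names and the sample concentrations are illustrative assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between predicted and actual values."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """MSE: average squared difference between predicted and actual values."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical actual and predicted PM2.5 concentrations.
actual = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 53.0, 59.0]

mae = mean_absolute_error(actual, predicted)   # 2.0
mse = mean_squared_error(actual, predicted)    # 4.5
```

MSE penalizes the single 3-unit miss more heavily than MAE does, which is one reason to look at both indexes when judging whether a trained model is usable.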
Further preferably, when the training set data and the test set data are divided in step S203, the historical basic data may be randomly divided into 10 subsamples, of which 9 serve as training set data and the remaining 1 serves as test set data.
Then, when the PM2.5 emission concentration prediction model is trained and verified with the training set data and the test set data, the model is trained on 9 subsamples and verified on the remaining subsample after training, and each subsample serves once as the test set data, so that 10 rounds of training and verification are completed. The 10 prediction results of the PM2.5 emission concentration prediction model are averaged to obtain the final predicted value.
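The ten-fold procedure can be sketched as below. To stay self-contained, a trivial mean predictor stands in for LightGBM training, and averaging the ten fold models' outputs is one reading of "averaging the 10 prediction results"; both are assumptions, as are the function names and the synthetic data.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean_model(train_targets):
    """Stand-in for model training: always predicts the training mean."""
    mean = sum(train_targets) / len(train_targets)
    return lambda features: mean

def cross_validate(features, targets, k=10):
    """Train on k-1 folds, verify on the held-out fold, once per fold."""
    models, fold_mae = [], []
    for held_out in k_fold_indices(len(targets), k):
        held = set(held_out)
        train_y = [targets[i] for i in range(len(targets)) if i not in held]
        model = fit_mean_model(train_y)
        errors = [abs(model(features[i]) - targets[i]) for i in held_out]
        fold_mae.append(sum(errors) / len(errors))
        models.append(model)
    return models, fold_mae

def ensemble_predict(models, x):
    """Final prediction: average of the k fold models' predictions."""
    return sum(m(x) for m in models) / len(models)

# Synthetic data: 50 samples with one feature each.
features = [[float(i)] for i in range(50)]
targets = [30.0 + (i % 7) for i in range(50)]

models, fold_mae = cross_validate(features, targets, k=10)
final_value = ensemble_predict(models, [0.0])
```

Each sample is held out exactly once, so every fold's error is measured on data the corresponding model never saw.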
Further preferably, this embodiment may also further check the final predicted value by separately calculating the root mean square error RMSE and the goodness of fit $R^2$, which are given by formulas (3) and (4) below:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{3}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y} - y_i\right)^2} \tag{4}$$
In the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ denotes the number of samples.
In this embodiment, the results of the two test methods can be combined to determine the fitting degree effect between the predicted result and the actual PM2.5 concentration.
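The secondary check can be sketched with the standard definitions of the root mean square error and the coefficient of determination. The function names and sample values are illustrative assumptions.

```python
import math

def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_true - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted PM2.5 concentrations.
actual = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 53.0, 59.0]

rmse = root_mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
```

A low RMSE together with an R² close to 1 indicates that the predictions track the measured concentrations closely; for these sample values R² is about 0.964.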
Step S205: uploading the verified PM2.5 emission concentration prediction model to the intelligent environment-friendly control platform, and displaying the PM2.5 emission concentration prediction result in real time according to real-time data from the automatic air quality monitoring standard station.
Step S206: determining whether the prediction result exceeds an alarm threshold; if so, acquiring the position information of the predicted area that exceeds the alarm threshold, and issuing an early warning for the area corresponding to the position information.
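The threshold check in step S206 can be sketched as a simple filter over per-area predictions. Everything concrete below is a hypothetical assumption: the `AreaPrediction` record, the area names and locations, and the threshold of 75 µg/m³, which is used only as an example value because the source does not state the alarm threshold.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AreaPrediction:
    area_id: str           # hypothetical identifier of a monitored area
    location: str          # position information used when raising the warning
    predicted_pm25: float  # predicted PM2.5 concentration in ug/m3


def areas_to_warn(
    predictions: Sequence[AreaPrediction], alarm_threshold: float
) -> List[AreaPrediction]:
    """Return the areas whose predicted concentration exceeds the threshold."""
    return [p for p in predictions if p.predicted_pm25 > alarm_threshold]


ALARM_THRESHOLD = 75.0  # assumed example value, not taken from the source

predictions = [
    AreaPrediction("sinter-plant", "north gate, zone A", 82.5),
    AreaPrediction("coke-oven", "east yard, zone C", 61.0),
    AreaPrediction("stockyard", "south yard, zone F", 75.0),
]

alerts = areas_to_warn(predictions, ALARM_THRESHOLD)
for alert in alerts:
    print(f"Early warning for {alert.area_id} at {alert.location}: "
          f"{alert.predicted_pm25} ug/m3 predicted")
```

A prediction equal to the threshold does not trigger a warning here, since the step speaks of exceeding the threshold; a deployment could choose the inclusive comparison instead.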
According to the above technical solution, the early warning method for PM2.5 emission concentration provided by the invention can predict, in real time, the pollutant emission concentration in key areas of concern of an iron and steel enterprise. The model built with the LightGBM ensemble learning algorithm produces prediction results in real time and, combined with the trend of the PM2.5 emission concentration, allows a prejudgment of whether the future PM2.5 emission concentration in a key area of concern is at risk of exceeding the standard. The PM2.5 emission concentration prediction result is transmitted to the intelligent environment-friendly control platform in a timely manner, and the platform activates a response plan in time according to the expected PM2.5 emission situation. The production rhythm in the key area of concern is adjusted in advance, which effectively reduces or avoids PM2.5 emissions exceeding the standard and poor air quality in the local area.
Fig. 3 is a schematic structural diagram of an early warning device for PM2.5 emission concentration according to an embodiment of the present application. The device includes a basic data acquisition unit 310, a data dividing unit 320, a training unit 330, a verification unit 340, a prediction unit 350 and an early warning unit 360, which are connected in sequence.
The basic data acquisition unit 310 is configured to acquire the hourly historical basic data recorded by the automatic air quality monitoring standard station.
The data dividing unit 320 is configured to divide the historical basic data into training set data and test set data.
The training unit 330 is configured to construct a PM2.5 emission concentration prediction model by using the LightGBM ensemble learning algorithm and the training set data.
The verification unit 340 is configured to verify the PM2.5 emission concentration prediction model by using the test set data.
The prediction unit 350 is configured to display the PM2.5 emission concentration prediction result in real time by using the PM2.5 emission concentration prediction model and the real-time data of the automatic air quality monitoring standard station.
The early warning unit 360 is configured to issue an early warning for an area where the PM2.5 emission concentration exceeds the standard according to the prediction result.
Preferably, the historical basic data acquired by the basic data acquisition unit 310 includes hourly concentration data of PM2.5, PM10, SO₂, NO₂, CO and O₃ together with meteorological data, wherein the concentration data of PM10, SO₂, NO₂, CO and O₃ and the meteorological data are used as feature variables of the LightGBM ensemble learning algorithm, and the concentration data of PM2.5 is used as the target variable.
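The division into feature variables and target variable can be sketched as follows. The column names are hypothetical, and the three meteorological fields (temperature, humidity, wind speed) are assumed purely for illustration, since the text here does not enumerate which meteorological quantities are recorded.

```python
from typing import Dict, List, Sequence, Tuple

# Pollutant concentrations plus assumed meteorological fields (hypothetical names).
FEATURE_COLUMNS = ["pm10", "so2", "no2", "co", "o3",
                   "temperature", "humidity", "wind_speed"]
TARGET_COLUMN = "pm25"


def split_features_target(
    records: Sequence[Dict[str, float]]
) -> Tuple[List[List[float]], List[float]]:
    """Build the feature matrix X and target vector y from hourly records."""
    X = [[record[column] for column in FEATURE_COLUMNS] for record in records]
    y = [record[TARGET_COLUMN] for record in records]
    return X, y


# Two made-up hourly records from a monitoring station.
records = [
    {"pm25": 48.0, "pm10": 90.0, "so2": 12.0, "no2": 35.0, "co": 0.8,
     "o3": 60.0, "temperature": 18.5, "humidity": 55.0, "wind_speed": 2.1},
    {"pm25": 63.0, "pm10": 120.0, "so2": 15.0, "no2": 41.0, "co": 1.1,
     "o3": 52.0, "temperature": 17.0, "humidity": 62.0, "wind_speed": 1.4},
]

X, y = split_features_target(records)
```

Keeping the column order in one constant ensures that the rows passed to the model at prediction time are laid out the same way as during training.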
Preferably, the device of this embodiment further includes a data preprocessing unit, configured to, after the hourly historical basic data of the automatic air quality monitoring standard station is acquired, eliminate samples in the historical basic data that have incomplete data or abnormal (singular) values.
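A minimal sketch of such preprocessing is shown below. The source does not say how abnormal values are detected, so a plausible-range check is assumed here as one simple option; the column names and range limits are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Record = Dict[str, Optional[float]]

# Assumed plausible ranges in ug/m3; the limits are illustrative, not from the source.
VALID_RANGE: Dict[str, Tuple[float, float]] = {
    "pm25": (0.0, 1000.0),
    "pm10": (0.0, 2000.0),
    "so2": (0.0, 1000.0),
}


def is_clean(record: Record) -> bool:
    """True if every checked column is present and within its plausible range."""
    for column, (low, high) in VALID_RANGE.items():
        value = record.get(column)
        # Missing values are rejected; NaN also fails the range comparison.
        if value is None or not (low <= value <= high):
            return False
    return True


def remove_bad_samples(records: Sequence[Record]) -> List[Record]:
    """Drop samples that are incomplete or contain abnormal values."""
    return [record for record in records if is_clean(record)]


raw = [
    {"pm25": 48.0, "pm10": 90.0, "so2": 12.0},    # kept
    {"pm25": None, "pm10": 85.0, "so2": 11.0},    # incomplete
    {"pm25": 52.0, "pm10": -5.0, "so2": 13.0},    # negative concentration
    {"pm25": 9999.0, "pm10": 95.0, "so2": 10.0},  # implausibly large
]

cleaned = remove_bad_samples(raw)
```

A statistical rule such as an interquartile-range or standard-deviation cutoff could replace the fixed ranges where the plausible limits are not known in advance.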
Preferably, the device of this embodiment further comprises:
an evaluation index selection unit, configured to, when the PM2.5 emission concentration prediction model is constructed by using the LightGBM ensemble learning algorithm and the training set data, select the boosting model type and select two loss functions, the mean absolute error (MAE) and the mean square error (MSE), as evaluation indices of the prediction model, wherein:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples; and
a model evaluation unit, configured to determine, during each training run, whether the PM2.5 emission concentration prediction model is usable based on the MAE and MSE results.
Preferably, the data dividing unit 320 of this embodiment is specifically configured to randomly divide the historical basic data into 10 subsamples, of which 9 subsamples serve as the training set data and the remaining 1 subsample serves as the test set data;
when the training unit 330 and the verification unit 340 train and verify the PM2.5 emission concentration prediction model by using the training set data and the test set data, the model is trained on 9 subsamples and verified on the remaining 1 subsample, and each subsample serves as the test set data exactly once, so that ten rounds of training and verification are completed;
and the 10 prediction results are averaged to obtain the final predicted value.
Preferably, the device of this embodiment further comprises:
a secondary check unit, configured to further check the final predicted value by separately calculating the root mean square error (RMSE) and the goodness of fit $R^2$, wherein:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ is the number of samples.
Preferably, the early warning unit 360 of this embodiment is specifically configured to determine whether the prediction result exceeds an alarm threshold and, if so, to acquire the position information of the predicted area that exceeds the alarm threshold and issue an early warning for the area corresponding to the position information.
For a detailed description of each unit, refer to the corresponding description in the foregoing method embodiment; the details are not repeated here.
According to the above technical solution, the early warning device for PM2.5 emission concentration provided by the invention can predict, in real time, the pollutant emission concentration in key areas of concern of an iron and steel enterprise. The model built with the LightGBM ensemble learning algorithm produces prediction results in real time and, combined with the trend of the PM2.5 emission concentration, allows a prejudgment of whether the future PM2.5 emission concentration in a key area of concern is at risk of exceeding the standard. The PM2.5 emission concentration prediction result is transmitted to the intelligent environment-friendly control platform in a timely manner, and the platform activates a response plan in time according to the expected PM2.5 emission situation. The production rhythm in the key area of concern is adjusted in advance, which effectively reduces or avoids PM2.5 emissions exceeding the standard and poor air quality in the local area.
Fig. 4 is a schematic diagram of an electronic device provided in an embodiment of the present invention. The electronic device shown in fig. 4 is a general-purpose data processing apparatus comprising a general-purpose computer hardware structure including at least a processor 801 and a memory 802. The processor 801 and the memory 802 are connected by a bus 803. The memory 802 is adapted to store one or more instructions or programs that are executable by the processor 801. The one or more instructions or programs are executed by the processor 801 to implement the steps in the above-described PM2.5 emission concentration warning method.
The processor 801 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 801 implements the processing of data and the control of other devices by executing commands stored in the memory 802 to thereby execute the method flows of embodiments of the present invention as described above. The bus 803 connects the above-described components together, as well as to a display controller 804 and a display device and an input/output (I/O) device 805. Input/output (I/O) devices 805 may be a mouse, keyboard, modem, network interface, touch input device, motion sensitive input device, printer, and other devices known in the art. Typically, input/output (I/O) devices 805 are connected to the system through an input/output (I/O) controller 806.
The memory 802 may store, among other things, software components such as an operating system, communication modules, interaction modules, and application programs. Each of the modules and applications described above corresponds to a set of executable program instructions that perform one or more functions and methods described in embodiments of the invention.
Embodiments of the present invention also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the above-mentioned PM2.5 emission concentration warning method.
An embodiment of the present invention further provides a computer program product, which includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the steps of the above-mentioned early warning method for PM2.5 emission concentration are implemented.
In summary, the early warning method and device for PM2.5 emission concentration provided by the invention can predict, in real time, the pollutant emission concentration in key areas of concern of an iron and steel enterprise. The model built with the LightGBM ensemble learning algorithm produces prediction results in real time and, combined with the trend of the PM2.5 emission concentration, allows a prejudgment of whether the future PM2.5 emission concentration in a key area of concern is at risk of exceeding the standard. The PM2.5 emission concentration prediction result is transmitted to the intelligent environment-friendly control platform in a timely manner, and the platform activates a response plan in time according to the expected PM2.5 emission situation. The production rhythm in the key area of concern is adjusted in advance, which effectively avoids PM2.5 emissions exceeding the standard and poor air quality in the local area.
The preferred embodiments of the present invention are described above with reference to the accompanying drawings. The many features and advantages of the embodiments are apparent from the detailed specification, and thus, it is intended by the appended claims to cover all such features and advantages of the embodiments which fall within the true spirit and scope thereof. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the embodiments of the invention to the exact construction and operation illustrated and described, and accordingly, all suitable modifications and equivalents may be resorted to, falling within the scope thereof.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (17)

1. An early warning method for PM2.5 emission concentration, characterized by comprising the following steps:
acquiring hourly historical basic data recorded by an automatic air quality monitoring standard station;
dividing the historical basic data into training set data and test set data;
constructing a PM2.5 emission concentration prediction model by using a LightGBM ensemble learning algorithm and the training set data;
verifying the PM2.5 emission concentration prediction model by using the test set data;
uploading the verified PM2.5 emission concentration prediction model to an intelligent environment-friendly control platform, and displaying a PM2.5 emission concentration prediction result in real time according to real-time data of the automatic air quality monitoring standard station;
and issuing an early warning for an area where the PM2.5 emission concentration exceeds the standard according to the prediction result.
2. The early warning method for PM2.5 emission concentration according to claim 1, wherein the historical basic data includes hourly concentration data of PM2.5, PM10, SO₂, NO₂, CO and O₃ together with meteorological data, wherein the concentration data of PM10, SO₂, NO₂, CO and O₃ and the meteorological data are used as feature variables of the LightGBM ensemble learning algorithm, and the concentration data of PM2.5 is used as the target variable.
3. The early warning method for PM2.5 emission concentration according to claim 2, wherein after the hourly historical basic data of the automatic air quality monitoring standard station is acquired, the method further comprises: eliminating samples in the historical basic data that have incomplete data or abnormal (singular) values.
4. The early warning method for PM2.5 emission concentration according to claim 1, wherein when the PM2.5 emission concentration prediction model is constructed by using the LightGBM ensemble learning algorithm and the training set data, the method further comprises:
selecting the boosting model type, and selecting two loss functions, the mean absolute error (MAE) and the mean square error (MSE), as evaluation indices of the prediction model, wherein:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples;
and during each training run, determining whether the PM2.5 emission concentration prediction model is usable according to the MAE and MSE results.
5. The early warning method for PM2.5 emission concentration according to claim 1, wherein dividing the historical basic data into training set data and test set data comprises: randomly dividing the historical basic data into 10 subsamples, of which 9 subsamples serve as the training set data and the remaining 1 subsample serves as the test set data;
when the PM2.5 emission concentration prediction model is trained and verified by using the training set data and the test set data, the model is trained on 9 subsamples and verified on the remaining 1 subsample, and each subsample serves as the test set data exactly once, so that ten rounds of training and verification are completed;
and the 10 prediction results are averaged to obtain the final predicted value.
6. The early warning method for PM2.5 emission concentration according to claim 5, characterized in that the method further comprises: further checking the final predicted value by separately calculating the root mean square error (RMSE) and the goodness of fit $R^2$, wherein:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ is the number of samples.
7. The early warning method for PM2.5 emission concentration according to claim 1, wherein issuing an early warning for an area where the PM2.5 emission concentration exceeds the standard according to the prediction result comprises: determining whether the prediction result exceeds an alarm threshold and, if so, acquiring the position information of the predicted area that exceeds the alarm threshold and issuing an early warning for the area corresponding to the position information.
8. An early warning device for PM2.5 emission concentration, characterized in that the device includes:
a basic data acquisition unit, configured to acquire hourly historical basic data recorded by an automatic air quality monitoring standard station;
a data dividing unit, configured to divide the historical basic data into training set data and test set data;
a training unit, configured to construct a PM2.5 emission concentration prediction model by using a LightGBM ensemble learning algorithm and the training set data;
a verification unit, configured to verify the PM2.5 emission concentration prediction model by using the test set data;
a prediction unit, configured to display a PM2.5 emission concentration prediction result in real time by using the PM2.5 emission concentration prediction model and real-time data of the automatic air quality monitoring standard station;
and an early warning unit, configured to issue an early warning for an area where the PM2.5 emission concentration exceeds the standard according to the prediction result.
9. The early warning device for PM2.5 emission concentration according to claim 8, wherein the historical basic data includes hourly concentration data of PM2.5, PM10, SO₂, NO₂, CO and O₃ together with meteorological data, wherein the concentration data of PM10, SO₂, NO₂, CO and O₃ and the meteorological data are used as feature variables of the LightGBM ensemble learning algorithm, and the concentration data of PM2.5 is used as the target variable.
10. The early warning device for PM2.5 emission concentration according to claim 9, further comprising: a data preprocessing unit, configured to, after the hourly historical basic data of the automatic air quality monitoring standard station is acquired, eliminate samples in the historical basic data that have incomplete data or abnormal (singular) values.
11. The early warning device for PM2.5 emission concentration according to claim 8, further comprising:
an evaluation index selection unit, configured to, when the PM2.5 emission concentration prediction model is constructed by using the LightGBM ensemble learning algorithm and the training set data, select the boosting model type and select two loss functions, the mean absolute error (MAE) and the mean square error (MSE), as evaluation indices of the prediction model, wherein:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, and $n$ denotes the number of samples;
and a model evaluation unit, configured to determine, during each training run, whether the PM2.5 emission concentration prediction model is usable based on the MAE and MSE results.
12. The early warning device for PM2.5 emission concentration according to claim 8, wherein the data dividing unit is specifically configured to randomly divide the historical basic data into 10 subsamples, of which 9 subsamples serve as the training set data and the remaining 1 subsample serves as the test set data;
when the training unit and the verification unit train and verify the PM2.5 emission concentration prediction model by using the training set data and the test set data, the model is trained on 9 subsamples and verified on the remaining 1 subsample, and each subsample serves as the test set data exactly once, so that ten rounds of training and verification are completed;
and the 10 prediction results are averaged to obtain the final predicted value.
13. The early warning device for PM2.5 emission concentration according to claim 12, further comprising:
a secondary check unit, configured to further check the final predicted value by separately calculating the root mean square error (RMSE) and the goodness of fit $R^2$, wherein:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

in the above formulas, $\hat{y}_i$ denotes the predicted value of the i-th sample, $y_i$ denotes the actual value of the i-th sample, $\bar{y}$ denotes the mean of the actual values, and $n$ is the number of samples.
14. The early warning device for PM2.5 emission concentration according to claim 8, wherein the early warning unit is specifically configured to determine whether the prediction result exceeds an alarm threshold and, if so, to acquire the position information of the predicted area that exceeds the alarm threshold and issue an early warning for the area corresponding to the position information.
15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the computer program is executed by the processor.
16. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
17. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the steps of the method of any of claims 1 to 7.
CN202211517121.4A 2022-11-30 2022-11-30 Early warning method and device for PM2.5 emission concentration Pending CN115860214A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211517121.4A CN115860214A (en) 2022-11-30 2022-11-30 Early warning method and device for PM2.5 emission concentration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211517121.4A CN115860214A (en) 2022-11-30 2022-11-30 Early warning method and device for PM2.5 emission concentration

Publications (1)

Publication Number Publication Date
CN115860214A true CN115860214A (en) 2023-03-28

Family

ID=85668072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211517121.4A Pending CN115860214A (en) 2022-11-30 2022-11-30 Early warning method and device for PM2.5 emission concentration

Country Status (1)

Country Link
CN (1) CN115860214A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117091799A (en) * 2023-10-17 2023-11-21 湖南一特医疗股份有限公司 Intelligent three-dimensional monitoring method and system for oxygen supply safety of medical center
CN117091799B (en) * 2023-10-17 2024-01-02 湖南一特医疗股份有限公司 Intelligent three-dimensional monitoring method and system for oxygen supply safety of medical center

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination