CN112733903A - Air quality monitoring and alarming method, system, device and medium based on SVM-RF-DT combination - Google Patents

Info

Publication number
CN112733903A
Authority
CN
China
Prior art keywords
air quality
value
svm
sample
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011624426.6A
Other languages
Chinese (zh)
Other versions
CN112733903B (en)
Inventor
黄海
吴霖瑞
谢昊岩
董旭
温淑棋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuchang University
Original Assignee
Xuchang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuchang University
Priority to CN202011624426.6A
Publication of CN112733903A
Application granted
Publication of CN112733903B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital
    • G01N33/0063 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital using a threshold to release an alarm or displaying means
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0073 Control unit therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Abstract

The invention provides an air quality monitoring and alarming method, system, device and medium based on an SVM-RF-DT combination. A support vector machine (SVM), a random forest (RF) and a decision tree (DT) are combined in parallel ("transversely"), each algorithm is assigned a weight derived from its historical prediction error, and the weighted predictions are combined into a final air quality predicted value. The combination exploits the respective strengths of SVM, RF and DT to predict air quality accurately, which helps people plan travel and supports the air pollution control work of the relevant departments.

Description

Air quality monitoring and alarming method, system, device and medium based on SVM-RF-DT combination
Technical Field
The invention relates to air quality monitoring and alarming, in particular to air quality monitoring and alarming based on artificial intelligence.
Background
With industrial development, the sources of air pollution beyond automobile exhaust keep increasing and the types of air pollutants grow day by day. The added pollutants react with the existing ones, causing secondary pollution and serious haze. A scientific and reasonable air pollution prediction and alarm method helps people arrange their travel and also supports the control of air pollution.
AI is currently developing rapidly, and intelligent algorithms are being applied across industries. For early warning of atmospheric pollution, artificial intelligence algorithms such as random forests, neural networks, support vector machines, particle swarm optimization and artificial fish swarm algorithms have been fused. For example, Chakma et al. proposed in 2017 a random forest fused with a convolutional neural network to analyze the concentration of PM2.5, and Rijal et al. proposed a feedforward neural network fused with a convolutional neural network for the same purpose. These methods fuse artificial intelligence algorithms longitudinally (in series) to analyze pollutant concentration and do not make full use of the advantages of running the algorithms transversely (in parallel).
The invention provides an air quality monitoring and alarming method, system, device and medium based on an SVM-RF-DT combination. A support vector machine (SVM), a random forest (RF) and a decision tree (DT) are combined transversely, each algorithm is assigned a weight derived from its historical prediction error, and the weighted predictions are combined into a final air quality predicted value. The combination exploits the respective strengths of SVM, RF and DT to predict air quality accurately, which helps people plan travel and supports the air pollution control work of the relevant departments.
Disclosure of Invention
The invention provides an air quality monitoring and alarming method based on SVM-RF-DT combination, which comprises the following steps:
Step 1: transmitting the sampling values of the air quality factors obtained by each monitoring sensor into a monitoring and alarming module;
Step 2: the monitoring and alarming module calculates the predicted value of the air quality factor; specifically, a support vector machine (SVM) algorithm, a random forest (RF) algorithm and a decision tree (DT) algorithm are each used to calculate a predicted value of the air quality, the three algorithms are given the weights $W_{SVM}$, $W_{RF}$, $W_{DT}$ based on their historical prediction performance, and the final air quality predicted value is calculated;
Step 3: comparing the air quality predicted value obtained in step 2 with an atmospheric pollution index standard value, and giving an alarm to prompt that the air quality factor exceeds the standard when the predicted value is larger than or equal to the standard value; and when the predicted value is less than or equal to the standard value, prompting that the air quality is good.
The specific calculation process in step 2 is as follows:
Step 2.1, calculating an air quality prediction value by adopting a Support Vector Machine (SVM):
When dealing with a non-linear problem, the support vector machine maps the input data into a higher-dimensional space through a specific function. Because of the added dimensions, the problem of finding an optimal classification line for the samples in the low-dimensional space becomes the problem of finding an optimal classification plane in the high-dimensional space. The SVM obtains a globally optimal solution and maintains its accuracy even when the number of samples is small.
Let the training sample set be $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples and $y_1, y_2, \ldots, y_N$ are the air quality values of the corresponding air quality factor samples.
For example, $x_1$ is the set of air quality factor values for Kaifeng City in January 2013, as shown in the following table:
Month    PM2.5  PM10  SO2  CO     NO2  O3
Jan-13   193    206   38   5.594  21   0
$y_1$ is the air quality value (also called the AQI value) corresponding to $x_1$:
Month    AQI
Jan-13   238
$x_2$ is the set of air quality factor values for Kaifeng City in February 2013, as shown in the following table:
Month    PM2.5  PM10  SO2  CO     NO2  O3  AQI
Feb-13   145    145   26   4.557  15   0   188
$y_2$ is the air quality value (also called the AQI value) corresponding to $x_2$:
Month    AQI
Feb-13   188
There exists a hyperplane
Figure BDA0002872879310000024
such that the samples can be correctly classified, where $X = \{x_1, x_2, \ldots, x_N\}$ is the set of $N$ samples,
Figure BDA0002872879310000025
maps $X$ to the high-dimensional feature space
Figure BDA0002872879310000026
$\omega$ is the normal vector, which determines the direction of the hyperplane, and $b$ is the displacement term, which serves as an optimization variable. So that the samples can be accurately classified by the hyperplane, a slack variable $\varepsilon$ is introduced, and the objective function is
Figure BDA0002872879310000021
$\varepsilon_i \ge 0$, $i = 1, \ldots, N$, are slack variables. Each sample $x_i$ in the training sample set corresponds to a slack variable $\varepsilon_i$, which characterizes the degree to which the sample fails to satisfy the constraint
Figure BDA0002872879310000022
The kernel function replaces the inner product operation in the high-dimensional space: with a kernel function, the inner product can be obtained without knowing the specific form of the mapping
Figure BDA0002872879310000023
Candidate kernel functions include the linear, polynomial, radial basis, Gaussian and sigmoid kernel functions; the invention selects the Gaussian kernel function, i.e.
Figure BDA0002872879310000031
where $\sigma$ is the Gaussian kernel bandwidth and $\alpha_i$ is the Lagrange multiplier. $b$ is calculated from the sample values and all obtained $b$ values are averaged; after training is finished, the support vector classification prediction function is obtained:
Figure BDA0002872879310000032
Substituting the current air quality factor monitoring values into the air quality support vector classification prediction function gives the air quality predicted value $P_{SVM}$.
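The kernel evaluation in step 2.1 can be sketched in a few lines. The kernel formula itself is an image in the source, so the commonly used form $K(a, b) = \exp(-\lVert a - b \rVert^2 / (2\sigma^2))$ is assumed here; the function name `gaussian_kernel` and the bandwidth value are illustrative and do not come from the patent.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel, ASSUMED form exp(-||a - b||^2 / (2 * sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma (kernel bandwidth) must be positive")
    # Squared Euclidean distance between the two factor vectors.
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Air quality factor samples x1 (Jan-13) and x2 (Feb-13) from the tables above:
# PM2.5, PM10, SO2, CO, NO2, O3
x1 = [193, 206, 38, 5.594, 21, 0]
x2 = [145, 145, 26, 4.557, 15, 0]

# sigma = 100.0 is an arbitrary illustrative bandwidth, not a value from the patent.
k12 = gaussian_kernel(x1, x2, sigma=100.0)
```

The kernel equals 1 for identical samples and decays toward 0 as samples move apart; the bandwidth controls how quickly that happens.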
Step 2.2, calculating an air quality prediction value by adopting random forest RF:
The random forest regression trees are built as follows. The training data set (TDS) contains $N$ samples and is expressed as $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples and $y_1, y_2, \ldots, y_N$ are the corresponding air quality sample values. Examples of $x_1, x_2, \ldots, x_N$ and $y_1, y_2, \ldots, y_N$ are given in step 2.1.
$B$ bootstrap sample sets are drawn at random by bootstrap sampling with replacement, and $B$ regression trees are built from these $B$ bootstrap sample sets; the samples not drawn in each bootstrap sampling form $B$ sets of out-of-bag (OOB) data, which serve as test samples for the random forest.
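The bootstrap step can be illustrated with a minimal sketch. It assumes that each bootstrap set has the same size $N$ as the training set, which the source does not state; the function name `bootstrap_with_oob` and the seed are hypothetical.

```python
import random

def bootstrap_with_oob(n_samples, n_trees, seed=0):
    """Draw n_trees bootstrap index sets (with replacement) and their OOB index sets.

    ASSUMPTION: every bootstrap set has n_samples entries.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(n_trees):
        # Sampling with replacement: the same index may be drawn several times.
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        drawn = set(in_bag)
        # Out-of-bag data: every sample that was never drawn for this tree.
        oob = [i for i in range(n_samples) if i not in drawn]
        result.append((in_bag, oob))
    return result

splits = bootstrap_with_oob(n_samples=10, n_trees=3)
```

Each tree would be trained on its `in_bag` indices and evaluated on its `oob` indices.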
The total number of sample variables is $M$; in the invention, $M = 6$. At each node of each regression tree, $m$ variables are randomly selected as candidate split variables; in the invention, $m = M/2$.
Each regression tree branches recursively from top to bottom; the number of leaf nodes $t$ serves as the termination condition for the growth of the regression tree, and in the invention $t = 6$.
The objective function for sample division is the minimum sum of squared errors, i.e.:
Figure BDA0002872879310000033
Figure BDA0002872879310000034
where $y_i$ is the actual AQI value of the air quality in the out-of-bag data,
Figure BDA0002872879310000035
is the random forest's prediction for the out-of-bag data, and $E_{OOB}$ is the squared prediction error on the out-of-bag (OOB) data.
The $B$ generated trees form the random forest model.
Substituting the current air quality factor monitoring values into the random forest model gives the air quality predicted value $P_{RF}$.
Step 2.3, calculating an air quality predicted value by adopting a decision tree DT:
A decision tree (DT) can uncover significant information hidden in the data and expresses it in an intuitive form that users can easily understand, so it is widely used in data mining and prediction.
The training data set (TDS) contains $N$ samples and is expressed as $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples (for example, $x_1$ is as shown in the table in step 2.1) and $y_1, y_2, \ldots, y_N$ are the corresponding air quality sample values.
The objective function of the decision tree for sample division is the minimum sum of squared errors, namely:
Figure BDA0002872879310000041
where $j$ denotes the $j$-th variable of each sample (the total number of sample variables is $M$; in the invention, $M = 6$), $s$ denotes a split point of the $j$-th variable, $R_1(j, s)$ denotes the left region of the split, $R_2(j, s)$ denotes the right region of the split, and $c_1$ and $c_2$ are the optimal output values of regions $R_1(j, s)$ and $R_2(j, s)$. Each feature $j$ of the samples is traversed, every possible split point $s$ of each feature is tried, the split with the smallest sum of squared errors is selected, and the optimal output values $c_1$ and $c_2$ are determined. After training is finished, the air quality prediction regression tree is obtained.
Substituting the current air quality factor monitoring values into the air quality prediction regression tree gives the air quality predicted value $P_{DT}$.
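The split search described in step 2.3 can be sketched as follows. Two details are assumptions not stated in the source: candidate split points are taken as midpoints between consecutive distinct feature values, and a sample goes to the left region when its value is less than or equal to $s$. The output values $c_1$ and $c_2$ are the region means, which minimise the squared error within each region. The toy data are invented for illustration.

```python
def best_split(X, y):
    """Return (j, s, c1, c2, sse) for the single split with the smallest
    sum of squared errors, or None if no feature can be split."""
    best = None
    n_features = len(X[0])
    for j in range(n_features):
        values = sorted(set(row[j] for row in X))
        # ASSUMPTION: candidate split points are midpoints of adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            s = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= s]   # region R1(j, s)
            right = [yi for row, yi in zip(X, y) if row[j] > s]   # region R2(j, s)
            c1 = sum(left) / len(left)     # mean minimises squared error in R1
            c2 = sum(right) / len(right)   # mean minimises squared error in R2
            sse = (sum((yi - c1) ** 2 for yi in left)
                   + sum((yi - c2) ** 2 for yi in right))
            if best is None or sse < best[4]:
                best = (j, s, c1, c2, sse)
    return best

# Hypothetical toy data: two features, the second one constant.
X_toy = [[10, 5], [20, 5], [30, 5], [40, 5]]
y_toy = [50, 52, 100, 102]
split = best_split(X_toy, y_toy)
```

A full regression tree would apply this search recursively to each resulting region until the stopping condition is met.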
Steps 2.1, 2.2, 2.3 can be performed in any order, in particular simultaneously.
Step 2.4, weighting the predicted values obtained by the three algorithms by using the historical prediction errors, specifically:
Historical data are fed to the three algorithms to obtain their predicted values; each predicted value is subtracted from the corresponding historical air quality value and the difference is squared, giving the mean squared errors $E_{SVM}$, $E_{RF}$, $E_{DT}$. The weights $W_{SVM}$, $W_{RF}$, $W_{DT}$ of the three algorithms are then calculated:
Figure BDA0002872879310000042
The values
Figure BDA0002872879310000043
are normalized to obtain $W_{SVM}$, $W_{RF}$, $W_{DT}$, that is,
Figure BDA0002872879310000044
The final air quality predicted value is calculated as: $P = W_{SVM} \cdot P_{SVM} + W_{RF} \cdot P_{RF} + W_{DT} \cdot P_{DT}$
Step 3: comparing the predicted value of the air quality factor obtained in step 2 with the standard value of the atmospheric pollution index, and giving an alarm to prompt that the air quality factor exceeds the standard when the predicted value is greater than or equal to the standard value; and when the predicted value is less than or equal to the standard value, prompting that the air quality is good. The standard value of the atmospheric pollution index is 100.
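The comparison in step 3 amounts to a single threshold test. The text assigns the case where the predicted value equals the standard value to both branches; this sketch assumes the alarm takes precedence in that case. The function name and message strings are illustrative.

```python
AQI_STANDARD = 100  # standard value of the atmospheric pollution index, from the text

def check_air_quality(predicted_aqi, standard=AQI_STANDARD):
    """Return the prompt for a predicted AQI value.

    ASSUMPTION: a prediction exactly equal to the standard raises the alarm.
    """
    if predicted_aqi >= standard:
        return "ALARM: air quality factor exceeds the standard"
    return "Air quality is good"

message = check_air_quality(146.6)
```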
In another aspect of the invention, an air quality monitoring and warning system based on SVM-RF-DT combination is provided, which can realize the aforementioned air quality monitoring and warning method based on SVM-RF-DT combination.
In another aspect of the present invention, an air quality monitoring and warning device based on SVM-RF-DT combination is provided, which includes a processor and a memory, and is capable of implementing the aforementioned air quality monitoring and warning method based on SVM-RF-DT combination.
In another aspect of the present invention, a storage medium is provided, on which a computer program is stored, the computer program being capable of implementing the aforementioned air quality monitoring and warning method based on SVM-RF-DT combination.
Drawings
FIG. 1 is a diagram of a sensor and monitoring and alarm module relationship.
Fig. 2 is a flow diagram of the operational data of the monitoring and alarm module.
Fig. 3 is a plot of air quality values versus actual AQI values obtained by three algorithms SVM, RF, DT.
Fig. 4 is a comparison graph of an air quality predicted value and an actual AQI value obtained according to weights of three algorithms after the combination of the SVM algorithm, the RF algorithm and the DT algorithm.
Detailed Description
The present invention will be further described with reference to the following examples.
As shown in Fig. 1, the sampled values of the air quality factors obtained by the monitoring sensors are transmitted to the monitoring and alarming module. As shown in Fig. 2, the monitoring and alarming module feeds the current sampled values to three algorithms, the support vector machine (SVM), the random forest (RF) and the decision tree (DT); evaluates the prediction error of each algorithm on historical data; derives from these errors the weights used when the three algorithms are combined transversely; obtains the air quality predicted value of the SVM-RF-DT combination; compares the predicted value with the air quality standard value; and decides whether to output an alarm.
Data set: the data set consists of air quality data for Kaifeng City from January 2013 to November 2020. As shown in Table 1 below, the first six items are the air quality factors PM2.5, PM10, SO2, CO, NO2 and O3; AQI is the air quality index value, and the last item is the corresponding quality grade. The national Technical Regulation on Ambient Air Quality Index (on trial) replaces the original Air Pollution Index (API) with the Air Quality Index (AQI) and divides the AQI into six grades: grade one is excellent, grade two is good, grade three is lightly polluted, grade four is moderately polluted, grade five is heavily polluted and grade six is severely polluted.
Figure BDA0002872879310000051
Figure BDA0002872879310000061
Figure BDA0002872879310000071
The programming language is C++ and the operating system is Windows 10. The data set is divided into a training set and a test set, and the three models are evaluated separately: the 35 air quality samples from January 2018 to November 2020 are used as the test set and the rest as the training set. The results are shown in the following table, and Fig. 3 compares the air quality values obtained by the SVM, RF and DT algorithms with the actual AQI values:
month-year AQI SVM RF DT E_SVM E_RF E_DT
Jan-18 144 146 147 147 4 9 9
Feb-18 117 117 115 116 0 4 1
Mar-18 92 95 91 95 9 1 9
Apr-18 50 51 52 51 1 4 1
May-18 75 74 73 71 1 4 16
Jun-18 118 116 117 117 4 1 1
Jul-18 75 75 76 76 0 1 1
Aug-18 88 90 88 93 4 0 25
Sep-18 75 74 75 76 1 0 1
Oct-18 95 94 97 95 1 4 0
Nov-18 103 103 100 101 0 9 4
Dec-18 136 137 138 136 1 4 0
Jan-19 170 170 167 173 0 9 9
Feb-19 166 167 165 169 1 1 9
Mar-19 90 88 87 91 4 9 1
Apr-19 84 85 87 87 1 9 9
May-19 103 101 104 104 4 1 1
Jun-19 118 120 117 117 4 1 1
Jul-19 106 104 105 106 4 1 0
Aug-19 79 81 77 84 4 4 25
Sep-19 93 95 95 93 4 4 0
Oct-19 76 77 77 79 1 1 9
Nov-19 99 102 100 100 9 1 1
Dec-19 132 134 133 135 4 1 9
Jan-20 173 171 170 173 4 9 0
Feb-20 92 90 91 91 4 1 1
Mar-20 81 83 78 81 4 9 0
Apr-20 82 81 84 84 1 4 4
May-20 98 96 98 98 4 0 0
Jun-20 94 96 93 99 4 1 25
Jul-20 81 84 78 83 9 9 4
Aug-20 67 68 70 66 1 9 1
Sep-20 92 89 94 87 9 4 25
Oct-20 86 83 86 81 9 0 25
Nov-20 94 92 93 89 4 1 25
The AQI column gives the actual AQI value of the sample; the SVM, RF and DT columns give the AQI values predicted by the support vector machine, the random forest algorithm and the decision tree, respectively. E_SVM, E_RF and E_DT are the squares of the differences between the actual AQI value and the AQI value calculated by the support vector machine, random forest and decision tree algorithms, respectively.
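The error columns can be recomputed directly from the prediction columns. The sketch below does so for the first three rows of the table (Jan-18 to Mar-18) and averages each column; the variable and function names are illustrative.

```python
# (month, actual AQI, SVM, RF, DT) taken from the first three rows of the table.
rows = [
    ("Jan-18", 144, 146, 147, 147),
    ("Feb-18", 117, 117, 115, 116),
    ("Mar-18", 92, 95, 91, 95),
]

def squared_errors(row):
    """Return (E_SVM, E_RF, E_DT): squared difference between prediction and actual AQI."""
    _, actual, svm, rf, dt = row
    return ((svm - actual) ** 2, (rf - actual) ** 2, (dt - actual) ** 2)

errors = [squared_errors(r) for r in rows]

# Mean squared error per algorithm over these three rows only.
mean_sq = [sum(col) / len(rows) for col in zip(*errors)]
```

Applied to all 35 rows, the same averaging yields the mean squared errors that feed the weight calculation.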
According to the method of the invention, the mean squared errors of the SVM, RF and DT algorithms, $E_{SVM}$, $E_{RF}$, $E_{DT}$, are calculated. The weights $W_{SVM}$, $W_{RF}$, $W_{DT}$ of the three algorithms are then calculated:
Figure BDA0002872879310000091
The values
Figure BDA0002872879310000092
are normalized to obtain $W_{SVM}$, $W_{RF}$, $W_{DT}$, that is,
Figure BDA0002872879310000093
Figure BDA0002872879310000094
The air quality predicted value is calculated as: $P = 0.438073 \cdot P_{SVM} + 0.425459 \cdot P_{RF} + 0.136468 \cdot P_{DT}$. The weights may also be rounded, for example to two decimal places, in which case the air quality predicted value is calculated as: $P = 0.44 \cdot P_{SVM} + 0.42 \cdot P_{RF} + 0.14 \cdot P_{DT}$
Figure 4 shows a comparison graph of the air quality predicted value and the actual AQI value obtained according to the weight of three algorithms after the combination of the SVM, the RF and the DT algorithms.
While one embodiment of the present invention has been described in detail, the description is only a preferred embodiment of the present invention and should not be taken as limiting the scope of the invention. All equivalent changes and modifications made within the scope of the present invention shall fall within the scope of the present invention.

Claims (10)

1. An air quality monitoring and alarming method based on SVM-RF-DT combination is characterized by comprising the following steps:
step 1: transmitting the sampling values of the air quality factors obtained by each monitoring sensor into a monitoring and alarming module;
step 2: the monitoring and alarming module calculates the predicted value of the air quality factor; specifically, a support vector machine (SVM) algorithm, a random forest (RF) algorithm and a decision tree (DT) algorithm are each used to calculate a predicted value of the air quality, the three algorithms are given the weights $W_{SVM}$, $W_{RF}$, $W_{DT}$ based on their historical prediction performance, and the final air quality predicted value is calculated;
step 3: comparing the air quality predicted value obtained in step 2 with an atmospheric pollution index standard value, and giving an alarm to prompt that the air quality factor exceeds the standard when the predicted value is larger than or equal to the standard value; and when the predicted value is less than or equal to the standard value, prompting that the air quality is good.
2. The method of claim 1, wherein the process of calculating the air quality prediction value using a Support Vector Machine (SVM) is:
the training sample set is $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples and $y_1, y_2, \ldots, y_N$ are the air quality values corresponding to the air quality factors;
there exists a hyperplane
Figure FDA0002872879300000011
such that the samples can be correctly classified, where $X = \{x_1, x_2, \ldots, x_N\}$ is the set of $N$ samples,
Figure FDA0002872879300000012
maps $X$ to the high-dimensional feature space
Figure FDA0002872879300000013
$\omega$ is the normal vector, which determines the direction of the hyperplane, and $b$ is the displacement term, which serves as an optimization variable; so that the samples can be accurately classified by the hyperplane, a slack variable $\varepsilon$ is introduced, and the objective function is
Figure FDA0002872879300000014
$\varepsilon_i \ge 0$, $i = 1, \ldots, N$, are slack variables; each sample $x_i$ in the training sample set corresponds to a slack variable $\varepsilon_i$, which characterizes the degree to which the sample fails to satisfy the constraint
Figure FDA0002872879300000015
a Gaussian kernel function replaces the inner product operation in the high-dimensional space,
Figure FDA0002872879300000016
where $\sigma > 0$ is the Gaussian kernel bandwidth and $\alpha_i$ is the Lagrange multiplier; $b$ is calculated from the sample values and all obtained $b$ values are averaged, and after training is finished the support vector classification prediction function is obtained:
Figure FDA0002872879300000017
the current air quality factor monitoring values are substituted into the air quality support vector classification prediction function to obtain the air quality predicted value $P_{SVM}$.
3. A method as claimed in claim 1, wherein the process of calculating the air quality prediction using random forest RF is:
the training data set (TDS) contains $N$ samples and is expressed as $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples and $y_1, y_2, \ldots, y_N$ are the air quality values corresponding to the air quality factors;
$B$ bootstrap sample sets are drawn at random by bootstrap sampling with replacement, and $B$ regression trees are built from the $B$ bootstrap sample sets; the samples not drawn in each bootstrap sampling form $B$ sets of out-of-bag (OOB) data, which serve as test samples for the random forest;
the total number of sample variables is $M$, and $m$ variables are randomly selected at the nodes of each regression tree as candidate split variables; each regression tree branches recursively from top to bottom, and the number of leaf nodes $t$ serves as the termination condition for the growth of the regression tree;
the objective function for sample division is the minimum sum of squared errors, i.e.:
Figure FDA0002872879300000021
Figure FDA0002872879300000022
where $y_i$ is the actual value of the air quality in the out-of-bag data,
Figure FDA0002872879300000023
is the random forest's prediction for the out-of-bag data, and $E_{OOB}$ is the squared prediction error on the out-of-bag (OOB) data;
the $B$ generated trees form the random forest model;
the current air quality factor monitoring values are substituted into the random forest model to obtain the air quality predicted value $P_{RF}$.
4. The method according to claim 1, wherein the process of calculating the air quality prediction value using the decision tree DT is:
the training data set (TDS) contains $N$ samples and is expressed as $TDS = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_1, x_2, \ldots, x_N$ are air quality factor samples and $y_1, y_2, \ldots, y_N$ are the sample values of the corresponding air quality factors;
the objective function of the decision tree for sample division is the minimum sum of squared errors, namely:
Figure FDA0002872879300000024
where $j$ denotes the $j$-th variable of each sample, the total number of sample variables is $M$, $s$ denotes a split point of the $j$-th variable, $R_1(j, s)$ denotes the left region of the split, $R_2(j, s)$ denotes the right region of the split, and $c_1$ and $c_2$ are the optimal output values of regions $R_1(j, s)$ and $R_2(j, s)$; each feature $j$ of the samples is traversed, every possible split point $s$ of each feature is tried, the split with the smallest sum of squared errors is selected, and the optimal output values $c_1$ and $c_2$ are determined; after training is finished the air quality prediction regression tree is obtained;
the current air quality factor monitoring values are substituted into the air quality prediction regression tree to obtain the air quality predicted value $P_{DT}$.
5. The method of claim 1, further characterized in that the Support Vector Machine (SVM) algorithm, the random forest algorithm (RF), and the Decision Tree (DT) algorithm are performed in any order.
6. The method according to claim 1, wherein the three algorithms are weighted based on historical prediction, specifically:
historical data are fed to the three algorithms to obtain their predicted values; each predicted value is subtracted from the corresponding historical air quality value and the difference is squared, giving the mean squared errors $E_{SVM}$, $E_{RF}$, $E_{DT}$; the weights $W_{SVM}$, $W_{RF}$, $W_{DT}$ of the three algorithms are then calculated:
Figure FDA0002872879300000031
the values
Figure FDA0002872879300000032
are normalized to obtain $W_{SVM}$, $W_{RF}$, $W_{DT}$, that is,
Figure FDA0002872879300000033
the final air quality predicted value is calculated as: $P = W_{SVM} \cdot P_{SVM} + W_{RF} \cdot P_{RF} + W_{DT} \cdot P_{DT}$
7. The method of claim 1, wherein the standard value of the atmospheric pollution index is 100.
8. An air quality monitoring and warning system based on SVM-RF-DT combination, which is capable of implementing the method of any of the preceding claims 1-7.
9. An air quality monitoring and warning device based on SVM-RF-DT combination, the device comprising a processor, a memory, which is capable of implementing the method of any of the preceding claims 1-7.
10. A storage medium having stored thereon a computer program enabling the method of any of the preceding claims 1-7.
CN202011624426.6A 2020-12-30 2020-12-30 SVM-RF-DT combination-based air quality monitoring and alarming method, system, device and medium Active CN112733903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011624426.6A CN112733903B (en) 2020-12-30 2020-12-30 SVM-RF-DT combination-based air quality monitoring and alarming method, system, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011624426.6A CN112733903B (en) 2020-12-30 2020-12-30 SVM-RF-DT combination-based air quality monitoring and alarming method, system, device and medium

Publications (2)

Publication Number Publication Date
CN112733903A true CN112733903A (en) 2021-04-30
CN112733903B CN112733903B (en) 2023-11-17

Family

ID=75609660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011624426.6A Active CN112733903B (en) 2020-12-30 2020-12-30 SVM-RF-DT combination-based air quality monitoring and alarming method, system, device and medium

Country Status (1)

Country Link
CN (1) CN112733903B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117408163A (en) * 2023-12-11 2024-01-16 山西潞安环保能源开发股份有限公司 Prediction device for coal and gas outburst

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180189667A1 (en) * 2016-12-29 2018-07-05 Intel Corporation Entropy-based weighting in random forest models
CN109408774A (en) * 2018-11-07 2019-03-01 上海海事大学 The method of prediction sewage effluent index based on random forest and gradient boosted tree
CN110363347A (en) * 2019-07-12 2019-10-22 江苏天长环保科技有限公司 The method of neural network prediction air quality based on decision tree index
AU2020100709A4 (en) * 2020-05-05 2020-06-11 Bao, Yuhang Mr A method of prediction model based on random forest algorithm
CN111985571A (en) * 2020-08-26 2020-11-24 国网湖南省电力有限公司 Low-voltage intelligent monitoring terminal fault prediction method, device, medium and equipment based on improved random forest algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
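程蓉 (Cheng Rong); 钱雪忠 (Qian Xuezhong): "Local air quality prediction model based on neural random forest" (基于神经随机森林的局部空气质量预测模型), 计算机工程与设计 (Computer Engineering and Design), no. 07, pages 166-174 *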

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117408163A (en) * 2023-12-11 2024-01-16 山西潞安环保能源开发股份有限公司 Prediction device for coal and gas outburst
CN117408163B (en) * 2023-12-11 2024-04-05 山西潞安环保能源开发股份有限公司 Prediction device for coal and gas outburst

Also Published As

Publication number Publication date
CN112733903B (en) 2023-11-17

Similar Documents

Publication Publication Date Title
Rusdah et al. XGBoost in handling missing values for life insurance risk prediction
CN111178611B (en) Method for predicting daily electric quantity
Wang et al. Intelligent multivariable air-quality forecasting system based on feature selection and modified evolving interval type-2 quantum fuzzy neural network
CN110866030A (en) Database abnormal access detection method based on unsupervised learning
Garg et al. Comparative analysis of various data mining techniques on educational datasets
CN114548592A (en) Non-stationary time series data prediction method based on CEMD and LSTM
Tembusai et al. K-nearest neighbor with K-fold cross validation and analytic hierarchy process on data classification
You et al. A variable relevant multi-local PCA modeling scheme to monitor a nonlinear chemical process
Zhenjie et al. A novel nonlinear causal inference approach using vector‐based belief rule base
CN112711912A (en) Air quality monitoring and alarming method, system, device and medium based on cloud computing and machine learning algorithm
Sun et al. Knowledge-guided bayesian support vector machine for high-dimensional data with application to analysis of genomics data
CN114764682A (en) Rice safety risk assessment method based on multi-machine learning algorithm fusion
CN112733903B (en) SVM-RF-DT combination-based air quality monitoring and alarming method, system, device and medium
Łęski Neuro-fuzzy system with learning tolerant to imprecision
Gunawan et al. C4.5, K-Nearest Neighbor, Naïve Bayes, and Random Forest Algorithms Comparison to Predict Students' on TIME Graduation
CN110837853A (en) Rapid classification model construction method
CN111062118B (en) Multilayer soft measurement modeling system and method based on neural network prediction layering
CN114297582A (en) Modeling method of discrete counting data based on multi-probe locality sensitive Hash negative binomial regression model
CN113657441A (en) Classification algorithm based on weighted Pearson correlation coefficient and combined with feature screening
Devanta Optimization of the K-Means Clustering Algorithm Using Davies Bouldin Index in Iris Data Classification
CN111488903A (en) Decision tree feature selection method based on feature weight
CN114139634A (en) Multi-label feature selection method based on paired label weights
Martinez-Zeron et al. Method to improve airborne pollution forecasting by using ant colony optimization and neuro-fuzzy algorithms
CN113221966A (en) Differential privacy decision tree construction method based on F _ Max attribute measurement
Sitepu et al. Analysis of Fuzzy C-Means and Analytical Hierarchy Process (AHP) Models Using Xie-Beni Index

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant