CN111985567B - Automatic pollution source type identification method based on machine learning
- Publication number
- CN111985567B (application CN202010846058.3A)
- Authority
- CN
- China
- Prior art keywords
- pollution
- data
- feature
- model
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01D—MEASURING NOT SPECIALLY ADAPTED FOR A SPECIFIC VARIABLE; ARRANGEMENTS FOR MEASURING TWO OR MORE VARIABLES NOT COVERED IN A SINGLE OTHER SUBCLASS; TARIFF METERING APPARATUS; MEASURING OR TESTING NOT OTHERWISE PROVIDED FOR
- G01D21/00—Measuring or testing not otherwise provided for
- G01D21/02—Measuring two or more variables by means not covered by a single other subclass
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/06—Investigating concentration of particle suspensions
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/0004—Gaseous mixtures, e.g. polluted air
- G01N33/0009—General constructional details of gas analysers, e.g. portable test equipment
- G01N33/0027—General constructional details of gas analysers, e.g. portable test equipment concerning the detector
- G01N33/0036—General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
- G01N33/0037—NOx
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/0004—Gaseous mixtures, e.g. polluted air
- G01N33/0009—General constructional details of gas analysers, e.g. portable test equipment
- G01N33/0027—General constructional details of gas analysers, e.g. portable test equipment concerning the detector
- G01N33/0036—General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
- G01N33/0039—O3
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/0004—Gaseous mixtures, e.g. polluted air
- G01N33/0009—General constructional details of gas analysers, e.g. portable test equipment
- G01N33/0027—General constructional details of gas analysers, e.g. portable test equipment concerning the detector
- G01N33/0036—General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
- G01N33/004—CO or CO2
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/0004—Gaseous mixtures, e.g. polluted air
- G01N33/0009—General constructional details of gas analysers, e.g. portable test equipment
- G01N33/0027—General constructional details of gas analysers, e.g. portable test equipment concerning the detector
- G01N33/0036—General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
- G01N33/0042—SO2 or SO3
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/0004—Gaseous mixtures, e.g. polluted air
- G01N33/0009—General constructional details of gas analysers, e.g. portable test equipment
- G01N33/0062—General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/20—Air quality improvement or preservation, e.g. vehicle emission control or emission reduction by using catalytic converters
Abstract
A method for automatically identifying pollution source types based on machine learning. The method comprises the following steps: based on environmental monitoring data, time, and geographic information, identifying the occurrence of pollution problems and judging the type of pollution source through analysis, and establishing a typical pollution case library; based on a machine learning algorithm, extracting data features using the data in the case library as samples, and developing a pollution source type identification algorithm model; monitoring real-time monitoring data with the algorithm model, marking abnormal data as a pollution event when they are found, and further identifying the type of source causing the pollution, thereby achieving online identification of pollution source emissions with automatic alarms; reviewing the model's identification result, remotely or on site, according to the alarm information; if the pollution problem is confirmed, handling it and adding it to the typical case library for continuous optimization of the algorithm model; and if the identification result is inaccurate, removing the pollution event mark. Because the method is based on monitoring data from gridded micro stations, small stations, and the like, more data can be brought into the data source and the model can be further optimized.
Description
Technical Field
The invention relates to the field of atmospheric environment monitoring, and in particular to a method for automatically identifying pollution source types based on machine learning.
Background
In atmospheric environment monitoring, traditional monitoring relies on standard air quality stations. Because these stations are costly, few are deployed and the volume of data they generate is small, so fine-scale pollution problems are difficult to reflect accurately. Micro stations based on the sensor method are inexpensive and can therefore be deployed at scale, yielding monitoring data with high spatial and temporal resolution across the monitored area. The monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity; the spatial resolution reaches 1 km × 1 km and the temporal resolution is 1 h. This mass of environmental monitoring data supports establishing the correspondence between pollution sources and air quality. Through manual analysis, existing pollution problems can be found from the data features and the source type of the air pollution can be judged, including dust sources, mobile sources, coal-fired sources, catering oil smoke sources, industrial sources, and so on. This narrows the scope of investigation, improves its accuracy, raises supervision efficiency, and saves manpower in on-site investigation of environmental problems.
However, finding pollution problems and source types from massive monitoring data currently requires a great deal of manpower and time and depends heavily on the technical level and experience of the analysts. The overall process is inefficient and slow, is limited by staff expertise, and struggles to support environmental management effectively. A calculation method that can identify the type of pollution source efficiently, quickly, and stably is therefore needed.
At present, existing patented pollution source identification techniques are based on hotspot grids rather than real-time monitoring data. For example, Chinese patent CN110147383A, titled "Method and apparatus for determining pollution source type", discloses a method that determines the pollution source type of a polluted grid by setting a preset concentration value and a preset concentration difference and combining them with wind speed, wind direction, and the pollution sources present in the grid. Chinese patent CN110006799A, titled "A classification method of hotspot grid pollution types", discloses a method that classifies atmospheric hotspot grid pollution types by how atmospheric pollutant concentrations change over time. These techniques have the following disadvantages: first, the temporal and spatial resolution of hotspot grid data is low, so pollution source identification relies mainly on historical data, cannot guide pollution tracing in real time, and yields results that are difficult to verify scientifically and effectively; second, satellite inversion data are constrained by meteorological conditions such as cloud cover, so accuracy cannot be guaranteed and effective tracing cannot be achieved; third, hotspot grid data reflect the air quality of the grid area rather than of the surroundings of the pollution source, so the type of pollution source is difficult to distinguish from data features; fourth, the identification approach is simplistic and uses few feature parameters. Pollution sources comprise at least six types with different pollution characteristics, and these techniques cannot describe those characteristics accurately.
Disclosure of Invention
To solve the above problems, the present invention provides a method for automatically identifying pollution source types based on machine learning. The method uses the parameters of each pollutant together with time and spatial coordinate information, and the spatial information takes part in the model computation that identifies the pollution process. That is, the differences between the data of the target grid and those of the surrounding grids are considered, rather than only the trend of the data over a time series.
To achieve this purpose, the invention provides a method for automatically identifying pollution source types based on machine learning, which mainly comprises the following steps:
Step one: based on monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, together with time and geographic information, identify the occurrence of pollution problems and judge the type of pollution source through (expert) analysis, and establish a typical pollution case library;
Step two: based on a machine learning algorithm, extract data features using the mass data of the case library as samples, and develop a pollution source type identification algorithm model;
Step three: monitor the real-time monitoring data with the model; when abnormal data are found, mark them as a pollution event and further identify the type of source causing the pollution, achieving online identification of pollution source emissions with automatic alarms;
Step four: according to the alarm information, an expert reviews the model's identification result or inspects it on site; if the pollution problem is confirmed, it is handled and the event is added to the typical case library for continuous optimization of the algorithm model; if the identification result is inaccurate, the pollution event mark is removed.
The identification algorithm adopted by the method starts from monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, together with time and geographic information. Through analysis (manual judgment by experts and the like can be used), the occurrence of pollution problems is identified, the type of pollution source is judged, and a typical pollution case library is established. Then, based on a machine learning algorithm, data features are extracted using the mass data of the case library as samples, and a pollution source type identification algorithm model is developed. The model monitors the real-time monitoring data; when abnormal data are found, they are marked as a pollution event and the type of source causing the pollution is further identified, achieving online identification of pollution source emissions with automatic alarms. Furthermore, experts can review the model's identification result or inspect it on site according to the alarm information. If the pollution problem is confirmed, it is handled and the event is added to the typical case library for continuous optimization of the algorithm model; if the identification result is inaccurate, the pollution event mark is removed.
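The monitor, alarm, and review loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `monitor` and `review`, the dictionary layout of an event, and the PM10 threshold used as a stand-in detector are all assumptions introduced here.

```python
def monitor(readings, is_abnormal, classify_source):
    """Step three: mark abnormal readings as pollution events (alarms)."""
    events = []
    for reading in readings:
        if is_abnormal(reading):
            events.append({
                "reading": reading,
                "source_type": classify_source(reading),
                "marked": True,  # pollution event mark
            })
    return events


def review(event, confirmed, case_library):
    """Step four: expert review of one alarm."""
    if confirmed:
        # Confirmed events are added to the case library for retraining.
        case_library.append((event["reading"], event["source_type"]))
    else:
        # Inaccurate identification: remove the pollution event mark.
        event["marked"] = False
    return event


# Toy stand-ins for the trained model (hypothetical values only).
case_library = []
readings = [{"PM10": 60}, {"PM10": 420}]
events = monitor(readings, lambda r: r["PM10"] > 150, lambda r: "dust")
review(events[0], confirmed=True, case_library=case_library)
```

In a real deployment the two callables would be backed by the trained model, and the case library would be persistent storage rather than a list.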
Preferably, the training set of the algorithm model also contains pollution-free time-series data. After the pollutant data of a grid are obtained, the proposed 38 features are calculated and fed into the trained mathematical model, which outputs the classification result for the grid's pollution type.
By means of the above technical scheme, the invention achieves the following advantages over the prior art:
(1) Data source: compared with the prior art based on hotspot grid data, the method is based on monitoring data from gridded micro stations, small stations, and the like, and can bring more data into the data source;
(2) Algorithm model: the technical scheme adopts machine learning algorithms, specifically including random forests, neural networks, support vector machines, gradient boosting machines, and the like, and adopts a combined model that includes sub-models based on curve shape (time-series shape), automatic feature extraction by deep neural networks, and the like;
(3) Features: the prior art offers few selectable features (single-grid judgment). Through repeated research by the inventors, the algorithm of the invention can include 38 feature values in total, achieves comparison and judgment across multiple points, and further takes data such as surrounding pollution sources into account as feature values. This can improve the accuracy of pollution type identification, allows local sources to be distinguished from external sources, and overcomes the one-sidedness of single-grid analysis;
(4) Continuous model optimization: the prior art is based on historical data with a fixed algorithm. In the technical scheme of the invention, a first-generation algorithm model is generated from historical data and applied to subsequent monitoring data, where new cases are found; after review by technicians, the new cases are automatically added to the case library, so the model can be further optimized;
(5) Compared with the client-based prior art, the method can be based on a cloud server. This reduces client cost, builds up the advantages of big data on the server side, and collects a large number of cases from different locations, so the advantages of the technical scheme are fully realized and the accuracy of the algorithm's judgments is further improved.
Drawings
Fig. 1 is a flowchart illustrating steps of a method for automatically identifying a pollution source type based on machine learning according to the present invention.
Detailed Description
For a better understanding of the objects, aspects and advantages of the present invention, reference is made to the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
The hotspot grid referred to in the invention is a technical unit defined by the environmental protection authorities: the Beijing-Tianjin-Hebei region and the surrounding key area of the "2+26" cities are divided into grids of 3 km × 3 km. Various data, such as satellite remote sensing, ground-based air quality observations, and meteorological observations, are integrated; remote sensing image recognition based on cognition and multi-source data fusion is applied; the average PM2.5 concentration of each grid is then determined by satellite remote sensing inversion of atmospheric pollutants; and the key supervision areas among the hotspot grids are determined by ranking the concentration values.
Referring to Fig. 1, which shows a flowchart of the identification algorithm used in the present invention, the main concepts are briefly described as follows:
1. Building the typical case library
The typical case library of the invention is a collection of cases describing pollution events that have been manually reviewed (by experts) on the basis of environmental monitoring data. The information contained in each case includes at least: the start and end times of the pollution event, the names and coordinates of the affected monitoring points, the types of affected parameters, the local meteorological conditions at the time, and the type of pollution source judged by the expert. The parameter types may include PM10, PM2.5, SO2, NO2, CO, O3, and VOC; the meteorological conditions include wind direction, wind speed, temperature, and humidity; and the pollution source types include dust sources, mobile sources, coal-fired sources, catering oil smoke sources, industrial sources, and others.
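One possible record layout for a case in the library is sketched below. The field list follows the paragraph above; the class name `PollutionCase`, the field names, the coordinate convention, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Source types listed in the text; the string spellings are assumed.
SOURCE_TYPES = {"dust", "mobile", "coal-fired",
                "catering oil smoke", "industrial", "other"}


@dataclass
class PollutionCase:
    start_time: datetime
    end_time: datetime
    site_name: str
    coordinates: tuple         # (longitude, latitude), an assumed convention
    affected_parameters: list  # e.g. ["PM10", "PM2.5"]
    meteorology: dict          # wind direction, wind speed, temperature, humidity
    source_type: str           # expert-judged label

    def __post_init__(self):
        if self.end_time < self.start_time:
            raise ValueError("end_time precedes start_time")
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {self.source_type}")

    def duration_hours(self):
        return (self.end_time - self.start_time).total_seconds() / 3600


# Hypothetical example values, not taken from any real case.
case = PollutionCase(
    start_time=datetime(2020, 1, 1, 8),
    end_time=datetime(2020, 1, 1, 11),
    site_name="Site A",
    coordinates=(116.4, 39.9),
    affected_parameters=["PM10"],
    meteorology={"wind_direction": "NW", "wind_speed": 2.5,
                 "temperature": 3.0, "humidity": 40.0},
    source_type="dust",
)
```

Validating the label against a fixed set at construction time keeps expert-entered labels consistent before they reach model training.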
2. Data characterization
The data features extracted by the algorithm model of the invention comprise:
(1) standard deviation of the first derivative of PM2.5;
(2) standard deviation of the first derivative of CO;
(3) standard deviation of the first derivative of SO2;
(4) sum of squares of the first 10 first-order difference series of PM2.5;
(5) maximum value of CO;
(6) primary pollutant;
(7) skewness of AQI;
(8) first-order autocorrelation coefficient of PM10;
(9) quartiles of CO;
(10) first-order autocorrelation coefficient of PM2.5;
(11) coefficient of variation of AQI;
(12) coefficient of variation of CO;
(13) standard deviation of the first derivative of PM10;
(14) sum of squares of the first 10 first-order difference series of CO;
(15) sum of AQI;
(16) sum of SO2 added to the sum of CO;
(17) skewness of PM10;
(18) maximum value of O2;
(19) sum of SO2;
(20) median of CO;
(21) sum of the first 10 first-order difference series of AQI;
(22) kurtosis of NO2;
(23) sum of squares of the first 10 first-order difference series of PM10;
(24) first-order autocorrelation coefficient of AQI;
(25) first-order difference series of CO;
(26) first-order autocorrelation coefficient of CO;
(27) first-order difference series of SO2;
(28) sum of CO;
(29) median of SO2;
(30) kurtosis of PM2.5;
(31) first-order difference series of PM2.5;
(32) sum of squares of the first 10 first-order difference series of NO2;
(33) kurtosis of SO2;
(34) hour at which the AQI maximum occurs;
(35) coefficient of variation of SO2;
(36) correlation coefficient of PM10 and CO;
(37) correlation coefficient of SO2 and CO;
(38) correlation coefficient of NO2 and CO.
To a certain extent, these 38 features reflect how each pollutant changes (rising, falling, and so on) and the (cross-)correlation between the pollutant time series, and they characterize the pollution type of each station in different periods from a statistical perspective.
For example, feature 6 (NO2_diff1_acf10) represents the degree of variation of the NO2 series, feature 11 (distance_dtw) represents the similarity between the time series of different pollutants, and feature 17 (co-quantile) represents the frequency distribution of CO pollution, which can indicate to some extent whether a case involves motor vehicle pollution.
However, because multivariate time series vary in complex ways and are correlated with the multivariate time series of surrounding sites, it is difficult to summarize and select by hand the time-series features that correspond to each pollution type (or case). The invention therefore uses a machine learning algorithm to combine the 38 features, automatically weighted on the basis of the training data in the case library, into a data-driven prediction model.
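Several of the statistics named in the feature list can be computed as in the sketch below. The source does not give formulas, so the definitions here are assumptions: the first derivative is approximated by the first difference of the hourly series, population (not sample) moments are used, and the autocorrelation is taken at lag 1. All function names are hypothetical.

```python
import statistics as st


def diff(x):
    """First difference, used here as a discrete first derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def diff_std(x):
    """Standard deviation of the first difference."""
    return st.pstdev(diff(x))


def coef_variation(x):
    """Coefficient of variation: standard deviation over mean."""
    return st.pstdev(x) / st.fmean(x)


def _central_moment(x, k):
    m = st.fmean(x)
    return sum((v - m) ** k for v in x) / len(x)


def skewness(x):
    return _central_moment(x, 3) / _central_moment(x, 2) ** 1.5


def kurtosis(x):
    """Non-excess kurtosis (a normal distribution gives 3)."""
    return _central_moment(x, 4) / _central_moment(x, 2) ** 2


def acf1(x):
    """Lag-1 autocorrelation coefficient."""
    m = st.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    return num / sum((v - m) ** 2 for v in x)


def pearson(x, y):
    """Correlation coefficient between two pollutant series."""
    mx, my = st.fmean(x), st.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / (sx * sy) ** 0.5


# Toy hourly series (illustrative numbers only).
co = [1.0, 2.0, 3.0, 4.0, 5.0]
pm10 = [2.0, 4.0, 6.0, 8.0, 10.0]
features = {
    "co_diff_std": diff_std(co),
    "co_cv": coef_variation(co),
    "co_acf1": acf1(co),
    "pm10_co_corr": pearson(pm10, co),
}
```

Constant series would make the variance-based ratios divide by zero, so a production version would need to guard against flat sensor readings.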
3. Model algorithm description/calculation formula
The technical scheme establishes a multi-label classification model for the existing cases and for cases added later, because compound pollution formed by a combination of several pollution types may exist at the same time and place. An example is shown in Table 1 (not all pollution types are included). Each row corresponds to one case, that is, one pollution event. X denotes the selected feature values; X1, X2, X3, X4, X5, and X6 are the feature values of the corresponding cases; Y1, Y2, Y3, Y4, and Y5 are different pollution types, called labels in the multi-label model, where 1 indicates that the case belongs to the type and 0 indicates that it does not. The model adopts a combination strategy; the main strategies include Binary Relevance, Classifier Chains, Nested Stacking, and the like.
X | Y1 | Y2 | Y3 | Y4 | Y5 |
X1 | 1 | 0 | 0 | 0 | 0 |
X2 | 0 | 1 | 1 | 0 | 0 |
X3 | 0 | 0 | 0 | 1 | 0 |
X4 | 0 | 0 | 0 | 0 | 1 |
X5 | 0 | 1 | 0 | 0 | 0 |
X6 | 1 | 0 | 1 | 0 | 0 |
Table 1. Multi-label model example
The invention mainly uses the binary relevance strategy. Its principle is to build one binary classifier per label; each binary classification is a simple problem, namely whether or not the case belongs to that type. As shown in Table 2, the model is divided into five binary classifiers, which are then combined. At prediction time each label is predicted independently, without considering dependencies between labels, and the results are combined into the multi-label target. Binary relevance has computational complexity linear in the number of labels and is easily parallelized: the binary classifiers for all labels can be built at the same time, which speeds up computation. In addition, machine learning models (e.g., random forests) under default parameter configurations tend to overlook the less frequent pollution types in the training samples when predicting. In this algorithm, the cutoff parameter of each binary classifier is adjusted according to the proportion of each pollution type in the training samples, so that the optimized model predicts every pollution type in a balanced way and overall prediction performance improves.
Table 2. Binary relevance strategy example
When a binary classifier is built independently for each label, by default the same machine learning algorithm is used to model every binary classifier; candidate algorithms include random forests, neural networks, support vector machines, gradient boosting machines, and the like. With further study, different feature value combinations and different machine learning algorithms can be tried when modeling each pollution type, and the best feature combination and algorithm are selected to build that binary classifier. Finally, the different binary classifiers are combined by binary relevance into an optimal multi-label model. When a new pollution event is to be predicted, its pollution type can be judged comprehensively from its feature values.
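A minimal sketch of binary relevance with a per-label cutoff follows. The base learner is a toy centroid-distance scorer standing in for the random forest, SVM, or other algorithm named in the text, and setting each cutoff equal to the label's share of positive training samples is an assumed rule: the source says only that the cutoff is adjusted based on each pollution type's proportion. The class names are hypothetical, and every label is assumed to have both positive and negative training samples.

```python
from math import dist


class CentroidScorer:
    """Toy binary learner: score in [0, 1], higher means closer
    to the centroid of the positive training samples."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos_c = [sum(c) / len(pos) for c in zip(*pos)]
        self.neg_c = [sum(c) / len(neg) for c in zip(*neg)]
        return self

    def score(self, x):
        d_pos, d_neg = dist(x, self.pos_c), dist(x, self.neg_c)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total


class BinaryRelevance:
    """One independent binary model per label, each with its own cutoff."""

    def fit(self, X, Y):
        self.models, self.cutoffs = [], []
        for j in range(len(Y[0])):
            y = [row[j] for row in Y]
            self.models.append(CentroidScorer().fit(X, y))
            # Assumed rule: a rarer label gets a lower cutoff, so it is
            # not drowned out by the more frequent labels.
            self.cutoffs.append(sum(y) / len(y))
        return self

    def predict(self, x):
        return [int(m.score(x) >= c)
                for m, c in zip(self.models, self.cutoffs)]


# Toy data: two features, two labels; the third case carries both labels.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
Y = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 1]]
model = BinaryRelevance().fit(X, Y)
```

Because the per-label models share no state, the loop in `fit` could be run in parallel, which is the property the text highlights.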
The invention builds the model with three algorithms: support vector machine, random forest, and XGBoost. Briefly, a support vector machine (SVM) is a generalized linear classifier that performs binary classification of data by supervised learning, and it can be used for both classification and regression. Random forest is an algorithm that integrates multiple trees through ensemble learning; it is a nonlinear classifier and can therefore mine complex nonlinear interdependencies between variables. The basic unit of a random forest is the decision tree, a basic classifier whose main job is to select features on which to split the data set until the data are finally assigned one of two class labels; the result has a tree structure. A random forest is obtained by constructing many decision trees; at prediction time each tree gives a classification result, the trees vote, and the final classification is decided by majority rule. XGBoost is also a decision-tree-based machine learning algorithm. Unlike random forest, in which each decision tree is constructed independently, XGBoost grows the model by repeatedly adding trees and repeatedly splitting on features; each added tree learns a new function that fits the residual of the previous prediction, until a stopping condition is reached, such as the number of trees to be built. At prediction time, each sample falls into one leaf node of each tree according to its features, each leaf node corresponds to a score, and the scores from all trees are summed to give the predicted value for the sample.
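The two ensemble ideas described above can be illustrated with a small sketch. It is not XGBoost itself, which additionally uses second-order gradients and regularization: the boosting part is plain squared-error boosting with one-split trees (stumps) on a single feature, and all names, the learning rate, and the data are assumptions chosen for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]  # (threshold, left score, right score)


def majority_vote(tree_votes: Sequence[str]) -> str:
    """Random-forest style decision: the class most trees voted for."""
    return Counter(tree_votes).most_common(1)[0][0]


def fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Stump:
    """One-split tree that best fits the residuals (least squares).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for thr in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        if not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, (thr, lm, rm))
    return best[1]


def leaf_score(stump: Stump, x: float) -> float:
    thr, left, right = stump
    return left if x <= thr else right


def boost(xs: Sequence[float], ys: Sequence[float], n_trees: int,
          learning_rate: float = 0.5) -> List[Stump]:
    """Add trees one at a time; each fits the residual of the current prediction."""
    trees: List[Stump] = []
    preds = [0.0] * len(xs)
    for _ in range(n_trees):  # stopping condition: number of trees
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * leaf_score(stump, x) for p, x in zip(preds, xs)]
    return trees


def boosted_predict(trees: Sequence[Stump], x: float,
                    learning_rate: float = 0.5) -> float:
    """Prediction = sum of the leaf scores the sample reaches in every tree."""
    return sum(learning_rate * leaf_score(t, x) for t in trees)


xs, ys = [1, 2, 3, 4], [0, 0, 1, 1]
one_tree = boost(xs, ys, n_trees=1)
five_trees = boost(xs, ys, n_trees=5)
print(majority_vote(["dust", "dust", "coal"]))  # dust
print(boosted_predict(one_tree, 4))             # 0.5
print(boosted_predict(five_trees, 4))           # 0.96875
```

Each added stump shrinks the remaining residual, so the prediction for x = 4 moves from 0.5 toward the target value 1 as trees are added.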
Because the numbers of samples of the different pollution types are not necessarily balanced, which affects model accuracy to some extent, the method addresses this point when the model is built: adjusting the model parameters mitigates the effect of unbalanced samples, and the parameters can be adjusted accordingly as cases continue to be added.
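The patent does not state which parameters are adjusted beyond the cutoff described earlier. As one hedged possibility, the sketch below computes inverse-frequency class weights using the formula n_samples / (n_classes * count), so that rare pollution types receive larger weights; the function name and the label counts are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def balanced_class_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Made-up case library: 6 dust cases, 3 vehicle cases, 1 fireworks case.
case_labels = ["dust"] * 6 + ["vehicle"] * 3 + ["fireworks"]
weights = balanced_class_weights(case_labels)
print(weights)
```

Because the weights are derived from the label counts, recomputing them after new cases are added to the library adjusts the balance automatically.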
4. Cloud-server-based operation
During development of the algorithm model proposed by the invention, the more cases are available, the higher the prediction accuracy of the developed model, so environmental monitoring data from multiple cities are required; once development is complete, the model can be applied to different cities. The scheme therefore runs the model in the cloud, which makes it possible to use as much data as possible, improves model accuracy, and facilitates wide application later.
The following specific examples are intended to illustrate the invention, but are not intended to limit the scope of the invention.
In this embodiment, the machine-learning-based method for automatically identifying pollution source types uses micro-stations to obtain monitoring data with high spatial and temporal resolution in the monitoring area; the monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, and concentration features based on change over time and on geographic information are proposed for classification. As shown in Fig. 1, the method mainly comprises the following steps:
obtaining, through micro-stations, monitoring data with high spatial and temporal resolution in the monitoring area, such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, together with time and geographic information;
establishing a typical pollution case library based on expert judgment;
developing, based on machine learning algorithms, a pollution source type identification algorithm model targeted at the data features of pollution source emissions;
using the algorithm model to mark abnormal data in the real-time monitoring data, identify the pollution source type, and raise an alarm automatically;
having experts review the model's identification result against the alarm information to determine whether it is accurate; if the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed.
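The expert review step that closes the loop can be sketched as below. The class and field names are hypothetical, the site names and times are placeholders, and the rule "the expert's type equals the predicted type" is a simplification of the review described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PollutionEvent:
    site: str
    start: str
    end: str
    predicted_type: str
    flagged: bool = True  # set when the model raises the alarm


@dataclass
class CaseLibrary:
    cases: List[PollutionEvent] = field(default_factory=list)


def review_event(event: PollutionEvent, expert_type: Optional[str],
                 library: CaseLibrary) -> bool:
    """Apply the expert's verdict; expert_type is None if no source was found."""
    if expert_type == event.predicted_type:
        library.cases.append(event)  # confirmed: becomes a new training case
        return True
    event.flagged = False            # rejected: remove the pollution event flag
    return False


library = CaseLibrary()
confirmed = PollutionEvent("site A", "t0", "t1", predicted_type="dust")
rejected = PollutionEvent("site B", "t2", "t3", predicted_type="enterprise")
print(review_event(confirmed, "dust", library))  # True
print(review_event(rejected, None, library))     # False
print(len(library.cases), rejected.flagged)      # 1 False
```

Only confirmed events reach the case library, so retraining on the library uses expert-verified labels.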
In the following embodiments, the classification of the high spatial and temporal resolution site pollution types in the monitored area comprises the following steps:
1. Obtain, through micro-stations, monitoring data with high spatial and temporal resolution in the monitoring area, such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, together with time and geographic information.
Because different pollution types produce different patterns of change in pollutant concentration, various features are extracted from the time-series pollution data based on basic statistics of the data. Some geographic information, emission inventory information, and information obtained from expert judgment are then converted into corresponding feature variables, such as pollution sources around the site, road network density around the site, and time-series distance. In total there are 140 features.
The features computed for each pollutant, and how some of them are calculated, are as follows:
For each of the 6 pollutants (PM10, PM2.5, SO2, NO2, O3, CO) and for AQI, grouped by case:
diff1_acf10: sum of squares of the first 10 autocorrelation coefficients of the first-differenced series;
diff1_acf1: first autocorrelation coefficient of the first-differenced series;
x_acf1: first autocorrelation coefficient;
x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients;
diff2x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series;
std1st_der: standard deviation of the first derivative;
mean, sum, maximum, quartiles, coefficient of variation, standard deviation, median, variance, skewness, and kurtosis of each of the 6 pollutants and of AQI, grouped by case, and the hour at which AQI reaches its maximum; correlation coefficients between the six pollutants and AQI; the primary pollutant.
Pollution sources around the site: the numbers of pollution sources of each type around each site are obtained from information on pollution sources around the site and from the emission inventory, and are used as feature values;
Road network density around the site: to account for the influence of motor vehicle emissions on pollutant data, the road network density around the site is obtained from the surrounding road network using geographic information system (GIS) technology and is used as a feature value;
Time-series distance feature: the similarity between the time series of different pollutants, measured with the dynamic time warping (DTW) distance.
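The autocorrelation-based features and the DTW distance listed above can be sketched in plain Python as follows. Several choices are assumptions not stated in the patent: the first derivative is approximated by the first difference, the autocorrelation uses the usual biased estimator, DTW uses the absolute difference as local cost with no window constraint, and the hourly series is made-up data. The series must be non-constant and have at least 12 points for the lag-10 feature.

```python
import statistics
from typing import Dict, List, Sequence


def difference(x: Sequence[float]) -> List[float]:
    """First-order difference of a series."""
    return [b - a for a, b in zip(x, x[1:])]


def autocorrelations(x: Sequence[float], max_lag: int) -> List[float]:
    """Autocorrelation coefficients for lags 1..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
            for k in range(1, max_lag + 1)]


def acf_features(x: Sequence[float]) -> Dict[str, float]:
    d1 = difference(x)
    return {
        "x_acf1": autocorrelations(x, 1)[0],
        "diff1_acf1": autocorrelations(d1, 1)[0],
        "diff1_acf10": sum(r * r for r in autocorrelations(d1, 10)),
        # Assumption: first derivative approximated by the first difference.
        "std1st_der": statistics.stdev(d1),
    }


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Made-up hourly concentrations for one pollution case.
hourly = [10, 12, 15, 11, 9, 14, 18, 20, 17, 13, 12, 16, 19, 22, 18, 15]
features = acf_features(hourly)
print(features)
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

The DTW example returns zero because the second series only repeats a value of the first, which time warping absorbs.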
A number of feature variables are then screened from all the variables considered, according to variable importance in the random forest model. The following 38 data features, based on pollution data, geographic information, emission inventory information, and information obtained from expert judgment, are finally selected as the basis for pollution type classification.
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location of the site of the pollution case as judged by experts, e.g., main road, sensitive point, town, construction site, or environmental background point;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of AQI;
Feature 9: aqi_cv; coefficient of variation of AQI;
Feature 10: hour.data; hour at which AQI reaches its maximum;
Feature 11: distance_dtw; similarity between the time series of pollutants, measured with the DTW distance;
Feature 12: aqi_sum; sum of AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; sum of SO2 plus sum of CO;
Feature 19: so2_max; maximum of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; median of SO2;
Feature 28: co_mean; median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the site, obtained from information on pollution sources around the site;
Feature 38: polarization_type; number of pollution sources of each type around the site, obtained from the emission inventory.
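The simple per-case statistics in the list above (sum, maximum, coefficient of variation, skewness, kurtosis, hour of the AQI maximum) can be computed as in the following sketch. The conventions are assumptions, since the patent does not state them: population moments are used, kurtosis is not reduced by 3, and the AQI values and hours are made-up data. The series must have a non-zero mean and must not be constant.

```python
import statistics
from typing import Dict, Sequence


def case_statistics(values: Sequence[float], hours: Sequence[int]) -> Dict[str, float]:
    """Basic statistics of one pollutant (or AQI) over one pollution case."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "sum": sum(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "cv": m2 ** 0.5 / mean,         # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,       # non-excess kurtosis (assumption)
        "hour_of_max": hours[values.index(max(values))],
    }


# Made-up hourly AQI values and the clock hours they were measured at.
aqi = [50, 60, 120, 80]
hours = [13, 14, 15, 16]
stats = case_statistics(aqi, hours)
print(stats["sum"], stats["max"], stats["hour_of_max"])  # 310 120 15
```

Calling the function once per pollutant and prefixing the keys with the pollutant name would give columns in the style of aqi_sum or co_max.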
2. Establish a typical pollution case library based on expert judgment.
The pollution type of each high spatio-temporal resolution grid cell can be determined by expert judgment from the pollution data and other information. In this embodiment, the pollution types determined comprise 9 categories: raised dust and dust; motor vehicles; heavy vehicles, machinery, and ships; catering oil fumes; coal burning; unorganized incineration; enterprises; fireworks and firecrackers; and processes involving VOCs.
3. Develop, based on machine learning algorithms, a pollution source type identification algorithm model targeted at the data features of pollution source emissions.
The 38 features selected by the invention are calculated from the pollution data and other information, and the feature data for each grid cell are labeled with the pollution type judged by experts to serve as training data for the model. The method adopts machine learning algorithms, specifically including random forest, neural network, support vector machine, gradient boosting machine, and the like, and uses a combined model that includes sub-models based on curve shape (time-series shape) and on automatic feature extraction by deep neural networks to train the data model, so that the proposed dimensions and feature classes can be better understood and the accuracy of pollution type classification improved.
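Labeling the feature data with expert-judged pollution types can be sketched as below, with one 0/1 indicator per pollution type so that compound pollution is representable. The short type identifiers and the feature values are hypothetical; only the nine categories themselves come from the embodiment.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical identifiers for the nine pollution types of this embodiment.
POLLUTION_TYPES = [
    "dust", "motor_vehicle", "heavy_vehicle_machinery_ship", "catering_fumes",
    "coal_burning", "unorganized_incineration", "enterprise", "fireworks",
    "voc_process",
]


def encode_labels(expert_types: Set[str]) -> List[int]:
    """Multi-label indicator vector: one 0/1 entry per pollution type."""
    unknown = expert_types - set(POLLUTION_TYPES)
    if unknown:
        raise ValueError(f"unknown pollution types: {sorted(unknown)}")
    return [1 if t in expert_types else 0 for t in POLLUTION_TYPES]


def build_training_rows(
    cases: List[Tuple[Dict[str, float], Set[str]]]
) -> List[Tuple[Dict[str, float], List[int]]]:
    """Pair each grid cell's feature dict with its encoded expert labels."""
    return [(features, encode_labels(types)) for features, types in cases]


# Made-up feature values for two grid cells; the second is compound pollution.
cases = [
    ({"co_max": 1.2, "aqi_cv": 0.31}, {"dust"}),
    ({"co_max": 2.8, "aqi_cv": 0.12}, {"dust", "enterprise"}),
]
rows = build_training_rows(cases)
print(rows[1][1])  # [1, 0, 0, 0, 0, 0, 1, 0, 0]
```

Each column of the indicator vectors is the target of one binary classifier under the binary relevance strategy.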
4. Use the algorithm model to mark abnormal data in the real-time monitoring data, identify the pollution source type, and raise an alarm automatically.
The training set of the algorithm model includes time-series pollution data from pollution-free periods. After the pollution data of a grid cell are obtained, the 38 proposed features are calculated and fed into the trained mathematical model, which outputs the classification of the grid cell's pollution type. In the early-stage test, two standard air stations and three micro-stations (at the southeast corner of a steel enterprise, at a sewage treatment plant, and on the northwest ring road of a city) were randomly selected in a certain city, and data from September 1, 2019 onward were used. A segmentation function divided the data into segments, the polluted segments were screened using concentration conditions for the different pollutants, and each segment was predicted with the model built from the cases to obtain its pollution type. The information on the resulting site pollution segments was then sent back to experts for a second judgment.
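The segmentation and screening described above can be sketched as follows. The patent specifies neither the segmentation function nor the concentration conditions, so fixed-length windows and a single maximum-value threshold are assumed here, and the PM10 readings and the threshold of 150 are illustrative values only.

```python
from typing import List, Sequence, Tuple


def split_segments(series: Sequence[float], length: int) -> List[Sequence[float]]:
    """Cut a series into consecutive windows of the given length."""
    return [series[i:i + length] for i in range(0, len(series), length)]


def polluted_segments(series: Sequence[float], length: int,
                      threshold: float) -> List[Tuple[int, Sequence[float]]]:
    """Keep (start index, segment) pairs whose peak exceeds the threshold."""
    return [(i * length, seg)
            for i, seg in enumerate(split_segments(series, length))
            if max(seg) > threshold]


# Made-up hourly PM10 readings from one micro-station.
pm10 = [40, 45, 50, 160, 170, 150, 60, 55, 48]
hits = polluted_segments(pm10, length=3, threshold=150)
print(hits)  # [(3, [160, 170, 150])]
```

Only the segments returned here would be passed on to feature calculation and to the trained model for type prediction.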
5. Experts review the model's identification result against the alarm information and determine whether it is accurate; if the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed. For example, the model identified the pollution type at the site Tangshan ceramics company for 2019/9/7/14-2019/9/8 as a (raised dust, dust); the expert group's second judgment was also type a (raised dust, dust), so the model result matched the expert judgment, and the case could be added to the case library as a supplementary case and the pollution source dealt with. The model identified the pollution type at the site suburb sewage treatment plant 842, 2019/9/212 as g (enterprise), whereas the expert group's second judgment found no obvious pollution source in the time period of 00; because the expert review of the alarm information disagreed with the model identification result, the pollution event flag was removed.
Those skilled in the art will appreciate that the identification accuracy of the model of the present invention increases as pollution events are added to the case library.
Although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention.
Claims (5)
1. A machine-learning-based method for automatically identifying pollution source types, characterized by comprising the following steps:
first, based on environmental monitoring data and time and geographic information, identifying the occurrence of pollution problems and judging the pollution source type through analysis and judgment, and establishing a typical pollution case library;
second, using machine learning methods, extracting data features with the large volume of data in the case library as samples, and developing a pollution source type identification algorithm model from the extracted data features;
third, monitoring real-time environmental monitoring data with the algorithm model; if abnormal data are found, marking them as a pollution event and further identifying the type of source causing the pollution, thereby realizing online identification of pollution source emissions and automatic alarming;
fourth, reviewing the model identification result, or checking it on site, according to the alarm information; if the pollution problem really exists, dealing with it and adding the event to the typical case library for continuous optimization of the algorithm model; if the identification result is inaccurate, removing the pollution event flag;
the environmental monitoring data include: PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, as well as time and geographic information;
the typical case library is a set of reviewed cases describing pollution events based on the environmental monitoring data, each case comprising the following data: the start time and end time of the pollution event, the name and coordinates of the affected site, the types of affected parameters, the local meteorological conditions at the time, and the pollution source type judged by experts;
the affected parameters are parameters obtained through micro-stations at high spatial and temporal resolution in the monitoring area, and the parameter types comprise at least 6 pollutants: PM10, PM2.5, SO2, NO2, CO, O3, and VOC; the meteorological conditions comprise wind direction, wind speed, temperature, and humidity; the pollution source types comprise dust sources, mobile sources, coal-burning sources, catering oil fume sources, and industrial sources; in the second step, various features are extracted from the time-series pollution data based on basic statistics of the data, and some geographic information, emission inventory information, and information obtained from expert judgment are converted into corresponding feature variables;
the extracted features and their calculation are as follows:
for each of the 6 pollutants and for AQI, grouped by case: the sum of squares of the first 10 autocorrelation coefficients of the first-differenced series, the sum of squares of the first 5 partial autocorrelation coefficients, the sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series, the standard deviation of the first derivative, the mean, sum, maximum, quartiles, coefficient of variation, standard deviation, median, variance, skewness, and kurtosis, the hour at which AQI reaches its maximum, the correlation coefficients between the six pollutants and AQI, and the primary pollutant;
pollution sources around the site: the numbers of pollution sources of each type around each site are obtained from information on pollution sources around the site and from the emission inventory, and are used as feature values;
road network density around the site: to account for the influence of motor vehicle emissions on pollutant data, the road network density around the site is obtained from the surrounding road network using geographic information system technology and is used as a feature value;
time-series distance feature: the similarity between the time series of different pollutants, measured with the dynamic time warping (DTW) distance;
a number of feature variables are screened from all the variables considered, according to variable importance in the random forest model, and the following 38 data features, based on pollution data, geographic information, emission inventory information, and information obtained from expert judgment, are finally selected as the basis for pollution type classification:
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location of the site of the pollution case as judged by experts, including but not limited to main roads, sensitive points, towns, construction sites, and environmental background points;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of AQI;
Feature 9: aqi_cv; coefficient of variation of AQI;
Feature 10: hour.data; hour at which AQI reaches its maximum;
Feature 11: distance_dtw; similarity between the time series of pollutants, measured with the DTW distance;
Feature 12: aqi_sum; sum of AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; sum of SO2 plus sum of CO;
Feature 19: so2_max; maximum of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; median of SO2;
Feature 28: co_mean; median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the site, obtained from information on pollution sources around the site;
Feature 38: polarization_type; number of pollution sources of each type around the site, obtained from the emission inventory.
2. The machine-learning-based method for automatically identifying pollution source types as claimed in claim 1, wherein: the pollution source type identification algorithm model is a multi-label classification model built on the existing cases and on cases added later, and can express compound pollution formed by several pollution types that may coexist at the same place in the same time period; the model adopts a combination strategy, which is binary relevance, classifier chains, or nested stacking; according to the proportion of each pollution type in the training data, a cutoff parameter is set in each classifier to address the imbalance of the training samples and improve prediction accuracy for infrequent pollution types.
3. The machine-learning-based method for automatically identifying pollution source types as claimed in claim 2, wherein: the combination strategy is the binary relevance strategy, in which one binary classifier is built for each label; each binary classification is a simple problem, namely whether or not a case belongs to that type; the model is divided into several binary classifiers, which are then combined; at prediction time each label is predicted independently, without considering dependencies between labels, and the results are merged into a multi-label target; binary relevance has computational complexity linear in the number of labels and is therefore easily parallelized, i.e., the binary classifiers for all labels are built at the same time, which improves running speed.
4. The machine-learning-based method for automatically identifying pollution source types as claimed in claim 3, wherein the 38 selected features are calculated from the pollution data and other information, and the feature data for each grid cell are labeled with the judged pollution type to serve as training data for the model; the method adopts machine learning algorithms, specifically including but not limited to random forest, neural network, support vector machine, and gradient boosting machine, and uses a combined model that includes sub-models based on curve shape and on automatic feature extraction by deep neural networks to train the data model, so that the proposed dimensions and feature classes can be better understood and the accuracy of pollution type classification improved.
5. The method as claimed in claim 1, wherein the training set of the algorithm model includes time-series pollution data from pollution-free periods; after the pollution data of a grid cell are obtained, the 38 proposed features are calculated and fed into the trained mathematical model, which outputs the classification of the grid cell's pollution type; the data are divided into segments by a segmentation function, the polluted segments are screened using concentration conditions for the different pollutants, each segment is predicted with the model built from the cases to obtain its pollution type, and the information on the resulting site pollution segments is then sent back to experts for a second judgment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010846058.3A CN111985567B (en) | 2020-08-21 | 2020-08-21 | Automatic pollution source type identification method based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111985567A (en) | 2020-11-24 |
CN111985567B (en) | 2022-11-22 |
Family
ID=73443859
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010846058.3A Active CN111985567B (en) | 2020-08-21 | 2020-08-21 | Automatic pollution source type identification method based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111985567B (en) |
Similar Documents
Publication | Title |
---|---|
CN111985567B (en) | Automatic pollution source type identification method based on machine learning |
Kleine Deters et al. | Modeling PM2.5 urban pollution using machine learning and selected meteorological parameters |
CN116186566B (en) | Diffusion prediction method and system based on deep learning |
CN115578015A (en) | Sewage treatment overall process supervision method and system based on Internet of things and storage medium |
CN108595414B (en) | Soil heavy metal enterprise pollution source identification method based on source-sink space variable reasoning |
CN116359218B (en) | Industrial aggregation area atmospheric pollution mobile monitoring system |
Van et al. | A new model of air quality prediction using lightweight machine learning |
CN117171695B (en) | Method and system for evaluating ecological restoration effect of antibiotic contaminated soil |
KR102564191B1 (en) | Disaster response system that detects and responds to disaster situations in real time |
Al_Janabi et al. | Pragmatic method based on intelligent big data analytics to prediction air pollution |
CN116359285A (en) | Oil gas concentration intelligent detection system and method based on big data |
CN118171920B (en) | LLM model-based park safety emergency response method, device and medium |
CN112532652A (en) | Attack behavior portrait device and method based on multi-source data |
CN113935228A (en) | L-band rough sea surface radiation brightness and temperature simulation method based on machine learning |
CN114416423B (en) | Root cause positioning method and system based on machine learning |
CN115146537A (en) | Atmospheric pollutant emission estimation model construction method and system based on power consumption |
CN113267601B (en) | Industrial production environment remote real-time monitoring cloud platform based on machine vision and data analysis |
CN109213840B (en) | Hot spot grid identification method based on multidimensional feature deep learning |
Kim et al. | Massive scale deep learning for detecting extreme climate events |
CN110827264A (en) | Evaluation system for apparent defects of concrete member |
CN110543675A (en) | Power transmission line fault identification method |
CN107679478B (en) | Method and system for extracting space load state of power transmission line |
CN114527235A (en) | Real-time quantitative detection method for emission intensity |
Patil | Prediction an air quality index data using machine learning and deep learning |
CN117952440B (en) | Chemical industry park production environment supervision method and system |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |