CN111985567A - Automatic pollution source type identification method based on machine learning - Google Patents

Info

Publication number
CN111985567A
Authority
CN
China
Prior art keywords
pollution
feature
data
aqi
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010846058.3A
Other languages
Chinese (zh)
Other versions
CN111985567B (en)
Inventor
王春迎
詹宇
马景金
马红楠
张朝
王振强
张仕富
吴秦慧姿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Advanced Environmental Protection Industry Innovation Center Co ltd
Hebei Xianhe Environmental Protection Technology Co ltd
Original Assignee
Hebei Advanced Environmental Protection Industry Innovation Center Co ltd
Hebei Xianhe Environmental Protection Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Advanced Environmental Protection Industry Innovation Center Co ltd, Hebei Xianhe Environmental Protection Technology Co ltd filed Critical Hebei Advanced Environmental Protection Industry Innovation Center Co ltd
Priority to CN202010846058.3A priority Critical patent/CN111985567B/en
Publication of CN111985567A publication Critical patent/CN111985567A/en
Application granted granted Critical
Publication of CN111985567B publication Critical patent/CN111985567B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01DMEASURING NOT SPECIALLY ADAPTED FOR A SPECIFIC VARIABLE; ARRANGEMENTS FOR MEASURING TWO OR MORE VARIABLES NOT COVERED IN A SINGLE OTHER SUBCLASS; TARIFF METERING APPARATUS; MEASURING OR TESTING NOT OTHERWISE PROVIDED FOR
    • G01D21/00Measuring or testing not otherwise provided for
    • G01D21/02Measuring two or more variables by means not covered by a single other subclass
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N15/00Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
    • G01N15/06Investigating concentration of particle suspensions
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004Gaseous mixtures, e.g. polluted air
    • G01N33/0009General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0027General constructional details of gas analysers, e.g. portable test equipment concerning the detector
    • G01N33/0036General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
    • G01N33/0037NOx
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004Gaseous mixtures, e.g. polluted air
    • G01N33/0009General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0027General constructional details of gas analysers, e.g. portable test equipment concerning the detector
    • G01N33/0036General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
    • G01N33/0039O3
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004Gaseous mixtures, e.g. polluted air
    • G01N33/0009General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0027General constructional details of gas analysers, e.g. portable test equipment concerning the detector
    • G01N33/0036General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
    • G01N33/004CO or CO2
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004Gaseous mixtures, e.g. polluted air
    • G01N33/0009General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0027General constructional details of gas analysers, e.g. portable test equipment concerning the detector
    • G01N33/0036General constructional details of gas analysers, e.g. portable test equipment concerning the detector specially adapted to detect a particular component
    • G01N33/0042SO2 or SO3
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004Gaseous mixtures, e.g. polluted air
    • G01N33/0009General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0062General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/20Air quality improvement or preservation, e.g. vehicle emission control or emission reduction by using catalytic converters

Landscapes

  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Medicinal Chemistry (AREA)
  • Food Science & Technology (AREA)
  • Combustion & Propulsion (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Human Resources & Organizations (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)

Abstract

A machine-learning-based method for automatically identifying pollution source types, comprising the following steps: based on environmental monitoring data together with time and geographic information, identifying the occurrence of pollution problems and judging the pollution source type through analysis, and establishing a typical pollution case library; using a machine learning algorithm to extract data features from the case library samples and develop a pollution source type identification model; monitoring real-time data with the model, marking abnormal data as a pollution event and identifying the type of source causing the pollution, thereby achieving online identification of pollution source emissions with automatic alarms; reviewing the model's identification result remotely or on site according to the alarm information and, if the pollution problem is confirmed, handling it and adding it to the typical case library for continuous optimization of the model; and, if the identification result is inaccurate, removing the pollution event mark. Because the method is based on monitoring data from gridded micro-stations, small stations and the like, more data can be brought into the data source and the model can be further optimized.

Description

Automatic pollution source type identification method based on machine learning
Technical Field
The invention relates to the field of atmospheric environment monitoring, in particular to a pollution source type automatic identification method based on machine learning.
Background
In the field of atmospheric environment monitoring, traditional monitoring relies on standard air quality stations. Because these stations are costly, few of them are deployed and they generate little data, so fine-scale pollution problems are difficult to capture accurately. Sensor-based micro-stations are inexpensive and can be deployed at scale, yielding monitoring data with high spatio-temporal resolution across the monitored area. The monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, with a spatial resolution of up to 1 km × 1 km and a temporal resolution of 1 h. This mass of environmental monitoring data supports establishing the correspondence between pollution sources and air quality. Through manual analysis, existing pollution problems can be found from the data features and the source type of the air pollution can be judged, including dust sources, mobile sources, coal-fired sources, catering oil fume sources, industrial sources and others. This narrows the scope of investigation, improves its accuracy, raises supervision efficiency, and saves manpower in on-site investigation of environmental problems.
However, finding pollution problems and source types in mass monitoring data currently requires a great deal of manpower and time and depends heavily on the technical level and experience of the analysts. The whole process is therefore inefficient, slow to respond and limited by the skill of the personnel involved, which makes it difficult to support environmental management effectively. A calculation method that can identify the pollution source type efficiently, quickly and stably is therefore needed.
At present, existing patented pollution source identification techniques are based on hotspot grids rather than real-time monitoring data. For example, Chinese patent CN110147383A, entitled "Method and apparatus for determining pollution source type", discloses a method that determines the pollution source type of a polluted grid by setting a preset concentration value and a preset concentration difference and combining wind speed, wind direction and the pollution sources within the grid. Chinese patent CN110006799A, entitled "A classification method of hotspot grid pollution types", discloses a method that classifies atmospheric hotspot grid pollution types according to how atmospheric pollutant concentrations change over time. These techniques have the following disadvantages: first, the temporal and spatial resolution of hotspot grid data is low, so pollution source identification mostly relies on historical data, cannot guide pollution tracing in real time, and is difficult to verify scientifically and effectively; second, satellite inversion data are constrained by meteorological conditions such as cloud cover, so accuracy cannot be guaranteed and effective tracing cannot be achieved; third, hotspot grid data reflect the air quality of the grid area rather than the surroundings of the pollution source, so it is difficult to distinguish pollution source types from the data features; fourth, the identification approach is single and uses few feature parameters, whereas pollution sources comprise at least 6 types with different pollution characteristics, which such an approach cannot describe accurately.
Disclosure of Invention
To solve the above problems, the present invention provides a machine-learning-based method for automatically identifying pollution source types. The method uses the pollutant parameters together with time and spatial coordinate information, and the spatial information takes part in the model computation that identifies a pollution process; that is, the differences between the data of the target grid and the data of the surrounding grids are considered, rather than only the trend of the data within a single time series.
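The comparison between a target grid and its surrounding grids can be illustrated with a minimal sketch. It is not the patented model: the function name `spatial_excess`, the list-based data layout and the use of the neighbour mean as the baseline are assumptions made only for this example.

```python
from statistics import mean

def spatial_excess(target, neighbours):
    """Hour-by-hour difference between a target grid's series and the
    mean of its surrounding grids (the neighbour mean is an assumed baseline)."""
    if not neighbours:
        raise ValueError("at least one neighbouring grid is required")
    baseline = [mean(values) for values in zip(*neighbours)]
    return [t - b for t, b in zip(target, baseline)]

# Hypothetical hourly PM2.5 values (ug/m3) for one grid and three neighbours.
target_pm25 = [35.0, 40.0, 90.0, 85.0]
neighbour_pm25 = [
    [33.0, 38.0, 41.0, 40.0],
    [36.0, 41.0, 44.0, 42.0],
    [36.0, 41.0, 41.0, 41.0],
]
excess = spatial_excess(target_pm25, neighbour_pm25)
print(excess)  # [0.0, 0.0, 48.0, 44.0]
```

In this made-up data the target grid tracks its neighbours for two hours and then rises far above them, a pattern that a purely temporal analysis of the target grid alone could not separate from a region-wide increase.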
In order to achieve the purpose, the invention provides a machine learning-based pollution source type automatic identification method, which mainly comprises the following steps:
Step one: based on monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, together with time and geographic information, the occurrence of pollution problems is identified and the pollution source type is judged through (expert) analysis, and a typical pollution case library is established;
Step two: based on a machine learning algorithm, data features are extracted using the mass data of the case library as samples, and a pollution source type identification model is developed;
Step three: the real-time monitoring data are monitored with the model; when abnormal data are found they are marked as a pollution event, and the type of source causing the pollution is then identified, achieving online identification of pollution source emissions with automatic alarms;
Step four: according to the alarm information, an expert reviews the model's identification result or checks it on site; if the pollution problem is confirmed, it is handled and the event is added to the typical case library for continuous optimization of the model; if the identification result is inaccurate, the pollution event mark is removed.
The identification algorithm adopted by the method thus starts from monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, together with time and geographic information. Through analysis and judgment (for example, manual judgment by experts), the occurrence of pollution problems is identified, the pollution source type is judged, and a typical pollution case library is established. Then, based on a machine learning algorithm, data features are extracted using the mass data of the case library as samples, and a pollution source type identification model is developed. The model monitors the real-time data, marks abnormal data as a pollution event, identifies the type of source causing the pollution, and thereby achieves online identification of pollution source emissions with automatic alarms. Furthermore, experts can review the identification result or check it on site according to the alarm information; if the pollution problem is confirmed, it is handled and the event is added to the typical case library for continuous optimization of the model; if the identification result is inaccurate, the pollution event mark is removed.
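The review loop of step four can be sketched in a few lines. This is only an illustration of the control flow described above; the class names `PollutionEvent` and `CaseLibrary`, the field names and the function `review_alarm` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class PollutionEvent:
    site: str                 # hypothetical site identifier
    predicted_source: str     # source type output by the model
    flagged: bool = True      # set when the model marks abnormal data

@dataclass
class CaseLibrary:
    cases: list = field(default_factory=list)

def review_alarm(library, event, confirmed):
    """Apply the outcome of the expert review or on-site check to one alarm."""
    if confirmed:
        library.cases.append(event)   # kept as a sample for later retraining
    else:
        event.flagged = False         # remove the pollution event mark
    return event

library = CaseLibrary()
kept = review_alarm(library, PollutionEvent("site-A", "dust"), confirmed=True)
dropped = review_alarm(library, PollutionEvent("site-B", "industrial"), confirmed=False)
print(len(library.cases), kept.flagged, dropped.flagged)  # 1 True False
```

Only confirmed events reach the case library, so a later retraining run would see reviewed cases rather than raw model output.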
Preferably, the training set of the algorithm model also contains time series data from pollution-free periods. After the pollution data of a grid are obtained, the proposed 38 features are calculated and input into the trained mathematical model, which outputs the classification result for the pollution type of the grid.
The beneficial effects of the invention are that, by means of the above technical scheme, it achieves the following advantages over the prior art:
(1) Data source: compared with the prior art based on hotspot grid data, the method is based on monitoring data from gridded micro-stations, small stations and the like, and can bring more data into the data source;
(2) Algorithm model: the technical scheme adopts machine learning algorithms, specifically including random forest, neural network, support vector machine and gradient boosting machine, and adopts a combined model that includes sub-models based on curve shape (time series shape), on automatic feature extraction by deep neural networks, and the like;
(3) Features: whereas the prior art offers few selectable features (judgment on a single grid), the algorithm of the invention, after repeated research by the inventors, can comprise 38 feature values in total, realizes comparison and judgment across multiple sites, and further takes data such as surrounding pollution sources into account as feature values; this can improve the accuracy of pollution type identification, makes it possible to distinguish local sources from external sources, and overcomes the one-sidedness of single-grid analysis;
(4) Continuous model optimization: compared with the prior art, which is based on historical data and a fixed algorithm, the technical scheme first generates one generation of the algorithm model from historical data; the model is then applied to subsequent monitoring data and finds new cases, which are automatically added to the case library after review by technicians, so the model can be further optimized;
(5) Cloud server: compared with the client-based prior art, the method can run on a cloud server, which reduces client-side cost, builds up a big-data advantage on the server side, collects a large number of cases from different places, brings the advantages of the technical scheme into full play, and further improves the accuracy of the algorithm's results.
Drawings
Fig. 1 is a flowchart illustrating steps of a method for automatically identifying a pollution source type based on machine learning according to the present invention.
Detailed Description
For a better understanding of the objects, aspects and advantages of the present invention, reference is made to the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
The hotspot grid referred to in the invention is a technical unit defined by the environmental protection authorities: the Beijing-Tianjin-Hebei region and the surrounding "2+26" key cities are divided into grids of 3 km × 3 km. Various data such as satellite remote sensing, ground-based air quality observation and meteorological observation are integrated; using remote sensing image recognition based on cognition and multi-source data fusion, the average PM2.5 concentration of each grid is determined through satellite remote sensing inversion of atmospheric pollutants, and key supervision areas are determined among the hotspot grids by ranking the concentration values.
Fig. 1 shows a flowchart of the identification algorithm used in the present invention; its main concepts are briefly described as follows:
1. Establishment of the typical case library
The typical case library is a collection of cases that describe pollution events on the basis of environmental monitoring data and have been reviewed by humans (experts). The data contained in each case include at least: the start time and end time of the pollution event, the name and coordinates of the affected site, the types of the affected parameters, the local meteorological conditions at the time, and the pollution source type judged by the experts. The parameter types may include PM10, PM2.5, SO2, NO2, CO, O3 and VOCs; the meteorological conditions include wind direction, wind speed, temperature and humidity; and the pollution source types include dust sources, mobile sources, coal-fired sources, catering oil fume sources, industrial sources and others.
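One way to hold the fields listed above is a small record type. The sketch below is an assumed layout, not a schema given in the patent: the class name `PollutionCase`, every field name, the units and the English source type keys are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical keys for the source types named in the text.
SOURCE_TYPES = {"dust", "mobile", "coal_fired", "catering_fume", "industrial", "other"}

@dataclass(frozen=True)
class PollutionCase:
    start: datetime
    end: datetime
    site_name: str
    longitude: float
    latitude: float
    affected_parameters: tuple   # e.g. ("PM10", "PM2.5")
    wind_direction_deg: float
    wind_speed_ms: float
    temperature_c: float
    humidity_pct: float
    source_types: frozenset      # labels assigned by the expert review

    def __post_init__(self):
        # Reject records that cannot describe a real event.
        if self.end < self.start:
            raise ValueError("end time precedes start time")
        unknown = set(self.source_types) - SOURCE_TYPES
        if unknown:
            raise ValueError(f"unknown source types: {sorted(unknown)}")

    @property
    def duration_hours(self):
        return (self.end - self.start).total_seconds() / 3600.0

case = PollutionCase(
    start=datetime(2020, 1, 1, 8), end=datetime(2020, 1, 1, 11),
    site_name="site-A", longitude=114.5, latitude=38.0,
    affected_parameters=("PM10",), wind_direction_deg=270.0, wind_speed_ms=2.1,
    temperature_c=3.0, humidity_pct=45.0, source_types=frozenset({"dust"}),
)
print(case.duration_hours)  # 3.0
```

Storing the source types as a set rather than a single value leaves room for composite pollution, where one event carries several labels.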
2. Data characterization
The data features extracted by the algorithm model of the invention comprise:
(1) standard deviation of the first derivative of PM2.5;
(2) standard deviation of the first derivative of CO;
(3) standard deviation of the first derivative of SO2;
(4) first-10 sum of squares of the first-order difference series of PM2.5;
(5) maximum value of CO;
(6) primary pollutant;
(7) skewness of AQI;
(8) 1st autocorrelation coefficient of PM10;
(9) quartiles of CO;
(10) 1st autocorrelation coefficient of PM2.5;
(11) coefficient of variation of AQI;
(12) coefficient of variation of CO;
(13) standard deviation of the first derivative of PM10;
(14) first-10 sum of the first-order difference series of CO;
(15) sum of AQI;
(16) SO2 sum plus CO sum;
(17) skewness of PM10;
(18) maximum value of O2;
(19) sum of SO2;
(20) median of CO;
(21) first-10 sum of the first-order difference series of AQI;
(22) kurtosis of NO2;
(23) first-10 sum of squares of the first-order difference series of PM10;
(24) 1st autocorrelation coefficient of AQI;
(25) first-order difference series of CO;
(26) 1st autocorrelation coefficient of CO;
(27) first-order difference series of SO2;
(28) sum of CO;
(29) median of SO2;
(30) kurtosis of PM2.5;
(31) first-order difference series of PM2.5;
(32) first-10 sum of squares of the first-order difference series of NO2;
(33) kurtosis of SO2;
(34) small value of the AQI maximum time;
(35) coefficient of variation of SO2;
(36) correlation coefficient of PM10 and CO;
(37) correlation coefficient of SO2 and CO;
(38) correlation coefficient of NO2 and CO.
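A few of the listed statistics can be computed with the standard library alone, as in the minimal sketch below. The function names are hypothetical, and the population (biased) estimators and the autocorrelation normalisation are assumptions, since the patent does not state which estimators it uses.

```python
from statistics import mean, pstdev

def first_difference(series):
    """x[t+1] - x[t] for consecutive hourly values."""
    return [b - a for a, b in zip(series, series[1:])]

def diff_std(series):
    """Standard deviation of the first-order differences (population form)."""
    return pstdev(first_difference(series))

def coefficient_of_variation(series):
    return pstdev(series) / mean(series)

def autocorrelation(series, lag=1):
    """Lag-k autocorrelation, normalised by the total sum of squares."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[i] - m) * (series[i + lag] - m)
              for i in range(len(series) - lag))
    return num / denom

def skewness(series):
    """Third standardised moment (population form)."""
    m, s = mean(series), pstdev(series)
    return sum(((x - m) / s) ** 3 for x in series) / len(series)

# Hypothetical hourly CO concentrations (mg/m3): a steady linear rise.
co = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "co_diff_std": diff_std(co),
    "co_cv": coefficient_of_variation(co),
    "co_acf1": autocorrelation(co, lag=1),
    "co_skewness": skewness(co),
}
```

For this steadily rising series the difference standard deviation is zero, which is the kind of signal that separates a smooth trend from a spiky emission event.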
To a certain extent, these 38 features reflect how each pollutant rises and falls and how the pollutant time series correlate with one another over time, and from a statistical perspective they comprehensively characterize the pollution type of each site in different periods.
For example, feature 6 (NO2_diff1_acf10) represents the degree of variation of the NO2 series, feature 11 (distance_dtw) represents the similarity between the time series of different pollutants, and feature 17 (co-quantile) represents the frequency distribution of CO pollution; to some extent these can indicate whether a case belongs to motor vehicle pollution.
However, because multivariate time series vary in complex ways and are correlated with the multivariate time series of surrounding sites, it is difficult to summarize and select by hand the time series features that correspond to each pollution type (or case). The invention therefore uses a machine learning algorithm to weight and combine the 38 features automatically on the basis of the training data in the case library, producing a data-driven prediction model.
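The identifier `distance_dtw` suggests a dynamic time warping (DTW) distance between two pollutant series. The sketch below implements the textbook DTW recurrence as an illustration only; the patent does not specify the local cost, any window constraint or any normalisation, so the absolute-difference cost and the unconstrained warping path used here are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Hypothetical series: the second is the first with one value held for an extra hour.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0, 0], [1, 1, 1]))     # 3.0
```

Because warping absorbs the repeated value, the first pair has distance zero, whereas a point-by-point comparison would not even be defined for series of different lengths.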
3. Model algorithm description and calculation
The technical scheme establishes a multi-label classification model for the existing cases and for cases added later, because composite pollution formed by several pollution types may exist in the same time period and place. In Table 1 (which does not include all pollution types), each row corresponds to one case, that is, one pollution event. X summarizes the selected feature values; X1, X2, X3, X4, X5 and X6 are the feature values of the corresponding cases; Y1, Y2, Y3, Y4 and Y5 are different pollution types, called labels in the multi-label model, where 1 means the case belongs to the type and 0 means it does not. The model adopts a combination strategy; the main candidate strategies include binary relevance, classifier chains and nested stacking.
X    Y1   Y2   Y3   Y4   Y5
X1   1    0    0    0    0
X2   0    1    1    0    0
X3   0    0    0    1    0
X4   0    0    0    0    1
X5   0    1    0    0    0
X6   1    0    1    0    0
Table 1. Multi-label model example
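The Table 1 layout can be written down directly as a label matrix. The sketch below only restates the table in code; the names `TABLE_1`, `label_set` and `label_column` are hypothetical helpers introduced for illustration.

```python
LABELS = ["Y1", "Y2", "Y3", "Y4", "Y5"]

# Rows of Table 1: case -> 0/1 membership for each pollution type.
TABLE_1 = {
    "X1": [1, 0, 0, 0, 0],
    "X2": [0, 1, 1, 0, 0],
    "X3": [0, 0, 0, 1, 0],
    "X4": [0, 0, 0, 0, 1],
    "X5": [0, 1, 0, 0, 0],
    "X6": [1, 0, 1, 0, 0],
}

def label_set(row):
    """Names of the pollution types a case belongs to."""
    return [name for name, flag in zip(LABELS, row) if flag == 1]

def label_column(table, label):
    """The 0/1 target of a single label across all cases."""
    j = LABELS.index(label)
    return {case: row[j] for case, row in table.items()}

# Cases with more than one label are composite pollution events.
composite = {case: label_set(row) for case, row in TABLE_1.items() if sum(row) > 1}
print(composite)  # {'X2': ['Y2', 'Y3'], 'X6': ['Y1', 'Y3']}
```

Reading the matrix column by column, as `label_column` does, yields one yes/no target per pollution type.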
The invention mainly uses the binary relevance strategy. Its principle is to build one binary classifier for each label; each binary classification is a simple yes/no question, namely whether the case belongs to that type. As shown in Table 2, the model is divided into five binary classifiers, which are then combined. At prediction time each label is predicted independently, without considering dependencies between labels, and the results are then merged into a multi-label target. Binary relevance has computational complexity linear in the number of labels and is easily parallelized, that is, the binary classifiers for all labels can be built at the same time, which increases the running speed. In addition, machine learning models (e.g., random forest) under default parameter configurations tend to overlook the pollution types that are under-represented in the training samples when predicting. In the algorithm, the cutoff parameter of each binary classifier is therefore adjusted according to the proportion of each pollution type in the training samples, so that the optimized model predicts every pollution type in a balanced way and the overall prediction performance improves.
Table 2. Binary relevance strategy example (table supplied as an image in the original publication)
When the binary classifier for each label is built independently, the same machine learning algorithm is used by default for every binary classifier; candidate algorithms include random forest, neural network, support vector machine and gradient boosting machine. With further study, different feature combinations and different machine learning algorithms can be tried when modeling each pollution type, and the best feature combination and best algorithm are selected to build that binary classifier. Finally, the different binary classifiers are combined by binary relevance into an optimal multi-label model; when a new pollution event is to be predicted, its pollution type can be judged comprehensively from its feature values.
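Binary relevance itself is independent of the base learner, which the following minimal sketch tries to show. The text names random forest, support vector machine and similar algorithms as base learners; here a deliberately simple nearest-centroid classifier stands in for them so that the example needs no third-party library. The class names, the two-feature toy data and the two labels are all hypothetical.

```python
class CentroidClassifier:
    """Stand-in binary learner: predicts the class with the nearer feature centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda cls: dist2(self.centroids[cls]))


class BinaryRelevance:
    """Trains one independent binary model per label column."""

    def __init__(self, make_learner):
        self.make_learner = make_learner

    def fit(self, X, Y):
        self.models = [self.make_learner().fit(X, list(col)) for col in zip(*Y)]
        return self

    def predict_one(self, x):
        # Each label is predicted on its own; the results are merged afterwards.
        return [m.predict_one(x) for m in self.models]


# Toy data: label 0 is tied to a high first feature, label 1 to a high second feature.
X = [[9.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 8.0], [9.0, 9.0], [1.0, 1.0]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]
model = BinaryRelevance(CentroidClassifier).fit(X, Y)
print(model.predict_one([8.5, 8.5]))  # [1, 1]
print(model.predict_one([8.5, 1.5]))  # [1, 0]
```

Passing a different factory to `BinaryRelevance` would swap the base learner for one label set without touching the decomposition logic, which mirrors the idea of choosing the best algorithm per pollution type.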
The invention builds models with three algorithms: support vector machine, random forest and XGBoost. Briefly, a support vector machine (SVM) is a generalized linear classifier that performs binary classification of data by supervised learning and can be used for both classification and regression. A random forest integrates many trees through the idea of ensemble learning; it is a nonlinear classifier and can therefore mine complex nonlinear interdependencies between variables. The basic unit of a random forest is the decision tree, a basic classifier whose main task is to select features that split the data set so that the data end up labelled with one of two classes; the resulting model has a tree structure. A random forest is obtained by building many decision trees; at prediction time each tree gives a classification result, the results are put to a vote, and the final classification follows the majority. XGBoost is also a machine learning algorithm based on decision trees. Unlike a random forest, in which each decision tree is built independently, XGBoost grows the model by continually adding trees and continually splitting on features; each added tree in effect learns a new function that fits the residual of the previous prediction, until a stopping condition is reached, such as a preset number of trees. At prediction time, the features of the sample determine one leaf node in each tree, each leaf node corresponds to a score, and the scores from all trees are added together to give the predicted value for the sample.
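The two ways of combining trees described above, voting versus adding scores from trees fitted to residuals, can be contrasted in a short sketch. It is a toy illustration with one-split stumps on a single feature and plain squared-error residuals; it is not XGBoost, which additionally uses a regularised objective, and all function names and data are hypothetical.

```python
from collections import Counter

def forest_predict(tree_votes):
    """Random-forest style combination: the class chosen by most trees wins."""
    return Counter(tree_votes).most_common(1)[0][0]

def fit_stump(xs, residuals):
    """Single-threshold regression stump minimising squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda x: lv if x <= threshold else rv

def boost(xs, ys, n_trees=3, learning_rate=1.0):
    """Boosting style combination: each new stump fits what the sum so far still misses."""
    trees, preds = [], [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    # The prediction is the sum of the scores from every tree.
    return lambda x: sum(learning_rate * t(x) for t in trees)

print(forest_predict(["dust", "dust", "mobile"]))  # dust

xs, ys = [1, 2, 3, 4], [1.0, 1.0, 3.0, 5.0]
model = boost(xs, ys, n_trees=3)
```

In `forest_predict` the trees are interchangeable and could be built in any order, while in `boost` each stump depends on the predictions of all earlier ones.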
During model construction, the numbers of samples for the different pollution types are not necessarily balanced, which affects model accuracy to some extent. The invention addresses this point by tuning the model parameters, which mitigates the effect of the imbalanced samples to a certain degree, and the parameters can be adjusted accordingly as cases continue to be added.
4. How to deploy on a cloud server
During development of the algorithm model provided by the invention, the more cases are available, the higher the prediction accuracy of the developed model, so environmental monitoring data from multiple cities is required; once development is complete, the model can be applied to different cities. The scheme of the invention therefore runs the model in the cloud, which makes use of as much data as possible, improves the accuracy of the model, and facilitates later wide application.
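One way to compensate for imbalanced samples is a per-label cutoff on the predicted probability, set according to each pollution type's share of the training cases (claim 7 mentions such a cutoff parameter). The sketch below is only a heuristic under stated assumptions: using the clamped training share directly as the cutoff, the `floor` and `ceiling` values, and the function names are all hypothetical choices, not the patent's stated procedure.

```python
def label_prevalence(label_sets, label):
    """Share of training cases that carry the given pollution type."""
    return sum(label in s for s in label_sets) / len(label_sets)


def choose_cutoffs(label_sets, labels, floor=0.05, ceiling=0.5):
    """Assumed heuristic: use each label's training share, clamped, as its cutoff.

    Rare pollution types get a lower cutoff, so they are not always outvoted
    by the far more frequent negative class.
    """
    return {lab: min(ceiling, max(floor, label_prevalence(label_sets, lab)))
            for lab in labels}


def apply_cutoffs(probabilities, cutoffs):
    """Turn per-label probabilities into a label set using per-label cutoffs."""
    return {lab for lab, p in probabilities.items() if p >= cutoffs[lab]}


# Hypothetical case library: 6 dust cases, 1 fireworks case, 3 clean cases.
label_sets = [{"dust"}] * 6 + [{"fireworks"}] + [set()] * 3
cutoffs = choose_cutoffs(label_sets, ["dust", "fireworks"])
print(cutoffs)
print(apply_cutoffs({"dust": 0.45, "fireworks": 0.2}, cutoffs))
```

With a fixed 0.5 cutoff for every label, the rare fireworks label in this example would never be predicted at probability 0.2; as cases are added, the cutoffs can simply be recomputed.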
The following specific examples are intended to illustrate the invention, but are not intended to limit the scope of the invention.
In this embodiment, the machine-learning-based automatic pollution source type identification method of the invention uses micro-stations to obtain monitoring data with high spatial and temporal resolution in a monitoring area; the monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, and classification is based on time-varying concentration features and geographic information. As shown in fig. 1, the method mainly includes the following steps:
obtaining, through micro-stations, monitoring data with high spatial and temporal resolution in the monitoring area, such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, together with time and geographic information;
establishing a typical pollution case library based on expert judgment;
developing, based on a machine learning algorithm, a pollution source type identification algorithm model aimed at the data features of pollution source emissions;
using the algorithm model to mark abnormal data in the real-time monitoring data, identify the type of pollution source, and raise an alarm automatically;
then, experts review the model identification result according to the alarm information to determine whether it is accurate; if the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed.
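The review step that closes this loop can be sketched as follows. The data classes, field names, and return strings are hypothetical and serve only to make the two outcomes concrete: a confirmed event joins the case library, while a rejected one loses its pollution flag.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PollutionEvent:
    site: str
    start: str
    end: str
    predicted_type: str   # type code output by the model
    flagged: bool = True  # set when the model marks abnormal data


@dataclass
class CaseLibrary:
    cases: List[PollutionEvent] = field(default_factory=list)


def review_event(event: PollutionEvent, expert_type: Optional[str],
                 library: CaseLibrary) -> str:
    """Apply the expert's verdict; expert_type=None means no source was found."""
    if expert_type == event.predicted_type:
        # Correct identification: keep the flag and grow the case library,
        # so the next retraining can use this event.
        library.cases.append(event)
        return "confirmed"
    # Wrong identification: remove the pollution event flag.
    event.flagged = False
    return "unflagged"


library = CaseLibrary()
event = PollutionEvent("site_A", "2019-09-07 14:00", "2019-09-08 07:00", "a")
print(review_event(event, "a", library), len(library.cases))
```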
In the following embodiments, the classification of the high spatial and temporal resolution site pollution types in the monitored area comprises the following steps:
1. Monitoring data with high spatial and temporal resolution in the monitoring area (PM10, PM2.5, SO2, NO2, CO, O3, temperature, humidity, and the like), together with time and geographic information, is obtained through micro-stations.
Because different pollution types show different characteristics in how pollutant concentrations change, various features are extracted from the time-series pollution data based on basic statistics of the data. Geographic information, emission inventory information, and information obtained from expert judgment are then converted into corresponding feature variables, such as the pollution sources around a site, the road network density around a site, and time-series distance. In total there are 140 features.
The features involved for each pollutant, and how some of them are calculated, are as follows:
the following are computed for the 6 pollutants (PM10, PM2.5, SO2, NO2, O3, CO) and AQI, grouped by case:
diff1_acf10: sum of squares of the first 10 autocorrelation coefficients of the first-differenced series;
diff1_acf1: first autocorrelation coefficient of the first-differenced series;
x_acf1: first autocorrelation coefficient;
x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients;
diff2x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series;
std1st_der: standard deviation of the first derivative;
also grouped by case for the 6 pollutants and AQI: the average, sum, maximum, quartiles, coefficient of variation, mean, standard deviation, median, variance, skewness, and kurtosis, and the hour at which the AQI reaches its maximum; the correlation coefficients between the six pollutants and AQI; and the primary pollutant.
Pollution sources around the site: based on the pollution source information and emission inventory information around the sites, the number of pollution sources of each type around each site is obtained and used as a feature value;
Road network density around the site: to account for the influence of motor vehicle emissions on the pollutant data, the road network density around the site is derived from the surrounding road network using geographic information system technology and used as a feature value;
Time-series distance feature: similarity between the time series of different pollutants, measured by dynamic time warping (DTW) distance.
A number of feature variables are then screened from all the variables considered, according to their variable importance in a random forest model. The following 38 data features, based on the pollution data together with geographic information, emission inventory information, and information obtained from expert judgment, are finally selected as the basis for pollution type classification.
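Several of the per-pollutant statistics listed above can be computed directly from one case's hourly series. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, the first difference stands in for the first derivative, population (not sample) standard deviation is assumed, and kurtosis is reported in its non-excess form. The original implementation may differ on each of these points.

```python
from statistics import mean, pstdev


def diff(xs):
    """First difference: x[t] - x[t-1]."""
    return [b - a for a, b in zip(xs, xs[1:])]


def acf(xs, lag):
    """Sample autocorrelation coefficient at the given lag."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    num = sum((xs[i] - m) * (xs[i + lag] - m) for i in range(len(xs) - lag))
    return num / denom


def case_features(xs):
    """Features for one pollutant over one case; needs at least 12 values
    and a series whose level and differences are not constant."""
    d1 = diff(xs)
    m, sd = mean(xs), pstdev(xs)
    z = [(x - m) / sd for x in xs]
    return {
        "x_acf1": acf(xs, 1),
        "diff1_acf1": acf(d1, 1),
        "diff1_acf10": sum(acf(d1, k) ** 2 for k in range(1, 11)),
        "std1st_der": pstdev(d1),  # assumption: difference approximates derivative
        "cv": sd / m,              # coefficient of variation
        "skewness": mean([v ** 3 for v in z]),
        "kurtosis": mean([v ** 4 for v in z]),  # non-excess: a normal curve gives 3
    }


# Hypothetical hourly series that alternates between two levels.
series = [0, 1] * 6 + [0]
features = case_features(series)
print(sorted(features))
```

In use, `case_features` would be called once per pollutant (and once for AQI) for each case, with the pollutant name prefixed to the feature name to give columns such as `co_std1st_der`.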
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location of the pollution case's site as judged by experts, such as main road, sensitive point, town, construction site, or environmental background point;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of AQI;
Feature 9: aqi_cv; coefficient of variation of AQI;
Feature 10: data; hour at which the AQI reaches its maximum;
Feature 11: distance_dtw; similarity between the time series of different pollutants, using DTW distance;
Feature 12: aqi_sum; sum of AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum value of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; sum of SO2 plus sum of CO;
Feature 19: so2_max; maximum value of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; median of SO2;
Feature 28: co_mean; median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the site, obtained from the pollution source information around the site;
Feature 38: polarization_type; number of pollution sources of each type around the site, obtained from the emission inventory.
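The DTW-based similarity behind the time-series distance feature follows the standard dynamic-programming recurrence. A minimal sketch is given below; the function name is hypothetical, absolute difference is assumed as the point-wise cost, and the text does not say which pollutant pairs are compared or whether the series are normalized first.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j]; each step may
    advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances
                                    cost[i][j - 1],      # b advances
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical hourly values: the second series is a delayed copy of the first.
pm10 = [1, 2, 3]
pm25 = [1, 2, 2, 3]
print(dtw_distance(pm10, pm25))
```

Because warping lets one series pause while the other advances, two pollutants that rise and fall in the same pattern with a time lag still receive a small distance, which a point-by-point difference would not capture.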
2. A typical pollution case library is established based on expert judgment.
The pollution type of each high spatial and temporal resolution grid can be determined by expert judgment from the pollution data and other information. In this embodiment the pollution types determined comprise 9 types: fugitive dust; motor vehicles; heavy vehicles, machinery, and ships; cooking fumes; coal burning; unorganized burning; enterprises; fireworks and firecrackers; and processes involving VOCs.
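For multi-label training, each case's expert verdict can be stored as a 0/1 indicator vector over the 9 types. The sketch below assumes letter codes a to i assigned in the order listed; only a (fugitive dust) and g (enterprise) are confirmed by the worked example in this embodiment, and the remaining codes and the helper names are hypothetical.

```python
# Letter codes: "a" and "g" match the embodiment; the others are assumed
# from the order in which the 9 types are listed.
POLLUTION_TYPES = [
    ("a", "fugitive dust"),
    ("b", "motor vehicles"),
    ("c", "heavy vehicles, machinery, and ships"),
    ("d", "cooking fumes"),
    ("e", "coal burning"),
    ("f", "unorganized burning"),
    ("g", "enterprises"),
    ("h", "fireworks and firecrackers"),
    ("i", "processes involving VOCs"),
]


def encode(codes):
    """Set of type codes -> 0/1 indicator vector, one slot per pollution type."""
    return [1 if code in codes else 0 for code, _ in POLLUTION_TYPES]


def decode(vector):
    """0/1 indicator vector -> list of type codes, in listing order."""
    return [code for (code, _), bit in zip(POLLUTION_TYPES, vector) if bit]


# A compound pollution case: dust and an enterprise source at the same time.
target = encode({"a", "g"})
print(target, decode(target))
```

Each column of the resulting matrix is the training target for one binary classifier.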
3. A pollution source type identification algorithm model aimed at the data features of pollution source emissions is developed based on a machine learning algorithm.
The 38 features selected by the invention are calculated from the pollution data and other information, and the feature data for each grid is labeled with the pollution type judged by experts to serve as training data for the model. Machine learning algorithms are used, specifically random forests, neural networks, support vector machines, gradient boosting machines, and the like. A combined model is adopted that includes sub-models based on curve shape (time-series shape), automatic feature extraction by deep neural networks, and the like, to train the data model, so that the proposed dimensions and feature classification can be better understood and the accuracy of pollution type classification improved.
4. The algorithm model is used to mark abnormal data in the real-time monitoring data, identify the type of pollution source, and raise an alarm automatically.
The training set of the algorithm model also contains pollution-free time-series data. After the pollution data of a grid is obtained, the 38 proposed features are calculated and input into the trained mathematical model, which outputs the classification result for the grid's pollution type. In an early-stage test, two standard air stations and three micro-stations (at the southeast corner of a steel enterprise, at a sewage treatment plant, and on the city's north ring road) in a certain city were selected at random, and data from September 1, 2019 onward was used. A segmentation function divided the data into segments, pollution segments were screened using concentration conditions for the different pollutants, and each segment was predicted with the model built from the cases to obtain its pollution type. The information on the resulting site pollution segments was then sent back to experts for a second judgment.
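The description does not specify the segmentation function or the concentration conditions. One plausible reading, shown here purely as an assumption, is to cut the hourly series into contiguous runs above a concentration threshold and keep only runs of a minimum length. The function name, the threshold of 75, and the minimum length of 2 hours are all hypothetical.

```python
def pollution_segments(values, threshold, min_length=2):
    """Return (start, end) index pairs, end exclusive, for each contiguous
    run of values above the threshold that lasts at least min_length steps."""
    segments = []
    start = None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i                       # a run begins
        elif v <= threshold and start is not None:
            if i - start >= min_length:     # a run ends; keep it if long enough
                segments.append((start, i))
            start = None
    if start is not None and len(values) - start >= min_length:
        segments.append((start, len(values)))  # run still open at the end
    return segments


# Hypothetical hourly PM10 readings for one site.
hourly = [10, 80, 90, 20, 95, 30, 85, 88, 91]
print(pollution_segments(hourly, threshold=75))
```

Each returned segment would then be turned into the 38 features and passed to the trained model, and the predicted type for that segment sent on for expert review.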
5. Experts review the model identification result according to the alarm information and determine whether it is accurate. If the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed. For example, the pollution type at the site Tangshan Ceramics for the period 2019/9/7 14:00 to 2019/9/8 7:00 was identified as a (fugitive dust); the expert group's second judgment was also a (fugitive dust), so the model result matched the expert judgment, the case could be added to the case library as a supplementary case, and the pollution source was dealt with. The pollution type at the site suburb sewage treatment plant 842 for the period 2019/9/212:00-2019/9/2111:00 was identified as g (enterprise), but the expert group found no obvious pollution source in its second judgment; the expert review thus disagreed with the model identification result, and the pollution event flag was removed.
Those skilled in the art will appreciate that the identification accuracy of the model of the invention increases as pollution events are added to the case library.
Although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention.

Claims (10)

1. A machine-learning-based method for automatically identifying pollution source types, characterized by comprising the following steps:
step one: based on environmental monitoring data, time, and geographic information, identifying the occurrence of pollution problems and judging the type of pollution source through analysis and judgment, and establishing a typical pollution case library;
step two: based on a machine learning algorithm, extracting data features using the mass data of the case library as samples, and developing a pollution source type identification algorithm model;
step three: monitoring real-time monitoring data with the algorithm model; if abnormal data is found, marking it as a pollution event and further identifying the type of source causing the pollution, thereby achieving online identification of pollution source emissions and automatic alarming;
step four: verifying the model identification result, by review or on-site inspection, according to the alarm information; if the pollution problem really exists, dealing with it and adding the event to the typical case library for continuous optimization of the algorithm model; if the identification result is not accurate, removing the pollution event flag.
2. The method for automatically identifying the type of the pollution source based on the machine learning as claimed in claim 1, wherein:
the environmental monitoring data includes: PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, as well as time and geographic information;
the typical case library is a set of reviewed cases that describe pollution events on the basis of the environmental monitoring data, each case comprising the following data: the start and end time of the pollution event, the name and coordinates of the affected site, the types of affected parameters, the local meteorological conditions at the time, and the pollution source type as judged by experts;
the affected parameters are parameters obtained at high spatial and temporal resolution in the monitored area through micro-stations, and the parameter types comprise at least the 6 pollutants PM10, PM2.5, SO2, NO2, CO, and O3, as well as VOCs; the meteorological conditions include wind direction, wind speed, temperature, and humidity; the pollution source types include dust sources, mobile sources, coal-fired sources, cooking fume sources, and industrial sources.
3. The machine-learning-based automatic pollution source type identification method according to claim 1 or 2, characterized in that in step one, various features are extracted from the time-series pollution data based on basic statistics of the data, and geographic information, emission inventory information, and information obtained from expert judgment are converted into corresponding feature variables.
4. The method for automatically identifying the type of the pollution source based on machine learning as claimed in claim 3, wherein the extracted features and their calculation are as follows:
the following are computed for the 6 pollutants and AQI, grouped by case:
diff1_acf10: sum of squares of the first 10 autocorrelation coefficients of the first-differenced series;
diff1_acf1: first autocorrelation coefficient of the first-differenced series;
x_acf1: first autocorrelation coefficient;
x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients;
diff2x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series;
std1st_der: standard deviation of the first derivative;
also grouped by case for the 6 pollutants and AQI: the average, sum, maximum, quartiles, coefficient of variation, mean, standard deviation, median, variance, skewness, and kurtosis, and the hour at which the AQI reaches its maximum; the correlation coefficients between the six pollutants and AQI; and the primary pollutant;
pollution sources around the site: based on the pollution source information and emission inventory information around the sites, the number of pollution sources of each type around each site is obtained and used as a feature value;
road network density around the site: to account for the influence of motor vehicle emissions on the pollutant data, the road network density around the site is derived from the surrounding road network using geographic information system technology and used as a feature value;
time-series distance feature: similarity between the time series of different pollutants, measured by dynamic time warping (DTW) distance.
5. The method of claim 4, wherein a number of feature variables are screened from all the variables considered according to their variable importance in a random forest model, and the following 38 data features, based on the pollution data together with geographic information, emission inventory information, and information obtained from expert judgment, are selected as the basis for pollution type classification:
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location of the pollution case's site as judged by experts, such as main road, sensitive point, town, construction site, or environmental background point;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of AQI;
Feature 9: aqi_cv; coefficient of variation of AQI;
Feature 10: data; hour at which the AQI reaches its maximum;
Feature 11: distance_dtw; similarity between the time series of different pollutants, using DTW distance;
Feature 12: aqi_sum; sum of AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum value of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; sum of SO2 plus sum of CO;
Feature 19: so2_max; maximum value of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; median of SO2;
Feature 28: co_mean; median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the site, obtained from the pollution source information around the site;
Feature 38: polarization_type; number of pollution sources of each type around the site, obtained from the emission inventory;
the data features extracted by the algorithm model comprise: 1. standard deviation of the first derivative of PM2.5; 2. standard deviation of the first derivative of CO; 3. standard deviation of the first derivative of SO2; 4. sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series; 5. maximum value of CO; 6. primary pollutant; 7. skewness of AQI; 8. first autocorrelation coefficient of PM10; 9. quartiles of CO; 10. first autocorrelation coefficient of PM2.5; 11. coefficient of variation of AQI; 12. coefficient of variation of CO; 13. standard deviation of the first derivative of PM10; 14. sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series; 15. sum of AQI; 16. sum of SO2 plus sum of CO; 17. skewness of PM10; 18. maximum value of O2; 19. sum of SO2; 20. median of CO; 21. sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series; 22. kurtosis of NO2; 23. sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series; 24. first autocorrelation coefficient of AQI; 25. first autocorrelation coefficient of the first-differenced CO series; 26. first autocorrelation coefficient of CO; 27. first autocorrelation coefficient of the first-differenced SO2 series; 28. sum of CO; 29. median of SO2; 30. kurtosis of PM2.5; 31. first autocorrelation coefficient of the first-differenced PM2.5 series; 32. sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series; 33. kurtosis of SO2; 34. hour at which the AQI reaches its maximum; 35. coefficient of variation of SO2; 36. correlation coefficient between PM10 and CO; 37. correlation coefficient between SO2 and CO; 38. correlation coefficient between NO2 and CO.
6. The method for automatically identifying the type of the pollution source based on machine learning as claimed in claim 1, wherein: the pollution source type identification algorithm model is a multi-label classification model built on the existing cases and on cases added later, that is, it can express compound pollution formed by several pollution types that may be present at the same place in the same time period; and a combination strategy is adopted to build the pollution source type identification algorithm model.
7. The method for automatically identifying the type of the pollution source based on machine learning as claimed in claim 6, wherein: the combination strategy is binary relevance, classifier chains, or nested stacking; a cutoff parameter is set in each classifier according to the proportion of each pollution type in the training data, so as to address the imbalance of the training samples and improve the prediction accuracy for occasional pollution types.
8. The method for automatically identifying the type of the pollution source based on machine learning as claimed in claim 7, wherein: the combination strategy is binary relevance, in which a binary classifier is established for each label, each answering the simple question of whether the sample belongs to that type; the model is thus divided into several binary classifiers that are then combined; at prediction time each label is predicted independently, without considering dependencies between labels, and the results are merged into a multi-label target; binary relevance has computational complexity linear in the number of labels and is easily parallelized, that is, the binary classifiers for all labels are built at the same time, which increases the running speed.
9. The method according to claim 8, wherein the selected 38 features are calculated from the pollution data and other information, and the feature data for each grid is labeled with the judged pollution type to serve as training data for the model; machine learning algorithms are used, specifically random forests, neural networks, support vector machines, gradient boosting machines, and the like, and a combined model is adopted that includes sub-models based on curve shape (time-series shape), automatic feature extraction by deep neural networks, and the like, to train the data model, so that the proposed dimensions and feature classification can be better understood and the accuracy of pollution type classification improved.
10. The method of claim 1, wherein the training set of the algorithm model also contains pollution-free time-series data; after the pollution data of a grid is obtained, the 38 proposed features are calculated and input into the trained mathematical model, which outputs the classification result for the grid's pollution type.
CN202010846058.3A 2020-08-21 2020-08-21 Automatic pollution source type identification method based on machine learning Active CN111985567B (en)

Publications (2)

Publication Number Publication Date
CN111985567A true CN111985567A (en) 2020-11-24
CN111985567B CN111985567B (en) 2022-11-22

Family

ID=73443859

Country Status (1)

Country Link
CN (1) CN111985567B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103234883A (en) * 2013-04-30 2013-08-07 中南大学 Road traffic flow-based method for estimating central city PM2.5 in real time
CN104899596A (en) * 2015-03-16 2015-09-09 景德镇陶瓷学院 Multi-label classification method and apparatus thereof
CN106844626A (en) * 2017-01-20 2017-06-13 武汉大学 Using microblogging keyword and the method and system of positional information simulated air quality
CN107608009A (en) * 2017-09-15 2018-01-19 深圳市卡普瑞环境科技有限公司 A kind of air quality surveillance equipment, processing terminal and server
CN110870019A (en) * 2017-10-16 2020-03-06 因美纳有限公司 Semi-supervised learning for training deep convolutional neural network sets
CN108764013A (en) * 2018-03-28 2018-11-06 中国科学院软件研究所 Automatic communication signal recognition method based on end-to-end convolutional neural networks
CN110186820A (en) * 2018-12-19 2019-08-30 河北中科遥感信息技术有限公司 Multisource data fusion and environmental pollution source and pollutant distribution analysis method
CN109740560A (en) * 2019-01-11 2019-05-10 济南浪潮高新科技投资发展有限公司 Human cellular protein automatic identifying method and system based on convolutional neural networks
CN110006799A (en) * 2019-02-14 2019-07-12 北京市环境保护监测中心 Method for classifying hot spot grid pollution types
CN111121862A (en) * 2019-09-29 2020-05-08 广西中遥空间信息技术有限公司 Air-space-ground integrated atmospheric environment monitoring system and method
CN111461184A (en) * 2020-03-19 2020-07-28 南京理工大学 XGB multi-dimensional operation and maintenance data anomaly detection method based on multivariate feature matrix

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634113B (en) * 2020-12-22 2023-09-26 山西大学 Pollution waste gas correlation analysis method based on dynamic sliding window
CN112634113A (en) * 2020-12-22 2021-04-09 山西大学 Polluted waste gas correlation analysis method based on dynamic sliding window
CN112990024A (en) * 2021-03-18 2021-06-18 深圳博沃智慧科技有限公司 Method for monitoring urban dust
CN112990024B (en) * 2021-03-18 2024-03-26 深圳博沃智慧科技有限公司 Urban dust monitoring method
CN113295635A (en) * 2021-05-27 2021-08-24 河北先河环保科技股份有限公司 Water pollution alarm method based on dynamic update data set
CN113688940A (en) * 2021-09-09 2021-11-23 浙江大学 Suspected pollution industrial enterprise identification method based on public data
CN113706127A (en) * 2021-10-22 2021-11-26 长视科技股份有限公司 Water area analysis report generation method and electronic equipment
CN114332540A (en) * 2021-12-31 2022-04-12 北京建筑大学 Building automation system data marking method and system based on big data
CN114332540B (en) * 2021-12-31 2024-10-29 北京建筑大学 Big data-based building automation system data marking method and system
CN114693003A (en) * 2022-05-23 2022-07-01 成都秦川物联网科技股份有限公司 Smart city air quality prediction method and system based on Internet of things
US11776081B1 (en) * 2022-05-23 2023-10-03 Chengdu Qinchuan Iot Technology Co., Ltd. Methods and systems for predicting air quality in smart cities based on an internet of things
US20230394611A1 (en) * 2022-05-23 2023-12-07 Chengdu Qinchuan Iot Technology Co., Ltd. Method and system for area management in smart city based on internet of things
US12056782B2 (en) 2022-05-23 2024-08-06 Chengdu Qinchuan Iot Technology Co., Ltd. Method and system for area management in smart city based on internet of things
CN115018348A (en) * 2022-06-20 2022-09-06 北京北投生态环境有限公司 Environment analysis method, system, equipment and storage medium based on artificial intelligence
US20230419823A1 (en) * 2022-06-28 2023-12-28 Chengdu Qinchuan Iot Technology Co., Ltd. Methods and systems for managing exhaust emission in a smart city based on industrial internet of things
CN115358718A (en) * 2022-08-24 2022-11-18 广东旭诚科技有限公司 Noise pollution classification and real-time supervision method based on intelligent monitoring front end
CN115792919A (en) * 2023-01-19 2023-03-14 合肥中科光博量子科技有限公司 Method for identifying pollution hot spot area through horizontal scanning and monitoring of aerosol laser radar
RU2818685C1 (en) * 2023-06-19 2024-05-03 федеральное государственное автономное образовательное учреждение высшего образования "Национальный исследовательский университет "Высшая школа экономики" Method of identifying a source of emission of harmful substances into the atmosphere based on artificial intelligence technology
CN117057819A (en) * 2023-08-15 2023-11-14 泰华智慧产业集团股份有限公司 Rainwater pipe network sewage discharge traceability analysis method and system
CN116912069A (en) * 2023-09-13 2023-10-20 成都市智慧蓉城研究院有限公司 Data processing method applied to smart city and electronic equipment
CN116912069B (en) * 2023-09-13 2024-01-02 成都市智慧蓉城研究院有限公司 Data processing method applied to smart city and electronic equipment
CN117473398B (en) * 2023-12-26 2024-03-19 四川国蓝中天环境科技集团有限公司 Urban dust pollution source classification method based on slag transport vehicle activity
CN117473398A (en) * 2023-12-26 2024-01-30 四川国蓝中天环境科技集团有限公司 Urban dust pollution source classification method based on slag transport vehicle activity
CN117633661B (en) * 2024-01-26 2024-04-02 西南交通大学 Slag car high-risk pollution source classification method based on evolution diagram self-supervised learning
CN117633661A (en) * 2024-01-26 2024-03-01 西南交通大学 Slag car high-risk pollution source classification method based on evolution diagram self-supervised learning

Also Published As

Publication number Publication date
CN111985567B (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN111985567B (en) Automatic pollution source type identification method based on machine learning
CN115578015B (en) Sewage treatment whole process supervision method, system and storage medium based on Internet of things
Kleine Deters et al. Modeling PM2.5 urban pollution using machine learning and selected meteorological parameters
CN116186566B (en) Diffusion prediction method and system based on deep learning
CN110288001B (en) Target recognition method based on target data feature training learning
CN108595414B (en) Soil heavy metal enterprise pollution source identification method based on source-sink space variable reasoning
CN112307884A (en) Forest fire spreading prediction method based on continuous time sequence remote sensing situation data and electronic equipment
CN116359218B (en) Industrial aggregation area atmospheric pollution mobile monitoring system
Van et al. A new model of air quality prediction using lightweight machine learning
CN111008337A (en) Deep attention rumor identification method and device based on ternary characteristics
CN115438848A (en) PM2.5 long-term concentration prediction method based on deep mixed graph neural network
KR102564191B1 (en) Disaster response system that detects and responds to disaster situations in real time
Al_Janabi et al. Pragmatic method based on intelligent big data analytics to prediction air pollution
CN113935228A (en) L-band rough sea surface radiation brightness and temperature simulation method based on machine learning
CN115761439A (en) Boiler inner wall sink detection and identification method based on target detection
CN115146537A (en) Atmospheric pollutant emission estimation model construction method and system based on power consumption
Kim et al. Massive scale deep learning for detecting extreme climate events
CN109213840B (en) Hot spot grid identification method based on multidimensional feature deep learning
CN113267601B (en) Industrial production environment remote real-time monitoring cloud platform based on machine vision and data analysis
CN114527235A (en) Real-time quantitative detection method for emission intensity
Senior-Williams et al. The Classification of Tropical Storm Systems in Infrared Geostationary Weather Satellite Images Using Transfer Learning
CN110543675A (en) Power transmission line fault identification method
KR20230167856A (en) Visibility Prediction Method using Tree-based Machine Learning Algorithm and Meteorological Forecasting Data
Srijiranon et al. Investigation of PM10 prediction utilizing data mining techniques: Analyze by topic
CN113935394A (en) Apparatus and method for environmental monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant