CN111985567B - Automatic pollution source type identification method based on machine learning

Info

Publication number
CN111985567B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010846058.3A
Other languages
Chinese (zh)
Other versions
CN111985567A (en)
Inventor
王春迎
詹宇
马景金
马红楠
张朝
王振强
张仕富
吴秦慧姿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Advanced Environmental Protection Industry Innovation Center Co ltd
Hebei Sailhero Environmental Protection High Tech Co ltd
Original Assignee
Hebei Advanced Environmental Protection Industry Innovation Center Co ltd
Hebei Sailhero Environmental Protection High Tech Co ltd
Application filed by Hebei Advanced Environmental Protection Industry Innovation Center Co ltd and Hebei Sailhero Environmental Protection High Tech Co ltd
Priority to CN202010846058.3A
Publication of CN111985567A
Application granted
Publication of CN111985567B

Classifications

    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G01D21/02 Measuring two or more variables by means not covered by a single other subclass
    • G01N15/06 Investigating concentration of particle suspensions
    • G01N33/0037 Gas analysers with a detector specially adapted to detect a particular component: NOx
    • G01N33/0039 Gas analysers with a detector specially adapted to detect a particular component: O3
    • G01N33/004 Gas analysers with a detector specially adapted to detect a particular component: CO or CO2
    • G01N33/0042 Gas analysers with a detector specially adapted to detect a particular component: SO2 or SO3
    • G01N33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method or the display, e.g. intermittent measurement or digital display
    • G06F18/24 Classification techniques
    • G06F18/2431 Classification techniques relating to the number of classes: Multiple classes
    • G06N20/00 Machine learning
    • G06N3/08 Neural networks: Learning methods
    • G06Q50/26 Government or public services
    • Y02A50/20 Air quality improvement or preservation, e.g. vehicle emission control or emission reduction by using catalytic converters

Abstract

An automatic pollution source type identification method based on machine learning, comprising the following steps: based on environmental monitoring data together with time and geographic information, identifying the occurrence of pollution problems and judging the pollution source type through analysis, and establishing a typical pollution case library; based on a machine learning algorithm, extracting data features using the case library data as samples, and developing a pollution source type identification model; monitoring real-time data with the model, marking abnormal data as a pollution event when found, then identifying the type of source causing the pollution, so as to achieve online identification of pollution source emissions with automatic alarms; auditing the model's identification result or checking it on site according to the alarm information and, if the pollution problem is confirmed, handling it and adding the event to the typical case library for continuous optimization of the model; and, if the identification result is inaccurate, removing the pollution event mark. Because the method is based on monitoring data from gridded micro stations, small stations and the like, more data can be brought into the data source and the model can be further optimized.

Description

Automatic pollution source type identification method based on machine learning
Technical Field
The invention relates to the field of atmospheric environment monitoring, in particular to a pollution source type automatic identification method based on machine learning.
Background
In the field of atmospheric environment monitoring, traditional monitoring relies on standard air quality stations. Because these stations are expensive, few are deployed and they generate little data, so fine-grained pollution problems are difficult to capture accurately. Micro stations based on sensor methods are inexpensive and can be deployed at many points, yielding monitoring data with high spatial and temporal resolution across the monitored area. The monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, with a spatial resolution of up to 1 km x 1 km and a temporal resolution of 1 h. Such massive environmental monitoring data make it possible to establish the relationship between pollution sources and air quality. Through manual analysis, existing pollution problems can be found from the data characteristics and the source type of the air pollution can be judged, including dust sources, mobile sources, coal-fired sources, catering oil smoke sources, industrial sources and the like. This narrows the scope of investigation, improves its accuracy, raises supervision efficiency and saves manpower in on-site investigation of environmental problems.
However, finding pollution problems and source types from massive monitoring data currently requires a great deal of manpower and time and depends heavily on the technical level and experience of the analysts. As a result, the overall process is inefficient, its timeliness is poor, its quality is limited by the skill of the personnel involved, and it is difficult to support environmental management effectively. A calculation method that can identify the pollution source type efficiently, quickly and stably is therefore needed.
At present, existing patented pollution source identification techniques are based on hotspot grids rather than real-time monitoring data. For example, Chinese patent CN110147383A, entitled "Method and apparatus for determining pollution source type", discloses a method that determines the pollution source type of a polluted grid by setting a preset concentration value and a preset concentration difference and combining them with wind speed, wind direction and the pollution sources present in the grid. Chinese patent CN110006799A, entitled "Classification method of hotspot grid pollution types", discloses a method that classifies atmospheric hotspot grid pollution types according to how atmospheric pollutant concentrations change over time. These techniques have the following disadvantages: first, the temporal and spatial resolution of hotspot grid data is low, so pollution source identification relies mainly on historical data, cannot guide pollution tracing in real time, and yields results that are difficult to verify scientifically and effectively; second, satellite inversion data are constrained by meteorological conditions such as cloud cover, so accuracy cannot be guaranteed and effective tracing cannot be achieved; third, hotspot grid data reflect the air quality of the grid area rather than of the surroundings of the pollution source, so the pollution source type is difficult to distinguish from the data characteristics; fourth, the identification approach is limited to a single mode with few characteristic parameters, whereas there are at least 6 types of pollution source with different pollution characteristics, and those characteristics cannot be described accurately.
Disclosure of Invention
In order to solve the above problems, the present invention provides an automatic pollution source type identification method based on machine learning. The method uses the parameters of each pollutant together with time and spatial coordinate information, and the spatial information takes part in the model computation that identifies a pollution process; that is, the differences between the data of a target grid and the data of its surrounding grids are considered, rather than only the trend of the data over a time series.
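The idea of comparing a target grid with its surrounding grids can be illustrated with a minimal sketch. The patent does not give a formula for this comparison, so the function name `spatial_contrast`, the 8-cell neighbourhood and the use of a simple mean are all assumptions made for illustration only.

```python
from statistics import mean


def spatial_contrast(grid, row, col):
    """Target-cell concentration minus the mean of its existing neighbours.

    Assumption: the neighbourhood is the (up to) 8 surrounding cells; the
    patent only says that differences to surrounding grids are considered.
    """
    neighbours = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the target cell itself
            r, c = row + dr, col + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                neighbours.append(grid[r][c])
    return grid[row][col] - mean(neighbours)


# Hypothetical hourly PM2.5 values on a 3 x 3 block of grids.
pm25 = [
    [35.0, 36.0, 34.0],
    [37.0, 80.0, 35.0],
    [36.0, 35.0, 34.0],
]
centre_contrast = spatial_contrast(pm25, 1, 1)
corner_contrast = spatial_contrast(pm25, 0, 0)
```

A large positive contrast at the centre cell suggests a local source there, while the negative value at the corner shows a cell that is cleaner than its surroundings. A grid with a single cell has no neighbours and would need separate handling.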
In order to achieve this purpose, the invention provides an automatic pollution source type identification method based on machine learning, which mainly comprises the following steps:
step one, based on monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, together with time and geographic information, identifying the occurrence of pollution problems and judging the pollution source type through (expert) analysis, and establishing a typical pollution case library;
step two, based on a machine learning algorithm, extracting data features using the massive data of the case library as samples, and developing a pollution source type identification model;
step three, monitoring real-time data with the model, marking abnormal data as a pollution event when found, then identifying the type of source causing the pollution, so as to achieve online identification of pollution source emissions with automatic alarms;
step four, having experts audit the model's identification result or check it on site according to the alarm information; if the pollution problem is confirmed, handling it and adding the event to the typical case library for continuous optimization of the model; and, if the identification result is inaccurate, removing the pollution event mark.
The identification algorithm adopted by the method starts from monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature and humidity, together with time and geographic information. Through analysis (for example, manual judgment by experts), the occurrence of pollution problems is identified, the pollution source type is judged, and a typical pollution case library is established. Then, based on a machine learning algorithm, data features are extracted using the massive data of the case library as samples, and a pollution source type identification model is developed. The model monitors real-time data, marks abnormal data as a pollution event when found, and then identifies the type of source causing the pollution, achieving online identification of pollution source emissions with automatic alarms. Furthermore, experts can audit the model's identification result or check it on site according to the alarm information; if the pollution problem is confirmed, it is handled and the event is added to the typical case library for continuous optimization of the model; if the identification result is inaccurate, the pollution event mark is removed.
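The review step at the end of this loop can be sketched as follows. This is an illustrative outline only: the class names `Event` and `CaseLibrary`, the function `review` and the site names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    site: str
    predicted_source: str
    flagged: bool = True  # set when the model marks a pollution event


@dataclass
class CaseLibrary:
    cases: list = field(default_factory=list)


def review(event, confirmed, library):
    """Apply the expert's verdict to one alarmed event."""
    if confirmed:
        # Confirmed events become training samples for the next model version.
        library.cases.append(event)
    else:
        # Inaccurate identification: remove the pollution event mark.
        event.flagged = False
    return event


library = CaseLibrary()
kept = review(Event("site-A", "dust source"), confirmed=True, library=library)
dropped = review(Event("site-B", "coal-fired source"), confirmed=False, library=library)
```

Retraining on the enlarged case library is left out of the sketch; the point is only that a confirmed alarm grows the library and a rejected alarm clears the mark.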
Preferably, the training set of the algorithm model also contains time-series data from pollution-free periods. After the pollution data of a grid are obtained, the proposed 38 features are calculated and input into the trained mathematical model, which outputs the classification result for the pollution type of the grid.
By means of the above technical scheme, the invention has the following advantages over the prior art:
(1) Data source: whereas the prior art is based on hotspot grid data, the method is based on monitoring data from gridded micro stations, small stations and the like, so more data can be brought into the data source;
(2) Algorithm model: the technical scheme adopts machine learning algorithms, specifically including random forests, neural networks, support vector machines and gradient boosting machines, and uses a combined model whose sub-models include ones based on curve shape (time-series shape) and on automatic feature extraction by deep neural networks;
(3) Features: the prior art offers few selectable features and judges each grid in isolation. Through repeated research by the inventors, the algorithm of the invention can use 38 feature values in total, compares multiple sites when making a judgment, and further takes data such as surrounding pollution sources into account as feature values. This can improve the accuracy of pollution type identification, makes it possible to distinguish local sources from external sources, and overcomes the one-sidedness of single-grid analysis;
(4) Continuous model optimization: whereas the prior art uses a fixed algorithm based on historical data, the technical scheme generates a first-generation algorithm model from historical data, applies it to subsequent monitoring data to find new cases, and automatically adds those cases to the case library after they are audited by technicians, so the model can be further optimized;
(5) Whereas the prior art runs on the client, the method can run on a cloud server. This reduces client cost and concentrates large volumes of data on the server side, so that cases from many different places can be collected, the advantages of the technical scheme are fully realized, and the accuracy of the algorithm's results is further improved.
Drawings
Fig. 1 is a flowchart illustrating steps of a method for automatically identifying a pollution source type based on machine learning according to the present invention.
Detailed Description
For a better understanding of the objects, aspects and advantages of the present invention, reference is made to the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
The hotspot grid referred to in the invention is a technical unit defined by the environmental protection authorities, in which Beijing-Tianjin-Hebei and the surrounding key area of the "2+26" cities are divided into grids of 3 km x 3 km. Multiple data sources such as satellite remote sensing, ground-based air quality observation and meteorological observation are integrated; using remote sensing image recognition based on cognition and multi-source data fusion, the average PM2.5 concentration of each grid is determined through satellite remote sensing inversion of atmospheric pollutants, and key supervision areas among the hotspot grids are determined by ranking the concentration values.
Referring to Fig. 1, which shows a flowchart of the identification algorithm used in the invention, the main concepts are briefly described as follows:
1. Building the typical case library
The typical case library of the invention is a collection of cases describing pollution events that have been audited manually (by experts) on the basis of environmental monitoring data. The data contained in each case at least comprise: the start time and end time of the pollution event, the name and coordinates of the affected site, the types of affected parameters, the local meteorological conditions at the time, and the pollution source type judged by an expert. The parameter types may include PM10, PM2.5, SO2, NO2, CO, O3 and VOC; the meteorological conditions include wind direction, wind speed, temperature and humidity; and the pollution source types include dust sources, mobile sources, coal-fired sources, catering oil smoke sources, industrial sources and others.
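One possible way to hold such a case in code is a small record type. The sketch below is an assumption about layout, not the patent's storage format: the field names, units and the example values (site name, coordinates, weather) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class PollutionCase:
    start: datetime                       # start time of the pollution event
    end: datetime                         # end time of the pollution event
    site_name: str                        # name of the affected site
    longitude: float                      # site coordinates
    latitude: float
    affected_parameters: Tuple[str, ...]  # e.g. ("PM10", "PM2.5")
    wind_direction_deg: float             # meteorological conditions
    wind_speed_m_s: float
    temperature_c: float
    humidity_pct: float
    source_types: Tuple[str, ...]         # expert judgment; may hold several types

    def duration_hours(self) -> float:
        return (self.end - self.start).total_seconds() / 3600.0


# Hypothetical example record; every value here is made up for illustration.
example = PollutionCase(
    start=datetime(2020, 8, 1, 8, 0),
    end=datetime(2020, 8, 1, 14, 0),
    site_name="hypothetical-site-01",
    longitude=114.5,
    latitude=38.0,
    affected_parameters=("PM10", "PM2.5"),
    wind_direction_deg=270.0,
    wind_speed_m_s=2.5,
    temperature_c=29.0,
    humidity_pct=55.0,
    source_types=("dust source",),
)
```

Storing `source_types` as a tuple rather than a single string leaves room for composite pollution, where one event carries more than one source type.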
2. Data characterization
The data features extracted by the algorithm model of the invention comprise:
(1) standard deviation of the first derivative of PM2.5;
(2) standard deviation of the first derivative of CO;
(3) standard deviation of the first derivative of SO2;
(4) sum of squares of the first 10 first-order difference series values of PM2.5;
(5) maximum of CO;
(6) primary pollutant;
(7) skewness of AQI;
(8) first autocorrelation coefficient of PM10;
(9) quartiles of CO;
(10) first autocorrelation coefficient of PM2.5;
(11) coefficient of variation of AQI;
(12) coefficient of variation of CO;
(13) standard deviation of the first derivative of PM10;
(14) sum of squares of the first 10 first-order difference series values of CO;
(15) sum of AQI;
(16) sum of SO2 added to the sum of CO;
(17) skewness of PM10;
(18) maximum of O2;
(19) sum of SO2;
(20) median of CO;
(21) sum of the first 10 first-order difference series values of AQI;
(22) kurtosis of NO2;
(23) sum of squares of the first 10 first-order difference series values of PM10;
(24) first autocorrelation coefficient of AQI;
(25) first-order difference series of CO;
(26) first autocorrelation coefficient of CO;
(27) first-order difference series of SO2;
(28) sum of CO;
(29) median of SO2;
(30) kurtosis of PM2.5;
(31) first-order difference series of PM2.5;
(32) sum of squares of the first 10 first-order difference series values of NO2;
(33) kurtosis of SO2;
(34) hour value of the time at which AQI reaches its maximum;
(35) coefficient of variation of SO2;
(36) correlation coefficient of PM10 and CO;
(37) correlation coefficient of SO2 and CO;
(38) correlation coefficient of NO2 and CO.
These 38 features reflect, to a certain extent, how each pollutant rises and falls and how the pollutant time series correlate with one another (cross-correlation over time), and together they characterize the pollution type at each station in different periods from a statistical perspective.
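Several of these statistics can be computed with a few lines of standard-library Python. The patent does not give formulas, so the definitions below are common textbook ones chosen as assumptions: the first derivative is taken as the first difference of an hourly series, and the population standard deviation is used throughout. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def first_difference(x):
    return [b - a for a, b in zip(x, x[1:])]


def diff_std(x):
    """Standard deviation of the first difference of the series."""
    return pstdev(first_difference(x))


def coefficient_of_variation(x):
    return pstdev(x) / mean(x)


def acf1(x):
    """First (lag-1) autocorrelation coefficient."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def skewness(x):
    m, s = mean(x), pstdev(x)
    return mean([((v - m) / s) ** 3 for v in x])


def kurtosis(x):
    m, s = mean(x), pstdev(x)
    return mean([((v - m) / s) ** 4 for v in x])


def correlation(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical hourly CO series and a PM10 series that tracks it linearly.
co = [0.5, 0.6, 0.9, 1.4, 1.2, 0.8]
pm10 = [40.0 * v + 10.0 for v in co]
features = {
    "co_diff_std": diff_std(co),
    "co_cv": coefficient_of_variation(co),
    "co_acf1": acf1(co),
    "co_skewness": skewness(co),
    "co_kurtosis": kurtosis(co),
    "pm10_co_corr": correlation(pm10, co),
}
```

A constant series has zero standard deviation, so `skewness`, `kurtosis` and `correlation` would divide by zero on it; a production feature extractor would need to guard against that case.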
For example, feature 6 (NO2_diff1_acf10) represents the degree of variation of the NO2 series, feature 11 (distance_dtw) represents the similarity between the time series of different pollutants, and feature 17 (co-quantile) represents the frequency distribution of CO pollution, which can indicate to some extent whether a case is motor vehicle pollution.
However, because multivariate time series vary in complex ways and are correlated with the multivariate time series of surrounding sites, it is difficult to summarize and select by hand the time-series features that correspond to each pollution type (or case). The invention therefore uses a machine learning algorithm to weight and combine the 38 features automatically on the basis of the training data in the case library, producing a data-driven prediction model.
3. Model algorithm description/calculation formula
The technical scheme establishes a multi-label classification model for existing cases and cases added later, because composite pollution formed by a combination of several pollution types may occur at the same time and place. An example is shown in Table 1 (which does not include all pollution types). Each row corresponds to one case, i.e. one pollution event; X is the collection of selected feature values, with X1, X2, X3, X4, X5 and X6 being the feature values of the corresponding cases; Y1, Y2, Y3, Y4 and Y5 are different pollution types, called labels in the multi-label model, where 1 indicates that the case belongs to the type and 0 that it does not. The model adopts a combination strategy; the main such strategies include binary relevance, classifier chains and nested stacking.
X     Y1    Y2    Y3    Y4    Y5
X1    1     0     0     0     0
X2    0     1     1     0     0
X3    0     0     0     1     0
X4    0     0     0     0     1
X5    0     1     0     0     0
X6    1     0     1     0     0
TABLE 1 Multi-label model example
The invention mainly uses the binary relevance strategy. Its principle is to build one binary classifier per label; each binary classification is a simple problem, namely whether the case belongs to that type or not. As shown in Table 2, the model is split into five binary classifiers, which are then combined: at prediction time each label is predicted independently, without considering dependencies between labels, and the results are merged into a multi-label output. Binary relevance has computational complexity linear in the number of labels and is easy to parallelize, i.e. the binary classifiers for all labels can be built at the same time, which speeds up computation. In addition, machine learning models (e.g. random forests) under their default parameter configuration tend to overlook pollution types that are under-represented in the training samples when making predictions. In this algorithm, the cutoff parameter of each binary classifier is adjusted according to the proportion of each pollution type in the training samples, so that the optimized model predicts every pollution type in a balanced way and overall prediction performance improves.
TABLE 2 Binary relevance strategy example
[Table provided as an image in the original publication (Figure BDA0002643097040000081); not reproduced here]
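Binary relevance with a per-label cutoff can be sketched in a few lines. Two things in this sketch are assumptions rather than the patent's method: a k-nearest-neighbour vote stands in for the per-label classifier (the patent names random forests and similar algorithms), and the cutoff for each label is set equal to that label's share of the training rows, since the patent states only that the cutoff is adjusted according to that proportion. The class and function names are hypothetical.

```python
def knn_probability(X_train, y_train, x, k):
    """Fraction of the k nearest training rows that carry the label."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    return sum(y_train[i] for i in order[:k]) / k


class BinaryRelevance:
    """One independent binary problem per label column."""

    def fit(self, X, Y):
        self.X = X
        self.columns = [list(col) for col in zip(*Y)]
        # Assumed rule: cutoff = proportion of training rows with the label.
        self.cutoffs = [sum(col) / len(col) for col in self.columns]
        return self

    def predict(self, x, k=3):
        result = []
        for col, cut in zip(self.columns, self.cutoffs):
            p = knn_probability(self.X, col, x, k)
            result.append(int(p > 0 and p >= cut))
        return result


# Hypothetical two-feature cases; label 0 is common, label 1 is rare.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.2, 0.1], [5, 5], [5.2, 5.1], [0.1, 0.3]]
Y = [[1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 1], [0, 1], [1, 0]]
model = BinaryRelevance().fit(X, Y)
near_rare = model.predict([5.1, 5.0], k=5)
near_common = model.predict([0.1, 0.1], k=3)
```

For the query near the rare cluster, only 2 of the 5 nearest cases carry the rare label (0.4), which a fixed 0.5 cutoff would reject; the prevalence-based cutoff of 0.25 accepts it. The same query has 0.8 support for the common label, which falls short of that label's cutoff of 0.875.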
When a binary classifier is built independently for each label, by default the same machine learning algorithm is used for every binary classifier; candidate algorithms include random forests, neural networks, support vector machines and gradient boosting machines. Following further study, different combinations of feature values and different machine learning algorithms can be tried when modeling each pollution type, and the best feature combination and algorithm are selected to build that binary classifier. Finally, the different binary classifiers are combined by binary relevance into an optimal multi-label model, so that when a new pollution event is to be predicted, its pollution type can be judged comprehensively from its feature values.
The invention builds the model with three algorithms: support vector machine, random forest and XGBoost, introduced briefly here. A support vector machine (SVM) is a generalized linear classifier that performs binary classification on data by supervised learning and can be used for both classification and regression. A random forest is an algorithm that integrates many trees following the idea of ensemble learning; it is a nonlinear classifier and can therefore uncover complex nonlinear interdependencies between variables. The basic unit of a random forest is the decision tree, a basic classifier whose main job is to choose features on which to split the data set until the data are finally assigned one of two class labels; the resulting model has a tree structure. A random forest is obtained by building many decision trees; at prediction time each tree gives a classification result, the results are put to a vote, and the final classification is decided by the majority. XGBoost is also a machine learning algorithm based on decision trees. Unlike a random forest, in which each decision tree is built independently, XGBoost grows its ensemble by repeatedly adding trees and splitting on features; each added tree in effect learns a new function that fits the residual of the previous prediction, until a stopping condition, such as the number of trees to build, is reached. At prediction time, the features of a sample lead to one leaf node in each tree, each leaf node carries a score, and the scores from all trees are added together to give the predicted value for the sample.
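The two ways of combining trees can be contrasted with a toy example. This sketch is not XGBoost itself: it uses plain gradient boosting on squared error with one-split regression stumps on a single feature, and omits XGBoost's regularization and second-order terms. All function names and the learning rate are assumptions for illustration.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Random-forest style combination: the most common class wins."""
    return Counter(tree_predictions).most_common(1)[0][0]


def fit_stump(x, residual):
    """Best single-threshold stump (threshold, left mean, right mean)."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [r for v, r in zip(x, residual) if v <= t]
        right = [r for v, r in zip(x, residual) if v > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lm if v <= t else rm)) ** 2 for v, r in zip(x, residual))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def boost(x, y, n_trees=20, lr=0.5):
    """Each new stump is fitted to the residual of the current prediction."""
    pred = [0.0] * len(x)
    stumps = []
    for _ in range(n_trees):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, residual)
        stumps.append((t, lm, rm))
        pred = [p + lr * (lm if v <= t else rm) for p, v in zip(pred, x)]
    return stumps


def boosted_predict(stumps, v, lr=0.5):
    """Boosting style combination: add up the score from every stump."""
    return sum(lr * (lm if v <= t else rm) for t, lm, rm in stumps)


vote = majority_vote(["dust source", "dust source", "coal-fired source"])
stumps = boost([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
high = boosted_predict(stumps, 5)
low = boosted_predict(stumps, 2)
```

In the vote, every tree has equal say and the trees could be built in parallel; in the boosted model, the stumps must be built in sequence because each one depends on the residual left by those before it.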
Because the number of samples for each pollution type is not necessarily balanced, which affects model accuracy to some degree, the invention addresses this point during model construction: adjusting the model parameters mitigates the effect of sample imbalance to a certain extent, and the parameters can be adjusted accordingly as cases continue to be added.
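One way to picture such a parameter adjustment is a per-label decision cutoff tied to how common each pollution type is in the training cases, in the spirit of the cutoff parameter mentioned in claim 2. The sketch below assumes a specific rule (cutoff equal to the label's share of training cases) and hypothetical classifier scores; the patent does not state the exact rule, so both are illustrative only.

```python
def prevalence_cutoffs(Y, labels):
    """Per-label cutoff equal to the label's share of training cases (assumed rule)."""
    n = len(Y)
    return {lab: sum(lab in tags for tags in Y) / n for lab in labels}


def apply_cutoffs(scores, cutoffs):
    """Keep every label whose score reaches that label's own cutoff."""
    return {lab for lab, s in scores.items() if s >= cutoffs[lab]}


# Hypothetical, imbalanced case library: fireworks events are rare.
Y = [{"dust"}] * 8 + [{"fireworks"}] + [{"dust", "fireworks"}]
cutoffs = prevalence_cutoffs(Y, ["dust", "fireworks"])

# Hypothetical vote fractions produced by the per-label classifiers.
scores = {"dust": 0.95, "fireworks": 0.30}
with_fixed_cutoff = {lab for lab, s in scores.items() if s >= 0.5}
with_prevalence_cutoff = apply_cutoffs(scores, cutoffs)
```

With a fixed 0.5 cutoff the rare label is missed, while the prevalence-based cutoff recovers it; the cutoffs would be recomputed as new cases are added.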
4. Cloud-server-based operation
During development of the algorithm model proposed by the invention, the more cases are available, the higher the prediction accuracy of the model, so environmental monitoring data from multiple cities are needed; once developed, the model can be applied to different cities. The model is therefore run in the cloud, which makes it possible to use as much data as possible, improves model accuracy, and facilitates wide deployment later.
The following specific examples are intended to illustrate the invention, but are not intended to limit the scope of the invention.
In this embodiment, the machine-learning-based method for automatically identifying the pollution source type uses micro-stations to obtain monitoring data with high spatiotemporal resolution in the monitored area. The monitored parameters include PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, and concentration features based on changes over time and on geographic information are proposed for classification. As shown in Fig. 1, the method mainly includes the following steps:
obtaining, through micro-stations, high-spatiotemporal-resolution monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity in the monitored area, together with time and geographic information;
establishing a typical pollution case library based on expert judgment;
developing, based on machine learning algorithms, a pollution source type identification model targeted at the data features of pollution source emissions;
using the algorithm model to flag abnormal data in the real-time monitoring data, identify the pollution source type, and raise an alarm automatically;
having an expert review the model's identification result against the alarm information to determine whether it is accurate; if the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed.
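The review step at the end of this workflow can be sketched as a small feedback loop. The class names, field names, and return strings below are hypothetical and chosen only for illustration; the sketch simply assumes that "correct" means the expert's labels match the model's labels.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    site: str
    features: list
    predicted: set          # pollution types output by the model
    flagged: bool = True    # set when the model raised the alarm


@dataclass
class CaseLibrary:
    cases: list = field(default_factory=list)

    def add(self, features, labels):
        self.cases.append((features, set(labels)))


def review(event, expert_labels, library):
    """Apply the expert's verdict to a flagged pollution event."""
    if expert_labels == event.predicted:
        # Confirmed: handle the source and keep the case for retraining.
        library.add(event.features, expert_labels)
        return "handle_source"
    # Not confirmed: remove the pollution event flag.
    event.flagged = False
    return "flag_removed"


library = CaseLibrary()
confirmed = Event("site A", [0.4, 1.2], {"dust"})
rejected = Event("site B", [0.1, 0.3], {"enterprise"})
outcome_confirmed = review(confirmed, {"dust"}, library)
outcome_rejected = review(rejected, set(), library)
```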
In the following embodiment, classifying the pollution types of high-spatiotemporal-resolution sites in the monitored area comprises the following steps:
1. Obtain high-spatiotemporal-resolution monitoring data such as PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity in the monitored area, together with time and geographic information, through the micro-stations.
Because different pollution types affect the change in pollutant concentration differently, various features are extracted from the time series pollution data based on basic statistics of the data; geographic information, emission inventory information, and information obtained by expert judgment are then converted into corresponding feature variables, such as pollution sources around the site, road network density around the site, and time series distance, for a total of 140 features.
The features computed for each pollutant, and some of the calculations involved, are as follows:
for each of the 6 pollutants (PM10, PM2.5, SO2, NO2, O3, CO) and the AQI, grouped by case:
diff1_acf10: sum of squares of the first 10 autocorrelation coefficients of the first-differenced series;
diff1_acf1: first autocorrelation coefficient of the first-differenced series;
x_acf1: first autocorrelation coefficient;
x_pacf5: sum of squares of the first five partial autocorrelation coefficients;
diff2x_pacf5: sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series;
std1st_der: standard deviation of the first derivative;
the average, sum, maximum, quartiles, coefficient of variation, mean, standard deviation, median, variance, skewness, and kurtosis of each of the 6 pollutants and the AQI, grouped by case; the hour at which the AQI reaches its maximum; the correlation coefficients between the six pollutants and the AQI; and the primary pollutant.
Pollution sources around the station: the number of pollution sources of each type around each station is obtained from the pollution source information around the station and the emission inventory information, and is used as a feature value;
road network density around the site: to account for the influence of motor vehicle emissions on pollutant data, the road network density around the site is obtained from the surrounding road network using geographic information system technology, and is used as a feature value;
time series distance features: the similarity of the time series between pollutants, measured by dynamic time warping (DTW) distance.
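The autocorrelation-based features and the DTW distance listed above can be sketched in plain Python. This is an illustration under stated assumptions: the first derivative is approximated by the first difference, the DTW step cost is the absolute difference, and the hourly PM10 and CO values are made up; the patent does not specify these details.

```python
from statistics import stdev


def diff(x):
    """First difference of a series."""
    return [b - a for a, b in zip(x, x[1:])]


def acf(x, max_lag):
    """Sample autocorrelation coefficients r_1..r_max_lag (series must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


def ts_features(x):
    """A few of the per-pollutant time series features for one case."""
    d1 = diff(x)
    rd = acf(d1, 10)
    return {
        "x_acf1": acf(x, 1)[0],
        "diff1_acf1": rd[0],
        "diff1_acf10": sum(r * r for r in rd),
        "std1st_der": stdev(d1),  # first difference used as the derivative (assumption)
    }


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference step cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# Hypothetical hourly values for one pollution case.
pm10 = [35, 38, 40, 90, 150, 140, 95, 60, 45, 40, 38, 36, 35, 34]
co = [0.4, 0.4, 0.5, 0.9, 1.6, 1.5, 1.0, 0.7, 0.5, 0.5, 0.4, 0.4, 0.4, 0.4]
features = ts_features(pm10)
distance = dtw_distance(pm10, co)
```

In practice the two series would likely be put on a common scale before the DTW distance is taken, since raw PM10 and CO values differ by orders of magnitude.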
A number of feature variables are then screened from all candidate variables according to their importance in the random forest model, and the following 38 data features, based on the pollution data, geographic information, emission inventory information, and information obtained by expert judgment, are finally selected as the basis for pollution type classification.
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location category of the site of the pollution case as judged by experts, such as main road, sensitive point, town, construction site, or environmental background point;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of the AQI;
Feature 9: aqi_cv; coefficient of variation of the AQI;
Feature 10: hour.data; hour at which the AQI reaches its maximum;
Feature 11: distance_dtw; similarity of the time series between pollutants, measured by DTW distance;
Feature 12: aqi_sum; sum of the AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum value of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; the sum of SO2 plus the sum of CO;
Feature 19: so2_max; maximum value of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; the median of SO2;
Feature 28: co_mean; the median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of the AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the station, obtained from the pollution source information around the station;
Feature 38: polarization_type; number of pollution sources of each type around the station, obtained from the emission inventory.
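Several of the statistical features in this list (sum, maximum, coefficient of variation, skewness, kurtosis, quartiles, and the hour of the AQI maximum) can be computed with the Python standard library. The sketch below assumes population (biased) moments and non-excess kurtosis, which the patent does not specify, and the helper names are hypothetical.

```python
from statistics import mean, pstdev, quantiles


def basic_stats(values):
    """Summary statistics of one pollutant (or the AQI) over one case."""
    n = len(values)
    m = mean(values)
    s = pstdev(values)  # population standard deviation (assumption)
    return {
        "sum": sum(values),
        "max": max(values),
        "mean": m,
        "cv": s / m,  # coefficient of variation
        "skewness": sum((v - m) ** 3 for v in values) / (n * s ** 3),
        "kurtosis": sum((v - m) ** 4 for v in values) / (n * s ** 4),
        "quartiles": quantiles(values, n=4),
    }


def hour_of_max(hours, aqi):
    """Hour of day at which the AQI reaches its maximum within the case."""
    return hours[aqi.index(max(aqi))]


# Hypothetical hourly CO values and AQI for one case.
co_stats = basic_stats([0.4, 0.5, 0.9, 1.6, 1.0, 0.6])
peak_hour = hour_of_max([12, 13, 14, 15, 16, 17], [60, 75, 110, 160, 120, 80])
case_features = {
    "co_sum": co_stats["sum"],
    "co_max": co_stats["max"],
    "co_cv": co_stats["cv"],
    "hour.data": peak_hour,
}
```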
2. Establish a typical pollution case library based on expert judgment.
The pollution type of each high-spatiotemporal-resolution grid can be determined by expert judgment from the pollution data and other information. In this embodiment, 9 pollution types are distinguished: fugitive dust; motor vehicles; heavy vehicles, machinery, and ships; cooking fumes; coal burning; unorganized incineration; enterprises; fireworks and firecrackers; and processes involving VOCs.
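For a multi-label model, the expert-assigned types of each case are typically encoded as a 0/1 indicator vector with one position per type. The sketch below shows one such encoding. The letter codes follow the order of the list above; only "a" (dust) and "g" (enterprise) appear as codes in this document's examples, so the remaining letters are an assumption.

```python
# Letter codes in list order; "a" and "g" are attested in the document's examples,
# the other letters are assumed from the ordering.
POLLUTION_TYPES = {
    "a": "dust",
    "b": "motor vehicles",
    "c": "heavy vehicles, machinery, ships",
    "d": "cooking fumes",
    "e": "coal burning",
    "f": "unorganized incineration",
    "g": "enterprise",
    "h": "fireworks and firecrackers",
    "i": "processes involving VOCs",
}


def encode(codes):
    """Turn a set of type codes into a 9-element 0/1 indicator vector."""
    return [1 if code in codes else 0 for code in POLLUTION_TYPES]


def decode(vector):
    """Turn an indicator vector back into the set of type codes."""
    return {code for code, bit in zip(POLLUTION_TYPES, vector) if bit}


# A composite case judged as both dust and enterprise pollution.
target = encode({"a", "g"})
```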
3. Develop, based on machine learning algorithms, a pollution source type identification model targeted at the data features of pollution source emissions.
The 38 features selected by the invention are calculated from the pollution data and other information, and the feature data of each grid are labeled with the pollution type judged by experts to serve as training data for the model. The invention uses machine learning algorithms, specifically random forest, neural network, support vector machine, gradient boosting machine, and the like, and adopts a combined model that includes sub-models based on curve shape (time series shape) and on automatic feature extraction by deep neural networks to train the data model, so that the proposed dimensions and feature classes can be better understood and the accuracy of pollution type classification improved.
4. Use the algorithm model to flag abnormal data in the real-time monitoring data, identify the pollution source type, and raise an alarm automatically.
The training set of the algorithm model also contains time series data from pollution-free periods. After the pollution data of a grid are obtained, the 38 proposed features are calculated and input into the trained model, which outputs the classification result for the grid's pollution type. In a preliminary test, two standard air stations and three micro-stations (the southeast corner of a steel enterprise, a sewage treatment plant, and the northwest loop of the city) in a certain city were selected at random, and data after September 1, 2019 were used. A segmentation function divided the data into segments, the pollution segments were screened using concentration conditions for the different pollutants, and each segment was predicted with the model built from the cases to obtain its pollution type. The information on the resulting site pollution segments was then sent back to the experts for secondary judgment.
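The segmentation and screening described above can be pictured with a simple threshold-based sketch. The patent does not disclose its segmentation function or concentration conditions, so the approach here (runs of consecutive hours above a baseline, kept only if their peak reaches a limit) and the numeric thresholds are assumptions for illustration.

```python
def segment_runs(values, baseline):
    """Split a series into runs of consecutive points above a baseline.

    Returns (start_index, end_index) pairs, both inclusive.
    """
    runs, start = [], None
    for i, v in enumerate(values):
        if v > baseline and start is None:
            start = i
        elif v <= baseline and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(values) - 1))
    return runs


def screen_polluted(runs, values, peak_limit):
    """Keep only the runs whose peak concentration reaches peak_limit."""
    return [(s, e) for s, e in runs if max(values[s:e + 1]) >= peak_limit]


# Hypothetical hourly PM10 values; baseline and peak limit are made-up thresholds.
pm10 = [30, 35, 80, 120, 90, 40, 38, 60, 42]
runs = segment_runs(pm10, baseline=50)
pollution_segments = screen_polluted(runs, pm10, peak_limit=100)
```

Each retained segment would then have its features computed and be passed to the trained model for type prediction.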
5. The expert reviews the model's identification result against the alarm information and determines whether it is accurate; if the identification is correct, the pollution source is dealt with and the event is added to the case library to further optimize the algorithm model; if the identification is wrong, the pollution event flag is removed. For example, the model identified the pollution type at the site Tangshan ceramics company for 2019/9/7/14-2019/9/8 as a (dust); on secondary judgment the expert group also assigned type a (dust). Because the model's result matches the experts' judgment, the case can be added to the case library as a supplementary case and the pollution source dealt with. In another example, the model identified the pollution type at the station suburb sewage treatment plant (842, 2019/9/212, time period 00) as g (enterprise), but on secondary judgment the expert group found no obvious pollution source. Because the experts' review of the alarm information disagrees with the model's identification result, the pollution event flag is removed.
Those skilled in the art will appreciate that the identification accuracy of the model of the invention increases as pollution events are added to the case library.
Although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the invention.

Claims (5)

1. A machine-learning-based method for automatically identifying pollution source types, characterized by comprising the following steps:
step one, identifying the occurrence of pollution problems and judging the pollution source type through analysis and judgment based on environmental monitoring data, time, and geographic information, and establishing a typical pollution case library;
step two, based on a machine learning method, extracting data features using the mass data of the case library as samples, and developing a pollution source type identification algorithm model from the extracted data features;
step three, monitoring the real-time environmental monitoring data with the algorithm model, flagging abnormal data as a pollution event when found, further identifying the type of source causing the pollution, thereby identifying pollution source emissions online and raising an alarm automatically;
step four, reviewing the model's identification result, or checking it on site, according to the alarm information; if the pollution problem does exist, dealing with it and adding the event to the typical case library for continuous optimization of the algorithm model; if the identification result is inaccurate, removing the pollution event flag;
the environmental monitoring data include: PM10, PM2.5, SO2, NO2, CO, O3, temperature, and humidity, as well as time and geographic information;
the typical case library is a set of reviewed cases that describe pollution events on the basis of the environmental monitoring data, and each case includes the following data: the start time and end time of the pollution event, the name and coordinates of the affected point, the types of the affected parameters, the meteorological conditions at the time and place, and the pollution source type judged by an expert;
the affected parameters are parameters with high spatiotemporal resolution obtained for the monitored area through micro-stations, and the parameter types include at least the 6 pollutants PM10, PM2.5, SO2, NO2, CO, and O3, and VOC; the meteorological conditions include wind direction, wind speed, temperature, and humidity; the pollution source types include dust sources, mobile sources, coal-burning sources, cooking fume sources, and industrial sources; in step two, various features are extracted from the time series pollution data based on basic statistics of the data, and geographic information, emission inventory information, and information obtained by expert judgment are converted into corresponding feature variables;
the extracted features and their calculation are as follows:
for the 6 pollutants and the AQI, grouped by case: the sum of squares of the first 10 autocorrelation coefficients of the first-differenced series, the sum of squares of the first five partial autocorrelation coefficients, the sum of squares of the first 5 partial autocorrelation coefficients of the twice-differenced series, the standard deviation of the first derivative, the average, sum, maximum, quartiles, coefficient of variation, mean, standard deviation, median, variance, skewness, and kurtosis, the hour at which the AQI reaches its maximum, the correlation coefficients between the six pollutants and the AQI, and the primary pollutant;
pollution sources around the station: the number of pollution sources of each type around each station is obtained from the pollution source information around the station and the emission inventory information, and is used as a feature value;
road network density around the site: to account for the influence of motor vehicle emissions on pollutant data, the road network density around the site is obtained from the surrounding road network using geographic information system technology, and is used as a feature value;
time series distance features: the similarity of the time series between pollutants, measured by dynamic time warping (DTW) distance;
a number of feature variables are screened from all candidate variables according to their importance in the random forest model, and the following 38 data features, based on the pollution data, geographic information, emission inventory information, and information obtained by expert judgment, are finally selected as the basis for pollution type classification:
Feature 1: co_std1st_der; standard deviation of the first derivative of CO;
Feature 2: pm10_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM10 series;
Feature 3: pm2_5_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced PM2.5 series;
Feature 4: co_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced CO series;
Feature 5: polarization; location category of the site of the pollution case as judged by experts, including but not limited to main road, sensitive point, town, construction site, and environmental background point;
Feature 6: no2_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced NO2 series;
Feature 7: aqi_diff1_acf10; sum of squares of the first 10 autocorrelation coefficients of the first-differenced AQI series;
Feature 8: x_acf1_aqi; first autocorrelation coefficient of the AQI;
Feature 9: aqi_cv; coefficient of variation of the AQI;
Feature 10: hour.data; hour at which the AQI reaches its maximum;
Feature 11: distance_dtw; similarity of the time series between pollutants, measured by DTW distance;
Feature 12: aqi_sum; sum of the AQI;
Feature 13: pm10_std1st_der; standard deviation of the first derivative of PM10;
Feature 14: pm2_5_std1st_der; standard deviation of the first derivative of PM2.5;
Feature 15: so2_std1st_der; standard deviation of the first derivative of SO2;
Feature 16: co_max; maximum value of CO;
Feature 17: co_quantile; quartiles of CO;
Feature 18: so2_co_sum; the sum of SO2 plus the sum of CO;
Feature 19: so2_max; maximum value of SO2;
Feature 20: co_sum; sum of CO;
Feature 21: so2_sum; sum of SO2;
Feature 22: x_acf1_pm10; first autocorrelation coefficient of PM10;
Feature 23: x_acf1_co; first autocorrelation coefficient of CO;
Feature 24: x_acf1_pm2_5; first autocorrelation coefficient of PM2.5;
Feature 25: co_cv; coefficient of variation of CO;
Feature 26: so2_cv; coefficient of variation of SO2;
Feature 27: so2_mean; the median of SO2;
Feature 28: co_mean; the median of CO;
Feature 29: pm2_5_diff1_acf1; first autocorrelation coefficient of the first-differenced PM2.5 series;
Feature 30: so2_diff1_acf1; first autocorrelation coefficient of the first-differenced SO2 series;
Feature 31: co_diff1_acf1; first autocorrelation coefficient of the first-differenced CO series;
Feature 32: skewness_pm10; skewness of PM10;
Feature 33: skewness_aqi; skewness of the AQI;
Feature 34: pm2_5_kurtosis; kurtosis of PM2.5;
Feature 35: so2_kurtosis; kurtosis of SO2;
Feature 36: no2_kurtosis; kurtosis of NO2;
Feature 37: polarization_entities; number of pollution sources of each type around the station, obtained from the pollution source information around the station;
Feature 38: polarization_type; number of pollution sources of each type around the station, obtained from the emission inventory.
2. The machine-learning-based method for automatically identifying pollution source types according to claim 1, characterized in that: the pollution source type identification algorithm model is a multi-label classification model built from the existing cases and from cases added later, and can express composite pollution formed by several pollution types that may coexist at the same place in the same time period; the pollution source type identification algorithm model adopts a combination strategy; the combination strategy is binary relevance, classifier chains, or nested stacking; a cutoff parameter is set in each classifier according to the proportion of each pollution type in the training data, so as to address the imbalance of the training samples and improve the prediction accuracy for occasional pollution types.
3. The machine-learning-based method for automatically identifying pollution source types according to claim 2, characterized in that: the combination strategy is the binary relevance strategy, in which a binary classifier is built for each label, each being a simple problem, namely whether the sample belongs to that type; the model is decomposed into multiple binary classifiers that are then combined; at prediction time each label is predicted independently, without considering dependencies between labels, and the results are combined into the multi-label target; binary relevance has computational complexity linear in the number of labels and is therefore easily parallelized, that is, the binary classifiers for all labels are built simultaneously, which increases operation speed.
4. The machine-learning-based method for automatically identifying pollution source types according to claim 3, characterized in that the 38 selected features are calculated from the pollution data and other information, and the feature data of each grid are labeled with the judged pollution type to serve as training data for the model; the machine learning algorithms used include but are not limited to random forest, neural network, support vector machine, and gradient boosting machine, and a combined model that includes sub-models for curve shape and for automatic feature extraction by deep neural networks is adopted to train the data model, so that the proposed dimensions and feature classes can be better understood and the accuracy of pollution type classification improved.
5. The method according to claim 1, characterized in that the training set of the algorithm model contains time series data from pollution-free periods; after the pollution data of a grid are obtained, the 38 proposed features are calculated and input into the trained model, which outputs the classification result for the grid's pollution type; the data are divided into segments by a segmentation function, the pollution segments are screened using concentration conditions for the different pollutants, and each segment is predicted with the model built from the cases to obtain the pollution types of the different segments; the information on the resulting site pollution segments is then sent back to an expert for secondary judgment.
CN202010846058.3A 2020-08-21 2020-08-21 Automatic pollution source type identification method based on machine learning Active CN111985567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010846058.3A CN111985567B (en) 2020-08-21 2020-08-21 Automatic pollution source type identification method based on machine learning

Publications (2)

Publication Number Publication Date
CN111985567A CN111985567A (en) 2020-11-24
CN111985567B true CN111985567B (en) 2022-11-22

Family

ID=73443859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010846058.3A Active CN111985567B (en) 2020-08-21 2020-08-21 Automatic pollution source type identification method based on machine learning

Country Status (1)

Country Link
CN (1) CN111985567B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634113B (en) * 2020-12-22 2023-09-26 山西大学 Pollution waste gas correlation analysis method based on dynamic sliding window
CN112990024B (en) * 2021-03-18 2024-03-26 深圳博沃智慧科技有限公司 Urban dust monitoring method
CN113295635A (en) * 2021-05-27 2021-08-24 河北先河环保科技股份有限公司 Water pollution alarm method based on dynamic update data set
CN113688940A (en) * 2021-09-09 2021-11-23 浙江大学 Suspected pollution industrial enterprise identification method based on public data
CN113706127B (en) * 2021-10-22 2022-02-22 长视科技股份有限公司 Water area analysis report generation method and electronic equipment
CN114693003B (en) * 2022-05-23 2022-09-02 成都秦川物联网科技股份有限公司 Smart city air quality prediction method and system based on Internet of things
CN115018348B (en) * 2022-06-20 2023-01-17 北京北投生态环境有限公司 Environment analysis method, system, equipment and storage medium based on artificial intelligence
CN114943482B (en) * 2022-06-28 2024-06-21 成都秦川物联网科技股份有限公司 Smart city exhaust emission management method and system based on Internet of things
CN115358718A (en) * 2022-08-24 2022-11-18 广东旭诚科技有限公司 Noise pollution classification and real-time supervision method based on intelligent monitoring front end
CN115792919B (en) * 2023-01-19 2023-05-16 合肥中科光博量子科技有限公司 Method for identifying polluted hot spot area through horizontal scanning monitoring of aerosol laser radar
CN117057819B (en) * 2023-08-15 2024-06-28 泰华智慧产业集团股份有限公司 Rainwater pipe network sewage discharge traceability analysis method and system
CN116912069B (en) * 2023-09-13 2024-01-02 成都市智慧蓉城研究院有限公司 Data processing method applied to smart city and electronic equipment
CN117473398B (en) * 2023-12-26 2024-03-19 四川国蓝中天环境科技集团有限公司 Urban dust pollution source classification method based on slag transport vehicle activity
CN117633661B (en) * 2024-01-26 2024-04-02 西南交通大学 Slag car high-risk pollution source classification method based on evolution diagram self-supervised learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844626A (en) * 2017-01-20 2017-06-13 武汉大学 Using microblogging keyword and the method and system of positional information simulated air quality
CN107608009A (en) * 2017-09-15 2018-01-19 深圳市卡普瑞环境科技有限公司 A kind of air quality surveillance equipment, processing terminal and server
CN110186820A (en) * 2018-12-19 2019-08-30 河北中科遥感信息技术有限公司 Multisource data fusion and environomental pollution source and pollutant distribution analysis method
CN110870019A (en) * 2017-10-16 2020-03-06 因美纳有限公司 Semi-supervised learning for training deep convolutional neural network sets

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103234883B (en) * 2013-04-30 2016-04-13 中南大学 A kind of method based on road traffic flow real-time estimation inner city PM2.5 concentration
CN104899596B (en) * 2015-03-16 2018-09-14 景德镇陶瓷大学 A kind of multi-tag sorting technique and its device
CN108764013A (en) * 2018-03-28 2018-11-06 中国科学院软件研究所 A kind of automatic Communication Signals Recognition based on end-to-end convolutional neural networks
CN109740560B (en) * 2019-01-11 2023-04-18 山东浪潮科学研究院有限公司 Automatic human body cell protein identification method and system based on convolutional neural network
CN110006799A (en) * 2019-02-14 2019-07-12 北京市环境保护监测中心 A kind of classification method of hot spot grid pollution type
CN111121862A (en) * 2019-09-29 2020-05-08 广西中遥空间信息技术有限公司 Air-space-ground integrated atmospheric environment monitoring system and method
CN111461184A (en) * 2020-03-19 2020-07-28 南京理工大学 XGB multi-dimensional operation and maintenance data anomaly detection method based on multivariate feature matrix

Also Published As

Publication number Publication date
CN111985567A (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant