WO2020010701A1 - Pollutant anomaly monitoring method and system, computer device, and storage medium - Google Patents

Info

Publication number
WO2020010701A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
real
pollutant
index
forest model
Prior art date
Application number
PCT/CN2018/106682
Other languages
French (fr)
Chinese (zh)
Inventor
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020010701A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A pollutant anomaly monitoring method and system, a computer device, and a storage medium are disclosed, relating to the technical field of environmental pollution data processing. The monitoring method comprises: obtaining historical environmental monitoring data and storing them as data sets; setting index thresholds for the index items of each pollutant, screening the data sets according to the index thresholds to obtain data not exceeding the thresholds, setting those data as feature items, and storing the feature items in a non-exceeding data set; training an isolation forest model using the feature items; and obtaining real-time monitoring data, inputting them into the isolation forest model, determining from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarizing the abnormal points. Abnormal-point data can thus be monitored simply and quickly, and a pollution source exceeding the standard can be predicted and reported in advance, so that exceedances are prevented.

Description

Pollutant anomaly monitoring method, system, computer device, and storage medium
This application claims priority to Chinese patent application No. 201810757268.8, filed with the Chinese Patent Office on July 11, 2018 and entitled "Pollutant anomaly monitoring method, system, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of environmental pollution data processing, and in particular to a pollutant anomaly monitoring method, system, computer device, and storage medium.
Background
A pollution source is a source of pollutants that cause environmental pollution, usually a site, piece of equipment, device, or human body that discharges harmful substances into the environment or has a harmful effect on it. Any substance or energy that enters the environmental system at an inappropriate concentration, quantity, rate, form, or route and pollutes or damages the environment is referred to as a pollutant. In industrial production, the production equipment or production sites used in stages such as raw material production, processing, combustion, heating and cooling, and finishing of products may all become industrial pollution sources. In the prior art there are generally two ways of monitoring pollution source emissions. The first is supervisory monitoring, which periodically checks whether the content of harmful substances in the exhaust gas discharged by a pollution source complies with national regulations. The second is research monitoring, which monitors the types, quantities, and emission patterns of the pollutants discharged by pollution sources; this helps identify the main sources of air pollution, study trends in air pollution, formulate pollution control measures, and improve ambient air quality.
However, whichever monitoring approach is used, enterprises with excessive emissions are currently identified by setting a compliance threshold for each individual pollution source and comparing that source's monitoring data against it. The problem is that pollution sources are of many complex types, so setting and checking thresholds one by one is cumbersome, and a threshold alone cannot prevent exceedances from occurring.
Summary of the invention
In view of this, it is necessary to provide a pollutant anomaly monitoring method, system, computer device, and storage medium that address the problem that setting and checking compliance thresholds for individual pollution sources is cumbersome and cannot prevent exceedances.
A pollutant anomaly monitoring method comprises the following steps:
obtaining historical environmental monitoring data of the pollution source monitoring points of each enterprise from a preset environmental monitoring data system, and storing the historical environmental monitoring data with each pollutant of each enterprise as one data set;
setting an index threshold for each index item of each pollutant, screening each data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
training an isolation forest model using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set;
obtaining hourly real-time monitoring data of one pollutant of an enterprise from the environmental monitoring data system, inputting the real-time monitoring data into the isolation forest model corresponding to that pollutant, determining from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarizing the abnormal points.
A pollutant anomaly monitoring system comprises the following units:
a data acquisition unit, configured to obtain historical environmental monitoring data of the pollution source monitoring points of each enterprise from a preset environmental monitoring data system, and to store the historical environmental monitoring data with each pollutant of each enterprise as one data set;
a screening unit, configured to set an index threshold for each index item of each pollutant, screen each data set according to the index thresholds, set the data that do not exceed the index thresholds as feature items, and store the feature items in a non-exceeding data set;
a training unit, configured to train an isolation forest model using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set;
an abnormal-point summary unit, configured to obtain hourly real-time monitoring data of one pollutant of an enterprise from the environmental monitoring data system, input the real-time monitoring data into the isolation forest model corresponding to that pollutant, determine from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarize the abnormal points.
A computer device comprises a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the following steps:
obtaining historical environmental monitoring data of the pollution source monitoring points of each enterprise from a preset environmental monitoring data system, and storing the historical environmental monitoring data with each pollutant of each enterprise as one data set;
setting an index threshold for each index item of each pollutant, screening each data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
training an isolation forest model using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set;
obtaining hourly real-time monitoring data of one pollutant of an enterprise from the environmental monitoring data system, inputting the real-time monitoring data into the isolation forest model corresponding to that pollutant, determining from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarizing the abnormal points.
A storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining historical environmental monitoring data of the pollution source monitoring points of each enterprise from a preset environmental monitoring data system, and storing the historical environmental monitoring data with each pollutant of each enterprise as one data set;
setting an index threshold for each index item of each pollutant, screening each data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
training an isolation forest model using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set;
obtaining hourly real-time monitoring data of one pollutant of an enterprise from the environmental monitoring data system, inputting the real-time monitoring data into the isolation forest model corresponding to that pollutant, determining from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarizing the abnormal points.
The above pollutant anomaly monitoring method, apparatus, computer device, and storage medium comprise: obtaining historical environmental monitoring data of the pollution source monitoring points of each enterprise from a preset environmental monitoring data system, and storing the historical environmental monitoring data with each pollutant of each enterprise as one data set; setting an index threshold for each index item of each pollutant, screening each data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set; training an isolation forest model using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set; and obtaining hourly real-time monitoring data of one pollutant of an enterprise from the environmental monitoring data system, inputting the real-time monitoring data into the isolation forest model corresponding to that pollutant, determining from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model whether the data are abnormal points, and summarizing the abnormal points. By screening the historical data to select non-exceeding pollutant data as feature items and then using an isolation forest model to detect abnormal points in an enterprise's emissions, the present application monitors abnormal-point data simply and quickly, and can predict and report in advance that a pollution source is about to exceed the standard, so that exceedances are prevented.
Brief description of the drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present application.
FIG. 1 is a flowchart of a pollutant anomaly monitoring method in an embodiment of the present application;
FIG. 2 is a flowchart of step S3 in FIG. 1;
FIG. 3 is a structural diagram of a tree constructed in step S3;
FIG. 4 is a flowchart of step S4 in FIG. 1;
FIG. 5 is a structural diagram of a pollutant anomaly monitoring system in an embodiment of the present application;
FIG. 6 is a schematic block diagram of the abnormal-point summary unit in FIG. 5.
Detailed description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and are not intended to limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used here may also include the plural. It should further be understood that the word "comprising" used in the specification of the present application refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
FIG. 1 is a flowchart of a pollutant anomaly monitoring method in an embodiment of the present application. As shown in FIG. 1, the monitoring method comprises the following steps:
Step S1, obtaining data: historical environmental monitoring data of the pollution source monitoring points of each enterprise are obtained from a preset environmental monitoring data system, and the historical environmental monitoring data are stored with each pollutant of each enterprise as one data set.
The anomaly monitoring in this step is aimed mainly at pollutants discharged by enterprises. In this embodiment the enterprises are therefore key pollutant-discharging units covered by the Pollution Source Monitoring Center of the Ministry of Ecology and Environment, and the preset environmental monitoring data system is the automatic monitoring work scheduling platform of that center, operated by the government environmental protection department, or a third-party environmental monitoring data system. The environmental monitoring data system collects the historical environmental monitoring data and real-time monitoring data of all pollution source monitoring points of each key pollutant-discharging unit.
An enterprise's pollution source monitoring points are generally located at drainage outlets and exhaust outlets. The enterprise's pollutants therefore include drainage pollutants monitored at the drainage outlets and exhaust pollutants monitored at the exhaust outlets. When the historical environmental monitoring data are stored, the data sets are classified and stored as drainage pollutant data sets and exhaust pollutant data sets.
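The storage scheme of step S1 can be sketched as a simple grouping of monitoring records. This is only an illustration: the record fields (`enterprise`, `outlet`, `hour`, `cod`, `so2`) and all values are hypothetical and do not come from the application or from any real monitoring system.

```python
from collections import defaultdict

def group_into_datasets(records):
    """Group monitoring records into one data set per (enterprise, outlet type)."""
    datasets = defaultdict(list)
    for rec in records:
        datasets[(rec["enterprise"], rec["outlet"])].append(rec)
    return dict(datasets)

# Hypothetical records; field names and values are illustrative assumptions.
records = [
    {"enterprise": "E1", "outlet": "drainage", "hour": 0, "cod": 41.0},
    {"enterprise": "E1", "outlet": "exhaust", "hour": 0, "so2": 120.0},
    {"enterprise": "E1", "outlet": "drainage", "hour": 1, "cod": 43.5},
]
datasets = group_into_datasets(records)
```

Keying on the pair (enterprise, outlet type) mirrors the text's split into drainage and exhaust pollutant data sets per enterprise.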
Step S2, screening data: an index threshold is set for each index item of each pollutant, each data set is screened according to the index thresholds, the data that do not exceed the index thresholds are set as feature items, and the feature items are stored in a non-exceeding data set.
Before the data in a data set are screened, a corresponding index threshold is set for each index item of the pollutant. The index items of drainage pollutants include at least one of a suspended solids index, a chemical oxygen demand index, a pH value, or an ammonia nitrogen index, and each drainage pollutant index item has a corresponding index threshold. The index items of exhaust pollutants include at least one of a nitrogen oxides index, a sulfur dioxide index, a soot index, or a carbon monoxide index, and each exhaust pollutant index item has a corresponding index threshold. In this step, the pollutant index items and their corresponding index thresholds are as shown in Table 1 below:
[Table 1 is available only as an image in the source: Figure PCTCN2018106682-appb-000001]
Table 1
When thresholds are set for pollutants, the "Integrated Emission Standard of Air Pollutants" (GB 16297-1996) may be consulted for exhaust pollutants and the "Integrated Wastewater Discharge Standard" (GB 8978-1996) for drainage pollutants, so that the thresholds conform more closely to the national pollutant discharge standards and the non-exceeding data are screened more accurately.
In this step each data set is screened according to the preset index thresholds, and the data that do not exceed the thresholds are selected and stored together as feature items in a non-exceeding data set. When the feature items are stored, they are classified by pollutant and enterprise and stored in the corresponding non-exceeding data set. That is, the feature items screened from each enterprise's drainage pollutant data set are stored as a drainage non-exceeding data set, and the feature items screened from each enterprise's exhaust pollutant data set are stored as an exhaust non-exceeding data set, so that they can be used as samples for one pollutant of the enterprise when the isolation forest model is subsequently trained.
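The screening of step S2 amounts to keeping only records in which every index item is within its threshold. The sketch below assumes upper-limit thresholds only; the threshold values are arbitrary placeholders, not the values of Table 1 or of the GB standards, and the field names are hypothetical. A range-type index such as pH would also need a lower bound, which is omitted here.

```python
def filter_non_exceeding(dataset, thresholds):
    """Keep the records in which no thresholded index item exceeds its limit."""
    return [rec for rec in dataset
            if all(rec[item] <= limit for item, limit in thresholds.items())]

# Placeholder limits for illustration only (NOT taken from Table 1).
PLACEHOLDER_THRESHOLDS = {"cod": 50.0, "ammonia_nitrogen": 5.0}

drainage = [
    {"hour": 0, "cod": 40.0, "ammonia_nitrogen": 3.0},
    {"hour": 1, "cod": 65.0, "ammonia_nitrogen": 2.5},  # cod above the limit
    {"hour": 2, "cod": 48.0, "ammonia_nitrogen": 4.9},
]
feature_items = filter_non_exceeding(drainage, PLACEHOLDER_THRESHOLDS)
```

The surviving records play the role of the feature items stored in the non-exceeding data set.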
Step S3, training the model: an isolation forest model is trained using the feature items in the non-exceeding data set, a corresponding isolation forest model being built for each non-exceeding data set.
The isolation forest (Isolation Forest) model is a fast anomaly detection method with linear time complexity and high accuracy, and is an algorithm that meets the requirements of big data processing. The isolation forest model is suited to anomaly detection on continuous numerical data. It defines an anomaly as an "outlier that is easily isolated", which can be understood as a point that is sparsely distributed and far from any high-density group. In statistical terms, a sparsely populated region of the data space is one in which data occur with very low probability, so data falling in such regions can be regarded as abnormal.
Based on this principle, the isolation forest model builds binary trees from the samples. The input is a training data set A, with e the current tree height and l the height limit of the tree. First, A is placed in the root node; a dimension q of A is selected at random, and a value p is selected at random between the maximum and minimum values on q. Samples in A whose value on q is larger than p go to the right child node, and samples whose value is smaller than p go to the left child node. These steps are repeated until every child node holds only one sample or several identical samples (that is, every sample has been isolated) or the height of the tree reaches l. When trees are built in this way, abnormal points are isolated more easily, so the path length to the leaf node in which an abnormal point is isolated is shorter; that is, fewer edges are traversed from the root node to the leaf node containing the abnormal point. Normal points are not easily isolated, so their path lengths are longer.
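The tree construction just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the application's implementation: samples are tuples of numbers, the names `build_itree` and `path_length` are hypothetical, samples equal to p are sent to the right child, and the random dimension is drawn only from dimensions that still show some spread.

```python
import random

def build_itree(samples, height=0, height_limit=8, rng=random):
    """Build one isolation tree from a list of equal-length numeric tuples."""
    # Stop when the node is isolated (one sample, or only identical samples)
    # or when the height limit l is reached.
    if len(samples) <= 1 or height >= height_limit or len(set(samples)) == 1:
        return {"size": len(samples)}
    # Randomly choose a dimension q among those whose min and max differ.
    dims = [d for d in range(len(samples[0]))
            if min(s[d] for s in samples) < max(s[d] for s in samples)]
    q = rng.choice(dims)
    lo = min(s[q] for s in samples)
    hi = max(s[q] for s in samples)
    p = rng.uniform(lo, hi)  # random split value between min and max on q
    left = [s for s in samples if s[q] < p]
    right = [s for s in samples if s[q] >= p]
    return {"q": q, "p": p,
            "left": build_itree(left, height + 1, height_limit, rng),
            "right": build_itree(right, height + 1, height_limit, rng)}

def path_length(tree, x):
    """Count the edges from the root to the leaf that x falls into."""
    edges = 0
    while "q" in tree:
        tree = tree["left"] if x[tree["q"]] < tree["p"] else tree["right"]
        edges += 1
    return edges

def mean_path(trees, x):
    """Average path length of x over a list of trees."""
    return sum(path_length(t, x) for t in trees) / len(trees)

# Illustrative data: eight clustered values and one far-away value.
rng = random.Random(0)
data = [(1.0 + 0.1 * i,) for i in range(8)] + [(100.0,)]
trees = [build_itree(data, rng=rng) for _ in range(50)]
```

Averaged over the trees, the far-away value reaches a leaf after fewer edges than a value inside the cluster, which is the property the text relies on.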
Step S4, summarizing abnormal points: hourly real-time monitoring data of one pollutant of an enterprise are obtained from the environmental monitoring data system and input into the isolation forest model corresponding to that pollutant; whether the data are abnormal points is determined from the path length of the real-time monitoring data from the root node to a leaf node of the isolation forest model, and the abnormal points are summarized.
This step relies on the property that abnormal points are isolated more easily in an isolation forest model, so that fewer edges are traversed from the root node to the leaf node in which an abnormal point is isolated. The real-time monitoring data are input into the isolation forest model, the data with fewer edges (that is, shorter path lengths) are set as abnormal points, and these abnormal points are summarized, thereby monitoring abnormal points of the pollution source.
In this embodiment, a reasonably accurate isolation forest model is trained on historical environmental monitoring data obtained from the preset environmental monitoring data system, and this model is used to monitor the real-time monitoring data and to detect and summarize abnormal points. The whole monitoring process is simple and fast, and can detect and summarize exceedances of an enterprise's various pollution sources with reasonable accuracy, so that exceedances are prevented.
In one embodiment, when the isolation forest model is trained in step S3 using the feature items in the non-exceeding data set, the following method is used, as shown in FIG. 2:
Step S301, taking point features: the feature item of every N hours in the non-exceeding data set is set as one point feature. Because the non-exceeding data set may contain a large amount of data, feature items are taken at hourly granularity for training in order to reduce the amount of data used; for example, the feature item of every hour or of every 2 hours may be set as a point feature. These point features are placed in the root node of a tree of the isolation forest model.
Step S302, setting a difference threshold: a difference threshold X is set for the difference between each point feature and the preceding point feature. The difference threshold may be generated randomly and serves as the cut point of the current node.
Step S303, constructing the left and right child nodes of the tree: point features whose difference from the preceding point feature is less than X are assigned to the left child node of the tree, and those whose difference is greater than or equal to X are assigned to the right child node.
Step S304, constructing the tree recursively: steps S302 and S303 are applied recursively to keep constructing left and right child nodes until one of the following conditions is met: the non-exceeding data set being trained on contains only one record or several identical records, or the height of the tree reaches the preset height range. The height range of the tree is as follows: for a non-exceeding data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height is n-1.
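Steps S301 to S304 can be sketched as follows. This is one possible reading, offered with several assumptions that the text does not settle: the difference is taken as the signed value "current minus previous", the first point feature (which has no predecessor) contributes no difference, X is drawn uniformly between the smallest and largest difference in the node, and the height cap is the n-1 upper bound from step S304. All function names are hypothetical.

```python
import random

def point_features(hourly_values, n_hours):
    """Step S301: keep one feature item every n_hours hours."""
    return hourly_values[::n_hours]

def differences(points):
    """Difference between each point feature and the one before it (assumed signed)."""
    return [cur - prev for prev, cur in zip(points, points[1:])]

def build_diff_tree(diffs, height=0, max_height=None, rng=random):
    """Steps S302-S304: recursively split differences at a random threshold X."""
    if max_height is None:
        max_height = max(len(diffs) - 1, 0)  # upper bound n - 1 from step S304
    # Stop on a single record, several identical records, or the height cap.
    if len(diffs) <= 1 or len(set(diffs)) == 1 or height >= max_height:
        return {"records": list(diffs)}
    lo, hi = min(diffs), max(diffs)
    # X is drawn from (lo, hi], so both children are normally non-empty.
    x = hi - (hi - lo) * rng.random()
    left = [d for d in diffs if d < x]    # difference less than X
    right = [d for d in diffs if d >= x]  # difference greater than or equal to X
    return {"x": x,
            "left": build_diff_tree(left, height + 1, max_height, rng),
            "right": build_diff_tree(right, height + 1, max_height, rng)}

def leaves(tree):
    """Collect the record lists held by the leaf nodes, left to right."""
    if "records" in tree:
        return [tree["records"]]
    return leaves(tree["left"]) + leaves(tree["right"])

# Illustrative hourly values sampled every 2 hours, then differenced.
points = point_features([1, 2, 3, 4, 5, 6], 2)
diffs = differences(points)
tree = build_diff_tree([0.1, 0.2, 0.1, 5.0], rng=random.Random(1))
```

Because the splits partition the records, every difference ends up in exactly one leaf, and identical differences stay together in the same leaf.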
具体的,如图3所示,比如a、b、c、d四个点构造特征,放入孤立森林模型,遍历一颗树时,先将a、b、c、d四个点构造特征放入孤立森林模型的树的根节点。a、b、c三个点构造特征和前一个点构造特征之间的差分小于X,被分到树的左子节点,d点构造特征和前一个点构造特征之间的差分大于X,被分到树的右子节点。递归上述步骤,不断构造左子节点和右子节点,如图3所示,经过三次递归后,得到每个叶子节点均只有一条记录。可以看到,d点构造特征最早就被孤立,因此d点构造特征最有可能是异常点。Specifically, as shown in FIG. 3, for example, four points a, b, c, and d are constructed into an isolated forest model. When traversing a tree, the four points a, b, c, and d are constructed first. The root node of the tree into the isolated forest model. The difference between the three point structure features a, b, and c and the previous point structure feature is less than X, and is divided into the left child node of the tree. The difference between the point d structure feature and the previous point structure feature is greater than X, and To the right child node of the tree. The above steps are recursively, and the left child node and the right child node are continuously constructed. As shown in FIG. 3, after three recursions, each leaf node has only one record. It can be seen that the structural feature at point d was isolated at the earliest, so the structural feature at point d is most likely an anomaly.
In this embodiment, when the isolated forest model is trained on the feature items in the non-exceeding data set, the data in the non-exceeding data set are first screened, so that the amount of data collected is reduced while the trained isolated forest model remains as accurate as possible. Setting the cut point in the form of a difference threshold effectively reflects the change between two adjacent point construction features taken every N hours, so that the resulting isolated forest model is more reliable when used for monitoring abnormal points.
In one embodiment, in step S4, inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant, and determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, comprises the following steps, as shown in FIG. 4:
Step S401, generating the path length: the real-time data in the real-time monitoring data are input one by one into the corresponding isolated forest model. When a real-time data item has been partitioned M times by the isolated forest model and is not partitioned further, the length of its path from the root node to a leaf node of the isolated forest model is M.
Specifically, as shown in FIG. 3, if a, b, c and d are four real-time data items in the real-time monitoring data, the path length of point construction feature a is 2, the path lengths of point construction features b and c are 3, and the path length of point construction feature d is 1.
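The path lengths quoted for FIG. 3 can be reproduced with a short Python sketch. The nested-tuple encoding and the name `path_lengths` are hypothetical. The tree shape is inferred from the path lengths stated above, not copied from the drawing.

```python
# Tree of FIG. 3 as nested tuples: (left, right) for an internal node, a string for a leaf.
# Shape inferred from the text: d is split off at the root, a next, then b and c.
FIG3_TREE = (("a", ("b", "c")), "d")

def path_lengths(node, depth=0):
    """Return {leaf name: number of edges from the root to that leaf}."""
    if isinstance(node, str):
        return {node: depth}
    left, right = node
    result = path_lengths(left, depth + 1)
    result.update(path_lengths(right, depth + 1))
    return result

print(path_lengths(FIG3_TREE))  # {'a': 2, 'b': 3, 'c': 3, 'd': 1}
```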
Step S402, normalization: the path length M of the real-time data item is normalized to obtain M'.
The normalization is performed as follows:
First, the average path length of the point x under evaluation over all trees is computed and denoted E(h(x)), where E() denotes the average and h(x) denotes the path length of x. Suppose the real-time monitoring data contain n points; if the n points are searched with a tree of the isolated forest model, the average path length of the n points is c(n) = 2H(n-1) - (2(n-1)/n), where H(k) = ln(k) + ξ and ξ = 0.5772156649 is Euler's constant. The normalized value s(x, n) is:
s(x, n) = 2^{-E(h(x)) / c(n)}    (Figure PCTCN2018106682-appb-000002)
where M = h(x) and M' = s(x, n);
s(x, n) takes values in the range [0, 1].
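The normalization of step S402 can be sketched in Python as follows. The sketch uses c(n) and H(k) exactly as defined above. It assumes that the normalized score takes the standard isolation forest form s(x, n) = 2^(-E(h(x))/c(n)), which is consistent with the stated range [0, 1]. The function names are hypothetical.

```python
import math

EULER_CONSTANT = 0.5772156649  # value of ξ given in the text

def harmonic(k):
    """H(k) = ln(k) + ξ, as defined in the text (requires k >= 1)."""
    return math.log(k) + EULER_CONSTANT

def average_path_length(n):
    """c(n) = 2H(n-1) - 2(n-1)/n for n >= 2 points."""
    if n < 2:
        raise ValueError("c(n) is only defined here for n >= 2")
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n

def anomaly_score(tree_path_lengths, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n)); tree_path_lengths holds h(x) for each tree."""
    mean_path = sum(tree_path_lengths) / len(tree_path_lengths)
    return 2.0 ** (-mean_path / average_path_length(n))
```

Under this form, a point whose average path length equals c(n) scores exactly 0.5, shorter paths push the score towards 1, and longer paths push it towards 0.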
Step S403, summarizing abnormal points: an abnormal point threshold Y is preset; when M' is greater than Y, the real-time data item is set as an abnormal point, and the abnormal points are summarized to generate an abnormal point summary table.
Since the normalized M' lies in the range [0, 1], the closer M' is to 1, the more likely the real-time data item is an abnormal point, and the closer M' is to 0, the more likely it is a normal point. If all the real-time data in the real-time monitoring data have M' close to 0.5, the real-time monitoring data as a whole contain no obvious abnormal point. The abnormal point threshold Y should therefore be greater than 0.5 and close to 1.
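The thresholding of step S403 can be sketched as a simple filter. The row layout of the summary table (`time`, `score`) and the function name are assumptions made for illustration only; the text does not specify the table format.

```python
def summarize_abnormal_points(scored_data, threshold_y):
    """Return one row per real-time data item whose normalized score M' exceeds Y."""
    if not 0.5 < threshold_y < 1.0:
        raise ValueError("Y is expected to lie between 0.5 and 1")
    return [
        {"time": time, "score": score}
        for time, score in scored_data
        if score > threshold_y  # strictly greater than Y, as in step S403
    ]

# Illustrative scores only.
rows = summarize_abnormal_points(
    [("01:00", 0.48), ("02:00", 0.91), ("03:00", 0.52)], threshold_y=0.8
)
```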
In this embodiment, normalization is introduced when determining abnormal points: the path length of a real-time data item is normalized into a dimensionless scalar, which facilitates summarizing the abnormal points and subsequently comparing the abnormal points of different pollutants.
In one embodiment, a pollutant abnormality monitoring system is provided which, as shown in FIG. 5, includes the following units:
a data acquisition unit, configured to acquire, from a preset environmental monitoring data system, historical environmental monitoring data of the pollution source monitoring points of each enterprise, and to store the historical environmental monitoring data such that each pollutant of each enterprise forms one data set;
a screening unit, configured to set an index threshold for each index item of each pollutant, to screen each data set according to the index thresholds, to set the data that do not exceed the index thresholds as feature items, and to store the feature items in a non-exceeding data set;
a training unit, configured to train an isolated forest model with the feature items in the non-exceeding data set, a corresponding isolated forest model being established for each non-exceeding data set;
an abnormal point summary unit, configured to acquire, from the environmental monitoring data system, hourly real-time monitoring data of one pollutant of one enterprise, to input the real-time monitoring data into the isolated forest model corresponding to the pollutant, to determine whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, and to summarize the abnormal points.
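The filtering performed by the screening unit can be sketched in Python as follows. The record layout, the index names `cod` and `ammonia_nitrogen`, and all numeric values are hypothetical and serve only to illustrate keeping the data that do not exceed the index thresholds.

```python
def screen_feature_items(records, index_thresholds):
    """Keep only the records in which no index item exceeds its index threshold."""
    return [
        record for record in records
        if all(record[item] <= limit for item, limit in index_thresholds.items())
    ]

# Hypothetical hourly drainage readings and index thresholds (illustrative values only).
records = [
    {"hour": 0, "cod": 42.0, "ammonia_nitrogen": 3.1},
    {"hour": 1, "cod": 120.0, "ammonia_nitrogen": 2.9},   # COD exceeds its threshold
    {"hour": 2, "cod": 55.0, "ammonia_nitrogen": 9.5},    # ammonia nitrogen exceeds
]
thresholds = {"cod": 100.0, "ammonia_nitrogen": 8.0}
non_exceeding = screen_feature_items(records, thresholds)
```

Only the records returned here would be stored as feature items in the non-exceeding data set and passed on to the training unit.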
In one embodiment, the pollutants handled by the data acquisition unit include drainage pollutants monitored at drainage outlets and exhaust pollutants monitored at exhaust outlets; when the historical environmental monitoring data are stored, a drainage pollutant data set and an exhaust pollutant data set are stored for each enterprise.
In one embodiment, the index items of the drainage pollutants include at least one of a suspended solids index, a chemical oxygen demand index, a pH value or an ammonia nitrogen index, and a corresponding index threshold is set for each index item of the drainage pollutants; the index items of the exhaust pollutants include at least one of a nitrogen oxide index, a sulfur dioxide index, a soot index or a carbon monoxide index, and a corresponding index threshold is set for each index item of the exhaust pollutants.
In one embodiment, the screening unit is further configured to classify the feature items by each pollutant of each enterprise and to store them in the corresponding non-exceeding data set.
In one embodiment, the training unit includes: a module for constructing the left and right child nodes of the tree, configured to take the feature items of every N hours in the non-exceeding data set as one point construction feature and to set a difference threshold X for the difference between each point construction feature and the preceding point construction feature, wherein a point construction feature whose difference from the adjacent point construction feature is less than X is assigned to the left child node of the tree, and one whose difference is greater than or equal to X is assigned to the right child node of the tree;
a recursion module, configured to recursively construct the left child node and the right child node until one of the following conditions is met: the non-exceeding data set being trained on contains only one record or several identical records, or the height of the tree reaches a preset height range, the height range of the tree being: for a non-exceeding data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height of the constructed tree is n-1.
In one embodiment, as shown in FIG. 6, the abnormal point summary unit includes:
a path length generation module, configured to input the real-time data in the real-time monitoring data one by one into the corresponding isolated forest model, wherein, when a real-time data item has been partitioned M times by the isolated forest model and is not partitioned further, the length of its path from the root node to a leaf node of the isolated forest model is M;
a normalization module, configured to normalize the path length M of the real-time data item to obtain M';
an abnormal point summary table generation module, configured to preset an abnormal point threshold Y and, when M' is greater than Y, to set the real-time data item as an abnormal point, summarize the abnormal points, and generate an abnormal point summary table.
In one embodiment, a computer device is provided, which includes a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the pollutant abnormality monitoring method of the foregoing embodiments.
In one embodiment, a storage medium storing computer-readable instructions is provided. When the computer-readable instructions are executed by one or more processors, the one or more processors perform the steps of the pollutant abnormality monitoring method of the foregoing embodiments. The storage medium may be a non-volatile storage medium.
A person of ordinary skill in the art will understand that all or some of the steps of the methods of the foregoing embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not every possible combination of the technical features of the above embodiments has been described; however, any combination of these technical features that involves no contradiction should be regarded as falling within the scope of this specification.
The embodiments described above represent only some exemplary embodiments of the present application. Although they are described specifically and in detail, they should not be construed as limiting the patent scope of the present application. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. The protection scope of this patent application is therefore defined by the appended claims.

Claims (20)

  1. A pollutant abnormality monitoring method, comprising:
    acquiring, from a preset environmental monitoring data system, historical environmental monitoring data of the pollution source monitoring points of each enterprise, and storing the historical environmental monitoring data such that each pollutant of each enterprise forms one data set;
    setting an index threshold for each index item of each said pollutant, screening each said data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
    training an isolated forest model with the feature items in the non-exceeding data set, a corresponding isolated forest model being established for each non-exceeding data set;
    acquiring, from the environmental monitoring data system, hourly real-time monitoring data of one pollutant of one enterprise, inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant, determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, and summarizing the abnormal points.
  2. The pollutant abnormality monitoring method according to claim 1, wherein the pollutants include drainage pollutants monitored at drainage outlets and exhaust pollutants monitored at exhaust outlets, and, when the historical environmental monitoring data are stored, a drainage pollutant data set and an exhaust pollutant data set are stored for each enterprise.
  3. The pollutant abnormality monitoring method according to claim 2, wherein the index items of the drainage pollutants include at least one of a suspended solids index, a chemical oxygen demand index, a pH value or an ammonia nitrogen index, and a corresponding index threshold is set for each index item of the drainage pollutants;
    the index items of the exhaust pollutants include at least one of a nitrogen oxide index, a sulfur dioxide index, a soot index or a carbon monoxide index, and a corresponding index threshold is set for each index item of the exhaust pollutants.
  4. The pollutant abnormality monitoring method according to claim 1, wherein storing the feature items in a non-exceeding data set comprises: classifying the feature items by each pollutant of each enterprise and storing them in the corresponding non-exceeding data set.
  5. The pollutant abnormality monitoring method according to claim 1, wherein training the isolated forest model with the feature items in the non-exceeding data set comprises:
    taking the feature items of every N hours in the non-exceeding data set as one point construction feature, and setting a difference threshold X for the difference between each point construction feature and the preceding point construction feature, wherein a point construction feature whose difference from the adjacent point construction feature is less than X is assigned to the left child node of the tree, and one whose difference is greater than or equal to X is assigned to the right child node of the tree;
    recursively constructing the left child node and the right child node until one of the following conditions is met:
    the non-exceeding data set being trained on contains only one record or several identical records, or the height of the tree reaches a preset height range, the height range of the tree being: for a non-exceeding data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height of the constructed tree is n-1.
  6. The pollutant abnormality monitoring method according to claim 1, wherein inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant and determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model comprises:
    inputting the real-time data in the real-time monitoring data one by one into the corresponding isolated forest model, wherein, when a real-time data item has been partitioned M times by the isolated forest model and is not partitioned further, the length of its path from the root node to a leaf node of the isolated forest model is M;
    normalizing the path length M of the real-time data item to obtain M';
    presetting an abnormal point threshold Y and, when M' is greater than Y, setting the real-time data item as an abnormal point, summarizing the abnormal points, and generating an abnormal point summary table.
  7. A pollutant abnormality monitoring system, comprising:
    a data acquisition unit, configured to acquire, from a preset environmental monitoring data system, historical environmental monitoring data of the pollution source monitoring points of each enterprise, and to store the historical environmental monitoring data such that each pollutant of each enterprise forms one data set;
    a screening unit, configured to set an index threshold for each index item of each said pollutant, to screen each said data set according to the index thresholds, to set the data that do not exceed the index thresholds as feature items, and to store the feature items in a non-exceeding data set;
    a training unit, configured to train an isolated forest model with the feature items in the non-exceeding data set, a corresponding isolated forest model being established for each non-exceeding data set;
    an abnormal point summary unit, configured to acquire, from the environmental monitoring data system, hourly real-time monitoring data of one pollutant of one enterprise, to input the real-time monitoring data into the isolated forest model corresponding to the pollutant, to determine whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, and to summarize the abnormal points.
  8. The pollutant abnormality monitoring system according to claim 7, wherein the pollutants handled by the data acquisition unit include drainage pollutants monitored at drainage outlets and exhaust pollutants monitored at exhaust outlets, and, when the historical environmental monitoring data are stored, a drainage pollutant data set and an exhaust pollutant data set are stored for each enterprise.
  9. The pollutant abnormality monitoring system according to claim 8, wherein the index items of the drainage pollutants include at least one of a suspended solids index, a chemical oxygen demand index, a pH value or an ammonia nitrogen index, and a corresponding index threshold is set for each index item of the drainage pollutants;
    the index items of the exhaust pollutants include at least one of a nitrogen oxide index, a sulfur dioxide index, a soot index or a carbon monoxide index, and a corresponding index threshold is set for each index item of the exhaust pollutants.
  10. The pollutant abnormality monitoring system according to claim 7, wherein the screening unit is further configured to classify the feature items by each pollutant of each enterprise and to store them in the corresponding non-exceeding data set.
  11. The pollutant abnormality monitoring system according to claim 7, wherein the training unit comprises:
    a module for constructing the left and right child nodes of the tree, configured to take the feature items of every N hours in the non-exceeding data set as one point construction feature and to set a difference threshold X for the difference between each point construction feature and the preceding point construction feature, wherein a point construction feature whose difference from the adjacent point construction feature is less than X is assigned to the left child node of the tree, and one whose difference is greater than or equal to X is assigned to the right child node of the tree;
    a recursion module, configured to recursively construct the left child node and the right child node until one of the following conditions is met: the non-exceeding data set being trained on contains only one record or several identical records, or the height of the tree reaches a preset height range, the height range of the tree being: for a non-exceeding data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height of the constructed tree is n-1.
  12. The pollutant abnormality monitoring system according to claim 7, wherein the abnormal point summary unit comprises:
    a path length generation module, configured to input the real-time data in the real-time monitoring data one by one into the corresponding isolated forest model, wherein, when a real-time data item has been partitioned M times by the isolated forest model and is not partitioned further, the length of its path from the root node to a leaf node of the isolated forest model is M;
    a normalization module, configured to normalize the path length M of the real-time data item to obtain M';
    an abnormal point summary table generation module, configured to preset an abnormal point threshold Y and, when M' is greater than Y, to set the real-time data item as an abnormal point, summarize the abnormal points, and generate an abnormal point summary table.
  13. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to perform the following steps:
    acquiring, from a preset environmental monitoring data system, historical environmental monitoring data of the pollution source monitoring points of each enterprise, and storing the historical environmental monitoring data such that each pollutant of each enterprise forms one data set;
    setting an index threshold for each index item of each said pollutant, screening each said data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
    training an isolated forest model with the feature items in the non-exceeding data set, a corresponding isolated forest model being established for each non-exceeding data set;
    acquiring, from the environmental monitoring data system, hourly real-time monitoring data of one pollutant of one enterprise, inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant, determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, and summarizing the abnormal points.
  14. The computer device according to claim 13, wherein, when storing the feature items in a non-exceeding data set, the processor is caused to perform the following step:
    classifying the feature items by each pollutant of each enterprise and storing them in the corresponding non-exceeding data set.
  15. The computer device according to claim 13, wherein, when training the isolated forest model with the feature items in the non-exceeding data set, the processor is caused to perform the following steps:
    taking the feature items of every N hours in the non-exceeding data set as one point construction feature, and setting a difference threshold X for the difference between each point construction feature and the preceding point construction feature, wherein a point construction feature whose difference from the adjacent point construction feature is less than X is assigned to the left child node of the tree, and one whose difference is greater than or equal to X is assigned to the right child node of the tree;
    recursively constructing the left child node and the right child node until one of the following conditions is met:
    the non-exceeding data set being trained on contains only one record or several identical records, or the height of the tree reaches a preset height range, the height range of the tree being: for a non-exceeding data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height of the constructed tree is n-1.
  16. The computer device according to claim 13, wherein, when inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant and determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, the processor is caused to perform the following steps:
    inputting the real-time data in the real-time monitoring data one by one into the corresponding isolated forest model, wherein, when a real-time data item has been partitioned M times by the isolated forest model and is not partitioned further, the length of its path from the root node to a leaf node of the isolated forest model is M;
    normalizing the path length M of the real-time data item to obtain M';
    presetting an abnormal point threshold Y and, when M' is greater than Y, setting the real-time data item as an abnormal point, summarizing the abnormal points, and generating an abnormal point summary table.
  17. A storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring, from a preset environmental monitoring data system, historical environmental monitoring data of the pollution source monitoring points of each enterprise, and storing the historical environmental monitoring data such that each pollutant of each enterprise forms one data set;
    setting an index threshold for each index item of each said pollutant, screening each said data set according to the index thresholds, setting the data that do not exceed the index thresholds as feature items, and storing the feature items in a non-exceeding data set;
    training an isolated forest model with the feature items in the non-exceeding data set, a corresponding isolated forest model being established for each non-exceeding data set;
    acquiring, from the environmental monitoring data system, hourly real-time monitoring data of one pollutant of one enterprise, inputting the real-time monitoring data into the isolated forest model corresponding to the pollutant, determining whether a data item is an abnormal point from the length of the path that the real-time monitoring data take from the root node to a leaf node of the isolated forest model, and summarizing the abnormal points.
  18. The storage medium according to claim 17, wherein, when storing the feature items in a non-exceeding data set, the one or more processors are caused to perform the following step:
    classifying the feature items by each pollutant of each enterprise and storing them in the corresponding non-exceeding data set.
  19. The storage medium according to claim 17, wherein, when training the isolation forest model on the feature items in the non-exceedance data set, one or more of the processors are caused to perform the following steps:
    taking the feature items of every N hours in the non-exceedance data set as one point construction feature, and setting a difference threshold X for the difference between each point construction feature and the preceding point construction feature, such that adjacent point construction features whose difference is less than X are assigned to the left child node of the tree, and those whose difference is greater than or equal to X are assigned to the right child node of the tree;
    recursively constructing the left child node and the right child node until either of the following conditions is met:
    the non-exceedance data set being trained on contains only one record or several identical records, or the height of the tree reaches a preset height range, the height range of the tree being: for a non-exceedance data set containing n records, the minimum height of the constructed tree is log(n) and the maximum height of the constructed tree is n-1.
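The recursive construction in claim 19 can be sketched roughly as follows. The claim does not say how an N-hour window becomes a single point construction feature or how the threshold X is chosen, so this sketch assumes the window mean and, as in the standard isolation forest algorithm, a threshold drawn uniformly between the smallest and largest difference at each node. The height limit of ceil(log2(n)) is likewise an assumed preset within the claimed range, and all function names are hypothetical.

```python
import math
import random


def point_features(hourly_values, n_hours):
    """Collapse each full window of n_hours readings into one value.
    Using the window mean is an assumption; the claim does not specify it."""
    return [
        sum(hourly_values[i:i + n_hours]) / n_hours
        for i in range(0, len(hourly_values) - n_hours + 1, n_hours)
    ]


def adjacent_differences(features):
    """Difference between each point feature and the preceding one."""
    return [b - a for a, b in zip(features, features[1:])]


def build_tree(diffs, height=0, max_height=None, rng=random):
    """Recursively split the differences on a threshold X.
    Stops on a single record, identical records, or the height limit."""
    if max_height is None:
        # Assumed preset: ceil(log2(n)), the lower end of the claimed range.
        max_height = max(1, math.ceil(math.log2(max(len(diffs), 2))))
    if len(diffs) <= 1 or min(diffs) == max(diffs) or height >= max_height:
        return {"size": len(diffs)}  # leaf node
    x = rng.uniform(min(diffs), max(diffs))  # assumed random choice of X
    left = [d for d in diffs if d < x]       # difference <  X goes left
    right = [d for d in diffs if d >= x]     # difference >= X goes right
    return {
        "threshold": x,
        "left": build_tree(left, height + 1, max_height, rng),
        "right": build_tree(right, height + 1, max_height, rng),
    }


def tree_height(node):
    """Number of splits on the longest root-to-leaf path."""
    if "size" in node:
        return 0
    return 1 + max(tree_height(node["left"]), tree_height(node["right"]))


def total_records(node):
    """Sum of the record counts stored in all leaves."""
    if "size" in node:
        return node["size"]
    return total_records(node["left"]) + total_records(node["right"])


hourly = [10.0, 10.2, 10.1, 10.3, 14.0, 14.2, 10.1, 10.0, 10.2, 10.4]
diffs = adjacent_differences(point_features(hourly, n_hours=2))
tree = build_tree(diffs, rng=random.Random(0))
print(tree_height(tree), total_records(tree))
```

A forest would repeat `build_tree` on several random subsamples of the differences; a single tree is shown here to keep the sketch short.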
  20. The storage medium according to claim 17, wherein, when inputting the real-time monitoring data into the isolation forest model corresponding to the pollutant and determining whether a data item is an anomaly from the length of the path it travels from the root node to a leaf node of the isolation forest model, one or more of the processors are caused to perform the following steps:
    inputting the real-time data items of the real-time monitoring data one by one into the corresponding isolation forest model, wherein, when a real-time data item has been partitioned M times by the isolation forest model and is not partitioned further, the path length of that item from the root node to the leaf node of the isolation forest model is M;
    normalizing the path length M of the real-time data item to obtain M';
    presetting an anomaly threshold Y, and, when M' is greater than Y, marking the real-time data item as an anomaly, then summarizing the anomalies to generate an anomaly summary table.
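The scoring steps in claim 20 can be illustrated with a small sketch. The claim does not state the normalization formula, so this example assumes the score used in the standard isolation forest algorithm, M' = 2^(-M / c(n)), where c(n) is the average path length of an unsuccessful binary search tree lookup over n training records; under that assumption a short path gives a score near 1, which matches flagging items with M' greater than Y. The two hand-built trees, the training size of 5, the threshold Y = 0.7, and all names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649


def path_length(node, value):
    """Count how many times the value is partitioned before reaching a leaf (M)."""
    m = 0
    while "threshold" in node:
        node = node["left"] if value < node["threshold"] else node["right"]
        m += 1
    return m


def average_path_length(n):
    """c(n) from the standard isolation forest algorithm (assumed normalizer)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def anomaly_score(forest, value, n_train):
    """Normalized score M' = 2 ** (-mean(M) / c(n)); higher means more anomalous."""
    c = average_path_length(n_train)
    if c == 0.0:
        raise ValueError("need at least two training records")
    mean_m = sum(path_length(tree, value) for tree in forest) / len(forest)
    return 2.0 ** (-mean_m / c)


def summarize_anomalies(forest, readings, n_train, threshold_y):
    """Build the anomaly summary table: one row per reading with M' > Y."""
    table = []
    for hour, value in enumerate(readings):
        score = anomaly_score(forest, value, n_train)
        if score > threshold_y:
            table.append({"hour": hour, "value": value, "score": round(score, 3)})
    return table


# Two hand-built trees standing in for a trained model (illustrative only).
forest = [
    {"threshold": 5.0,
     "left": {"threshold": 1.0, "left": {"size": 1}, "right": {"size": 3}},
     "right": {"size": 1}},
    {"threshold": 4.0, "left": {"size": 4}, "right": {"size": 1}},
]

summary = summarize_anomalies(forest, [2.0, 0.5, 9.0], n_train=5, threshold_y=0.7)
print(summary)
```

In this toy data the reading 9.0 is isolated after a single split in both trees, so it is the only item whose score exceeds the assumed threshold.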
PCT/CN2018/106682 2018-07-11 2018-09-20 Pollutant anomaly monitoring method and system, computer device, and storage medium WO2020010701A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810757268.8A CN108921440B (en) 2018-07-11 2018-07-11 Pollutant abnormity monitoring method, system, computer equipment and storage medium
CN201810757268.8 2018-07-11

Publications (1)

Publication Number Publication Date
WO2020010701A1 (en) 2020-01-16

Family

ID=64412682

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106682 WO2020010701A1 (en) 2018-07-11 2018-09-20 Pollutant anomaly monitoring method and system, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN108921440B (en)
WO (1) WO2020010701A1 (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428886A (en) * 2020-04-10 2020-07-17 青岛聚好联科技有限公司 Fault diagnosis deep learning model self-adaptive updating method and device
CN112016050A (en) * 2020-08-07 2020-12-01 汉威科技集团股份有限公司 Manifold learning-based CEMS system abnormal data monitoring method
CN112084382A (en) * 2020-09-04 2020-12-15 安徽思环科技有限公司 Pretreatment method for three-dimensional fluorescence data of water quality of industrial park pollution source
CN112085081A (en) * 2020-09-02 2020-12-15 董萍 Sewage component detection method and system
CN112505278A (en) * 2020-11-30 2021-03-16 深圳市联正通达科技有限公司 Sampling formula sewage control analytical equipment
CN112597144A (en) * 2020-12-29 2021-04-02 农业农村部环境保护科研监测所 Automatic cleaning method for production area environment monitoring data
CN112733897A (en) * 2020-12-30 2021-04-30 胜斗士(上海)科技技术发展有限公司 Method and equipment for determining abnormal reason of multi-dimensional sample data
CN112860671A (en) * 2021-01-19 2021-05-28 中国石油天然气集团有限公司 Production factor data abnormity diagnosis method and device
CN113420652A (en) * 2021-06-22 2021-09-21 中冶赛迪重庆信息技术有限公司 Method, system, medium and terminal for recognizing abnormity of time sequence signal fragment
CN113777223A (en) * 2021-08-12 2021-12-10 北京金水永利科技有限公司 Atmospheric pollutant tracing method and system
CN113792988A (en) * 2021-08-24 2021-12-14 河北先河环保科技股份有限公司 Online monitoring data anomaly identification method for enterprise
CN114062038A (en) * 2020-07-31 2022-02-18 力合科技(湖南)股份有限公司 Pollution tracing management and control method
CN114236068A (en) * 2021-11-24 2022-03-25 中冶赛迪重庆信息技术有限公司 Chloride ion concentration analysis method and system based on circulating water system
CN114527206A (en) * 2022-01-25 2022-05-24 长安大学 Method and system for tracing groundwater pollution by sulfonamides antibiotics
CN114611616A (en) * 2022-03-16 2022-06-10 吕少岚 Unmanned aerial vehicle intelligent fault detection method and system based on integrated isolated forest
CN114931854A (en) * 2022-05-31 2022-08-23 北京实力伟业环保科技有限公司 System and method for purifying waste gas by microorganisms
CN115171362A (en) * 2022-09-07 2022-10-11 江西珉轩智能科技有限公司 Early warning method and system for prevention and control of key areas
CN116069892A (en) * 2023-03-27 2023-05-05 乳山市海洋经济发展中心 Environmental data processing method and system based on ocean engineering
CN116484153A (en) * 2023-06-20 2023-07-25 北京泰豪智能工程有限公司 Environment monitoring method based on satellite Internet of things
CN116500240A (en) * 2023-06-21 2023-07-28 江西索立德环保服务有限公司 Soil environment quality monitoring method, system and readable storage medium
CN116522270A (en) * 2023-07-04 2023-08-01 西安启迪能源技术有限公司 Data processing system for smart sponge city
CN116576553A (en) * 2023-07-11 2023-08-11 韦德电子有限公司 Data optimization acquisition method and system for air conditioner
CN116627953A (en) * 2023-05-24 2023-08-22 首都师范大学 Method for repairing loss of groundwater level monitoring data
CN116699072A (en) * 2023-06-08 2023-09-05 东莞市华复实业有限公司 Environment early warning method based on detection cruising
CN116718249A (en) * 2023-08-08 2023-09-08 山东元明晴技术有限公司 Hydraulic engineering liquid level detection system
CN116992390A (en) * 2023-09-26 2023-11-03 北京联创高科信息技术有限公司 Configuration and display method of abnormal data
CN116992244A (en) * 2023-09-26 2023-11-03 山东益来环保科技有限公司 Intelligent monitoring system of cems
CN117194920A (en) * 2023-09-06 2023-12-08 万仁企业管理技术(深圳)有限公司 Data system processing platform and processing method based on big data analysis
CN117455124A (en) * 2023-12-25 2024-01-26 杭州烛微智能科技有限责任公司 Environment-friendly equipment monitoring method, system, medium and electronic equipment for enterprises
CN117454096A (en) * 2023-12-25 2024-01-26 西安高商智能科技有限责任公司 Motor production quality detection method and system
CN117538491A (en) * 2024-01-09 2024-02-09 武汉怡特环保科技有限公司 Station room air quality intelligent monitoring method and system
CN113777223B (en) * 2021-08-12 2024-04-30 北京金水永利科技有限公司 Atmospheric pollutant tracing method and system

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109828825A (en) * 2019-01-07 2019-05-31 平安科技(深圳)有限公司 Abnormal deviation data examination method, device, computer equipment and storage medium
CN109902721A (en) * 2019-01-28 2019-06-18 平安科技(深圳)有限公司 Outlier detection model verification method, device, computer equipment and storage medium
CN109785595A (en) * 2019-02-26 2019-05-21 成都古河云科技有限公司 A kind of vehicle abnormality track real-time identification method based on machine learning
CN110243599B (en) * 2019-07-02 2020-05-05 西南交通大学 Method for monitoring temperature abnormal state of multi-dimensional outlier train motor train unit axle box bearing
CN110398375B (en) * 2019-07-16 2021-10-19 广州亚美信息科技有限公司 Method, device, equipment and medium for monitoring working state of vehicle cooling system
CN110469522A (en) * 2019-08-13 2019-11-19 浪潮通用软件有限公司 A kind of method for detecting abnormality and device of drainage system
CN111160647B (en) * 2019-12-30 2023-08-22 第四范式(北京)技术有限公司 Money laundering behavior prediction method and device
CN111275547B (en) * 2020-03-19 2023-07-18 重庆富民银行股份有限公司 Wind control system and method based on isolated forest
CN111675257B (en) * 2020-06-16 2022-04-12 浙江富春紫光环保股份有限公司 Remote centralized control method and system for sewage treatment plant
CN111783904B (en) * 2020-09-04 2020-12-04 平安国际智慧城市科技股份有限公司 Data anomaly analysis method, device, equipment and medium based on environmental data
CN113420816A (en) * 2021-06-24 2021-09-21 北京市生态环境监测中心 Data abnormal value determination method for full-spectrum water quality monitoring equipment
CN113655111A (en) * 2021-08-17 2021-11-16 北京雪迪龙科技股份有限公司 Atmospheric volatile organic compound tracing method based on navigation monitoring
CN114417263B (en) * 2022-01-27 2022-10-04 中国环境科学研究院 Pollutant fluctuation coefficient determination method, pollutant monitoring method, pollutant fluctuation coefficient determination device, pollutant monitoring device and storage medium
CN116773238B (en) * 2023-06-16 2024-01-19 南方电网调峰调频发电有限公司检修试验分公司 Fault monitoring method and system based on industrial data
CN116933186B (en) * 2023-09-14 2023-11-24 江苏新路德建设有限公司 Sewage pipe network blocking real-time monitoring method based on data driving
CN116992391B (en) * 2023-09-27 2023-12-15 青岛冠宝林活性炭有限公司 Hard carbon process environment-friendly monitoring data acquisition and processing method
CN117436005B (en) * 2023-12-21 2024-03-15 山东汇力环保科技有限公司 Abnormal data processing method in automatic ambient air monitoring process

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020642A (en) * 2012-10-08 2013-04-03 江苏省环境监测中心 Water environment monitoring and quality-control data analysis method
CN104063609A (en) * 2014-07-01 2014-09-24 北京金控自动化技术有限公司 Method of assisting in judging pollution source monitoring data validity by utilizing neural network
CN106682685A (en) * 2016-12-06 2017-05-17 重庆大学 Microwave heating temperature field distribution characteristic deep learning-based local temperature variation anomaly detection method
CN106846806A (en) * 2017-03-07 2017-06-13 北京工业大学 Urban highway traffic method for detecting abnormality based on Isolation Forest
CN106872657A (en) * 2017-01-05 2017-06-20 河海大学 A kind of multivariable water quality parameter time series data accident detection method
CN107292350A (en) * 2017-08-04 2017-10-24 电子科技大学 The method for detecting abnormality of large-scale data
CN107657288A (en) * 2017-10-26 2018-02-02 国网冀北电力有限公司 A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7866204B2 (en) * 2007-01-31 2011-01-11 The United States Of America As Represented By The Administrator Of The United States Environmental Protection Agency Adaptive real-time contaminant detection and early warning for drinking water distribution systems
US8954365B2 (en) * 2012-06-21 2015-02-10 Microsoft Corporation Density estimation and/or manifold learning
CN104091061B (en) * 2014-07-01 2017-04-26 北京金控数据技术股份有限公司 Method for using normal distribution for assisting in determining effectiveness of pollution source monitoring data
CN106485353B (en) * 2016-09-30 2019-11-29 中国科学院遥感与数字地球研究所 Air pollutant concentration forecasting procedure and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU, FEI TONY ET AL.: "Isolation Forest", 2008 EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 31 December 2008 (2008-12-31), XP031423720 *

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428886B (en) * 2020-04-10 2023-08-04 青岛聚好联科技有限公司 Method and device for adaptively updating deep learning model of fault diagnosis
CN111428886A (en) * 2020-04-10 2020-07-17 青岛聚好联科技有限公司 Fault diagnosis deep learning model self-adaptive updating method and device
CN114062038A (en) * 2020-07-31 2022-02-18 力合科技(湖南)股份有限公司 Pollution tracing management and control method
CN112016050A (en) * 2020-08-07 2020-12-01 汉威科技集团股份有限公司 Manifold learning-based CEMS system abnormal data monitoring method
CN112016050B (en) * 2020-08-07 2023-11-21 汉威科技集团股份有限公司 CEMS system abnormal data monitoring method based on manifold learning
CN112085081A (en) * 2020-09-02 2020-12-15 董萍 Sewage component detection method and system
CN112085081B (en) * 2020-09-02 2024-02-02 西部第三方检测集团(宁夏)有限公司 Sewage component detection method and system
CN112084382A (en) * 2020-09-04 2020-12-15 安徽思环科技有限公司 Pretreatment method for three-dimensional fluorescence data of water quality of industrial park pollution source
CN112505278A (en) * 2020-11-30 2021-03-16 深圳市联正通达科技有限公司 Sampling formula sewage control analytical equipment
CN112597144A (en) * 2020-12-29 2021-04-02 农业农村部环境保护科研监测所 Automatic cleaning method for production area environment monitoring data
CN112733897A (en) * 2020-12-30 2021-04-30 胜斗士(上海)科技技术发展有限公司 Method and equipment for determining abnormal reason of multi-dimensional sample data
CN112860671A (en) * 2021-01-19 2021-05-28 中国石油天然气集团有限公司 Production factor data abnormity diagnosis method and device
CN113420652A (en) * 2021-06-22 2021-09-21 中冶赛迪重庆信息技术有限公司 Method, system, medium and terminal for recognizing abnormity of time sequence signal fragment
CN113777223A (en) * 2021-08-12 2021-12-10 北京金水永利科技有限公司 Atmospheric pollutant tracing method and system
CN113777223B (en) * 2021-08-12 2024-04-30 北京金水永利科技有限公司 Atmospheric pollutant tracing method and system
CN113792988A (en) * 2021-08-24 2021-12-14 河北先河环保科技股份有限公司 Online monitoring data anomaly identification method for enterprise
CN114236068A (en) * 2021-11-24 2022-03-25 中冶赛迪重庆信息技术有限公司 Chloride ion concentration analysis method and system based on circulating water system
CN114236068B (en) * 2021-11-24 2024-03-01 中冶赛迪信息技术(重庆)有限公司 Chloride ion concentration analysis method and system based on circulating water system
CN114527206A (en) * 2022-01-25 2022-05-24 长安大学 Method and system for tracing groundwater pollution by sulfonamides antibiotics
CN114611616A (en) * 2022-03-16 2022-06-10 吕少岚 Unmanned aerial vehicle intelligent fault detection method and system based on integrated isolated forest
CN114931854A (en) * 2022-05-31 2022-08-23 北京实力伟业环保科技有限公司 System and method for purifying waste gas by microorganisms
CN114931854B (en) * 2022-05-31 2023-05-26 北京实力伟业环保科技有限公司 System and method for purifying waste gas by microorganisms
CN115171362A (en) * 2022-09-07 2022-10-11 江西珉轩智能科技有限公司 Early warning method and system for prevention and control of key areas
CN116069892A (en) * 2023-03-27 2023-05-05 乳山市海洋经济发展中心 Environmental data processing method and system based on ocean engineering
CN116069892B (en) * 2023-03-27 2023-08-04 乳山市海洋经济发展中心 Environmental data processing method and system based on ocean engineering
CN116627953A (en) * 2023-05-24 2023-08-22 首都师范大学 Method for repairing loss of groundwater level monitoring data
CN116627953B (en) * 2023-05-24 2023-10-27 首都师范大学 Method for repairing loss of groundwater level monitoring data
CN116699072A (en) * 2023-06-08 2023-09-05 东莞市华复实业有限公司 Environment early warning method based on detection cruising
CN116699072B (en) * 2023-06-08 2024-01-26 东莞市华复实业有限公司 Environment early warning method based on detection cruising
CN116484153B (en) * 2023-06-20 2023-09-01 北京泰豪智能工程有限公司 Environment monitoring method based on satellite Internet of things
CN116484153A (en) * 2023-06-20 2023-07-25 北京泰豪智能工程有限公司 Environment monitoring method based on satellite Internet of things
CN116500240B (en) * 2023-06-21 2023-12-29 江西索立德环保服务有限公司 Soil environment quality monitoring method, system and readable storage medium
CN116500240A (en) * 2023-06-21 2023-07-28 江西索立德环保服务有限公司 Soil environment quality monitoring method, system and readable storage medium
CN116522270A (en) * 2023-07-04 2023-08-01 西安启迪能源技术有限公司 Data processing system for smart sponge city
CN116522270B (en) * 2023-07-04 2023-09-15 西安启迪能源技术有限公司 Data processing system for smart sponge city
CN116576553A (en) * 2023-07-11 2023-08-11 韦德电子有限公司 Data optimization acquisition method and system for air conditioner
CN116576553B (en) * 2023-07-11 2023-09-22 韦德电子有限公司 Data optimization acquisition method and system for air conditioner
CN116718249A (en) * 2023-08-08 2023-09-08 山东元明晴技术有限公司 Hydraulic engineering liquid level detection system
CN117194920A (en) * 2023-09-06 2023-12-08 万仁企业管理技术(深圳)有限公司 Data system processing platform and processing method based on big data analysis
CN116992244B (en) * 2023-09-26 2023-12-22 山东益来环保科技有限公司 Intelligent monitoring system of cems
CN116992390B (en) * 2023-09-26 2023-12-05 北京联创高科信息技术有限公司 Configuration and display method of abnormal data
CN116992244A (en) * 2023-09-26 2023-11-03 山东益来环保科技有限公司 Intelligent monitoring system of cems
CN116992390A (en) * 2023-09-26 2023-11-03 北京联创高科信息技术有限公司 Configuration and display method of abnormal data
CN117455124A (en) * 2023-12-25 2024-01-26 杭州烛微智能科技有限责任公司 Environment-friendly equipment monitoring method, system, medium and electronic equipment for enterprises
CN117454096B (en) * 2023-12-25 2024-03-01 西安高商智能科技有限责任公司 Motor production quality detection method and system
CN117455124B (en) * 2023-12-25 2024-03-08 杭州烛微智能科技有限责任公司 Environment-friendly equipment monitoring method, system, medium and electronic equipment for enterprises
CN117454096A (en) * 2023-12-25 2024-01-26 西安高商智能科技有限责任公司 Motor production quality detection method and system
CN117538491A (en) * 2024-01-09 2024-02-09 武汉怡特环保科技有限公司 Station room air quality intelligent monitoring method and system
CN117538491B (en) * 2024-01-09 2024-04-05 武汉怡特环保科技有限公司 Station room air quality intelligent monitoring method and system

Also Published As

Publication number Publication date
CN108921440B (en) 2022-08-05
CN108921440A (en) 2018-11-30

Similar Documents

Publication Publication Date Title
WO2020010701A1 (en) Pollutant anomaly monitoring method and system, computer device, and storage medium
US11838308B2 (en) Computer-implemented method and arrangement for classifying anomalies
Рибка et al. Research into dynamics of setting the threshold and a probability of ignition detection by self-adjusting fire detectors
CN107391353A (en) Complicated software system anomaly detection method based on daily record
CN112183709B (en) Method for predicting and early warning excessive dioxin in waste incineration gas
CN110990393A (en) Big data identification method for abnormal data behaviors of industry enterprises
CN109543874B (en) Airport air quality prediction method combining meteorological condition influence
CN115077627B (en) Multi-fusion environmental data supervision method and supervision system
KR102549313B1 (en) Pollutant emission level calculation system and method
CN116151621A (en) Atmospheric pollution treatment risk detection system based on data analysis
CN116975378B (en) Equipment environment monitoring method and system based on big data
CN113435471A (en) Deep feature clustering high-emission mobile source pollution identification method and system
CN116663962A (en) Be used for hydraulic engineering dyke material quality detection analysis system
CN116881747B (en) Intelligent treatment method and system based on medical wastewater monitoring
CN109856321A (en) The determination method of abnormal high level point
US20160078071A1 (en) Large scale offline retrieval of machine operational information
TW201705035A (en) Method and system for rapidly screening information security risk hosts rapidly screening hosts with high hacking risks through various hacking indexes analyzed by a hacking risk analysis module
CN116797649A (en) Incineration treatment performance analysis method and system based on industrial big data
CN115098740B (en) Data quality detection method and device based on multi-source heterogeneous data source
CN105553990A (en) Network security triple anomaly detection method based on decision tree algorithm
CN113628423B (en) Harmful gas concentration monitoring and alarming system
CN113242213B (en) Power communication backbone network node vulnerability diagnosis method
CN115171362A (en) Early warning method and system for prevention and control of key areas
CN113590663A (en) Environment detection method and system
CN114677052A (en) Natural gas load fluctuation asymmetry analysis method and system based on TARCH model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18926224

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 21/04/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18926224

Country of ref document: EP

Kind code of ref document: A1