KR20120001116A

KR20120001116A - A system and method for diagnosis algorithm development at wastewater treatment plant operation

Info

Publication number: KR20120001116A
Application number: KR1020100061753A
Authority: KR
Inventors: 김창원; 최명원; 문태섭; 김예진; 김효수
Original assignee: 부산대학교 산학협력단
Priority date: 2010-06-29
Filing date: 2010-06-29
Publication date: 2012-01-04
Also published as: WO2012002713A9; WO2012002713A2; KR101237444B1

Abstract

PURPOSE: A system and method for diagnosing the operation of a sewage and wastewater treatment plant are provided so that plant operation can be diagnosed accurately, independent of human factors. CONSTITUTION: The system includes a data collection unit (10), a data processing unit (20), a data diagnosis unit (30), a rule derivation unit (40), and a data prediction unit (50). The data collection unit collects operation history and water quality history data. The data processing unit processes that data. The data diagnosis unit assigns diagnosis results to the processed data. The rule derivation unit derives diagnostic rules using a decision tree algorithm. The data prediction unit derives a predicted diagnosis for new data based on the diagnostic rules.

Description

A system and method for diagnosis algorithm development at wastewater treatment plant operation

The present invention relates to a process diagnosis system and method for a wastewater treatment plant. More specifically, it relates to a system and method in which diagnosis results that take multiple factors into account can be derived both quantitatively and qualitatively, enabling accurate and comprehensive diagnosis. Because such multi-factor diagnoses are derived automatically according to fixed rules, the results rest on the information contained in the data rather than on human factors, which reduces the diagnostic errors caused by an operator's subjective judgment, a judgment that can vary from one assessment to the next.

The biological wastewater treatment process uses activated sludge to remove the target substances contained in the influent, namely organic matter, nitrogen, and phosphorus. The aeration cost of oxidizing organic matter, the cost of treating the sludge wasted to maintain a set quantity of activated sludge, and the cost of supplying various chemicals are all large. In addition, because effluent quality must be kept within legal discharge standards, maintaining the removal performance of the activated sludge is very important. In such a plant, performance maintenance and cost optimization are carried out by operators with accumulated operating know-how. The operator assesses the state of the process every day and takes measures to maintain the desired performance and optimize operating cost. The series of tasks by which the state of the process is assessed and a conclusion is drawn is called diagnosis.

The biological wastewater treatment process is a complex process in which activated sludge performance and influent quality fluctuate constantly. Because of its biological nature, observing changes in activated sludge performance relies heavily on quantitative measurements and on qualitative observation by human operators.

That is, efficient diagnosis of a wastewater treatment process requires information not only on measured factors such as daily influent quality, effluent quality (which represents treatment performance), indices of sludge settleability, and the dissolved oxygen concentration, pH, and ORP continuously monitored in the biological reactor, but also on cost-related factors such as sludge production, chemical consumption, and aeration volume. The judgment of an experienced operator must then be brought to bear on this information when diagnosing the process.

At present, however, diagnosis is limited to two kinds of method. One applies correlation analysis or basic descriptive statistics (mean, median, standard deviation, and so on) to this aggregate data. The other, used to determine whether each factor is high or low, sets a range based on control charts, a general statistical process control technique, and uses simple logic to check whether the factor's value falls within that range, reporting the factor as high or low (for example, SRT is 'short', 'normal', or 'long').

Meanwhile, several factors are needed to reach a general diagnostic conclusion about the current state of the process, that is, whether treatment performance is 'good', 'bad', or 'normal', and a set of compound rules is needed to interpret the values of those factors jointly. Conventional diagnostic methods rely on rules written from interviews with skilled operators to make these compound judgments. This causes problems when a skilled operator is unavailable and makes the diagnosis overly dependent on subjectivity.

The present invention is intended to solve the above problems. It aims to provide a process diagnosis system and method for a wastewater treatment plant that enables comprehensive, multi-factor diagnosis, with the diagnosis derived automatically from rules so that objective diagnosis is possible without depending on human resources.

As a means of achieving the above object,

the process diagnosis system for a wastewater treatment plant of the present invention comprises: a data collection unit that collects data on the operation history and water quality history accumulated in the process; a data processing unit that processes the operation history and water quality history data collected by the data collection unit; a data diagnosis unit that assigns a diagnosis result to each item of data processed by the data processing unit; a rule derivation unit that derives diagnostic rules using a decision tree algorithm, based on the diagnosis results assigned to the processed data by the data diagnosis unit; and a data prediction unit that derives a predictive diagnosis for new data using the diagnostic rules derived by the rule derivation unit.

Here, the operation history includes one or more of chemical consumption, waste sludge treatment cost, and aeration cost, and refers to the data needed to diagnose whether the plant is being operated economically.

The water quality history includes one or more of the concentrations of the target substances in the influent, the effluent, and the bioreactor, the dissolved oxygen concentration, pH, oxidation-reduction potential (ORP), and sludge settleability (SV30, SVI), and refers to data measured by the plant's sensors and automatic analyzers or measured experimentally.

The data processing unit processes data by organizing the data collected at fixed time intervals into sets, and is characterized by deriving the average value of each data set collected over a fixed time interval.

The data diagnosis unit is characterized by assigning a diagnosis result to each item of processed data using the K-means clustering algorithm.

In more detail, the data diagnosis unit comprises a grouping unit that groups the processed data, a data calculation unit that computes the average value of the data in each group formed by the grouping unit, and a reference value derivation unit that derives classification reference values from the averages computed by the data calculation unit.

The process diagnosis method for a wastewater treatment plant of the present invention comprises: collecting data on the operation history and water quality history accumulated in the process; processing the collected operation history and water quality history data; assigning diagnosis results to the processed data; deriving diagnostic rules from the processed data using a decision tree algorithm; and performing a predictive diagnosis using the derived diagnostic rules.

The step of processing the collected operation history and water quality history data includes sorting the collected data into data sets and selecting a unit for each data set.

The step of selecting a unit for each data set includes sorting the data sets measured at a fixed time interval.

In the step of assigning diagnosis results to the processed data, the diagnosis result for each item of processed data is assigned using the K-means clustering algorithm.

The step of assigning diagnosis results to the processed data includes selecting the items and results to be diagnosed, forming k groups, computing the average value of each group, and determining the classification reference values.

Based on the classification reference values of the processed data derived in the preceding step, the diagnostic rules are derived by the decision tree algorithm using the index given by Equations (1) and (2) of the detailed description, where p_i is the fraction of S belonging to class i, A is a variable, and S_v is the subset of S for which variable A takes the value v.

The process diagnosis system and method of the present invention can derive diagnosis results that take multiple factors into account both quantitatively and qualitatively, which has the advantage of enabling accurate and comprehensive diagnosis.

In addition, because multi-factor diagnoses are derived automatically according to fixed rules, the diagnosis can match the result that an experienced operator would reach by diagnosing the process directly. The result is based on the information contained in the data rather than on human factors, which reduces the diagnostic errors caused by an operator's subjective judgment, a judgment that can vary from one assessment to the next, and so makes objective decision support possible.

Furthermore, because the diagnosis result is derived by a decision tree algorithm, the results for the individual diagnostic items that led to it can be traced back, which makes the process easier to control.

FIG. 1 is a schematic block diagram of the process diagnosis system for a wastewater treatment plant of the present invention;
FIG. 2 is a block diagram of the process diagnosis method for a wastewater treatment plant of the present invention; and
FIG. 3 shows an example of diagnostic rules derived by the present invention.

The present invention is described in more detail below with reference to the drawings and an embodiment. The following description concerns a specific example of the invention; even where it uses definite or restrictive language, it does not limit the scope of rights defined by the claims.

FIG. 1 is a schematic block diagram of the process diagnosis system of the present invention, FIG. 2 is a block diagram of the process diagnosis method of the present invention, and FIG. 3 shows an example of diagnostic rules derived by the present invention.

As shown in FIG. 1, the process diagnosis system of the present invention comprises a data collection unit (10), a data processing unit (20), a data diagnosis unit (30), a rule derivation unit (40), and a data prediction unit (50). Conventional simple-rule diagnosis systems (methods) merely listed whether the value of each process factor was high or low, for example 'effluent BOD' is 'high', 'effluent T-N' is 'normal', or 'effluent T-P' is 'low'. By contrast, the present invention, with the above configuration, derives rules for compound, comprehensive diagnostic items and uses these rules to produce overall diagnosis results such as 'process treatment performance' is 'good' or 'energy consumption' is 'efficient'. Each time the process is assessed, the result is similar to the one an experienced operator would reach. Because the result is based on the information contained in the data, errors caused by the operator's subjective judgment are reduced, providing a system capable of objective decision support.

The components mentioned above are described below.

The data collection unit (10) collects data on the operation history and water quality history accumulated in the process. The operation history data are the factors and records needed to diagnose whether the process is being operated economically, such as chemical consumption, waste sludge treatment cost, and aeration cost. The water quality history data are data measured once a day or at a fixed period, by sensors or automatic analyzers or experimentally: the concentrations of the target substances in the influent, the effluent, and the bioreactor, together with the indicators of activated sludge condition in the bioreactor, namely dissolved oxygen concentration, pH, oxidation-reduction potential (ORP), and sludge settleability (SV30, SVI).

The data collected by the data collection unit (10) is processed by the data processing unit (20). The data processing unit (20) sorts the collected data into data sets and then selects a unit for the sorted data sets. Selecting a unit means that, once the time unit for the diagnostic task has been chosen (hourly, daily, or weekly), the data collected at fixed intervals is grouped into one set per chosen time unit for use in diagnosis. In other words, processing the data means organizing the collected data into sets at fixed time intervals. For example, if diagnosis is performed once a day, the processed data set may consist of the daily influent and effluent quality measurements collected over the 24 hours preceding the time of diagnosis, the daily process operation history, and the daily averages of the dissolved oxygen concentration, pH, and ORP values in the reactor, which are measured hourly and therefore exist as 24 values.
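The daily processing example above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the function name `build_daily_sets`, the field names, and the use of calendar days instead of a rolling 24-hour window are all assumptions made for the sketch.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def build_daily_sets(hourly, daily):
    """Merge hourly sensor readings and once-a-day measurements into one record per day.

    hourly: list of (timestamp, {"DO": ..., "pH": ..., "ORP": ...}) tuples
    daily:  dict mapping a date to that day's once-a-day values
    (Both structures are hypothetical; the source does not define a data format.)
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for ts, readings in hourly:
        for name, value in readings.items():
            buckets[ts.date()][name].append(value)

    records = {}
    for day, series in buckets.items():
        # Hourly series are reduced to their daily average.
        record = {f"{name}_mean": mean(values) for name, values in series.items()}
        # Once-a-day values are carried over as measured.
        record.update(daily.get(day, {}))
        records[day] = record
    return records
```

Each resulting record is one data set for one diagnosis time unit, which is the form the later clustering and rule-derivation steps consume.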

The data diagnosis unit (30) assigns a diagnosis result to each item of data processed in this way. It assigns to the processed data the items to be diagnosed and their diagnosis results. Various configurations (methods) could be used for this purpose; the present invention uses a statistical data clustering method based on the K-means clustering algorithm.

The K-means algorithm mentioned above divides an arbitrary data set into K groups (clusters). Given a set of Q sample data points, the K groups are formed as follows. First, K of the Q points are chosen at random and set as the centers of the K groups, and each remaining point is assigned to the group whose center is nearest in Euclidean distance. Once all points have been assigned, a new center is computed for each group by averaging its members, and the Q points are reassigned with respect to the new centers. This process is repeated until the new centers no longer differ from the previous ones, at which point the center of each group is fixed.
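The procedure just described can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function name `kmeans`, the fixed random seed, and the iteration cap `max_iter` are not from the source, and an empty group simply keeps its previous center.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # pick K of the Q samples as initial centers
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:                     # assign each sample to the nearest center
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:           # stop when no center moves
            break
        centers = new_centers
    return centers, groups
```

In this setting each point would be one processed data record, for example a tuple of daily effluent concentrations.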

Assigning diagnosis results can take many forms. As one example, suppose the diagnostic items and their possible results are chosen as the pairs "influent load": "high/normal/low"; "effluent quality": "good/normal/bad"; and "process energy consumption": "efficient/normal/inefficient". Each processed data record is then given the three items "influent load", "effluent quality", and "process energy consumption", and a result such as "high", "normal", or "low" is assigned (classified) for each item by data clustering. The experience of a seasoned operator may also be reflected in the diagnosis results.

To carry out the K-means algorithm, the data diagnosis unit (30) comprises a grouping unit (31) that groups the processed data, a data calculation unit (32) that computes the average value of the processed data in each group formed by the grouping unit (31), and a reference value derivation unit (33) that derives classification reference values from the averages computed by the data calculation unit (32). That is, the grouping unit (31) divides the processed data into K groups, and the data calculation unit (32) computes the average of the data in each group to obtain new center values. The grouping unit (31) and the data calculation unit (32) operate repeatedly until the final center values of the K groups are settled, after which the reference value derivation unit (33) derives the classification reference values. These classification reference values are the basis for the diagnostic items and results described below, and for the rules covering each factor that leads to those items and results together with the diagnosis result for each such factor.

Once diagnosis results have been assigned to the processed data, the rule derivation unit (40) derives diagnostic rules using a decision tree algorithm. The decision tree algorithm is one of the data mining methodologies, a collective term for methods of extracting knowledge or rules from accumulated data. It is an inductive learning algorithm that builds a tree from collected training data by recursive partitioning. The resulting tree consists of internal nodes, which hold the splitting criteria on attributes, and leaves, which represent the final classification. Compared with other techniques it has excellent explanatory power, because the information extracted by the analysis is presented as a tree model or as IF-THEN rules that users can readily understand. It is widely applied as a decision support tool in fields ranging from business, market, and customer analysis to improving yield or quality in industrial manufacturing.

The present invention adopts the decision tree algorithm for the following reason. For diagnostic items of the wastewater treatment process such as 'organic matter removal performance', 'nitrogen removal performance', 'sludge settleability', and 'aeration energy consumption', a diagnosis result such as 'good', 'normal', or 'bad' is produced for each item. A result such as 'organic matter removal performance' is 'good' is reached through a set of compound rules built from many factors. Each of those factors has its own diagnosis result, and the final diagnosis arises from their combined effect; the decision tree reflects this structure. For example, if 'effluent BOD concentration' is at least A mg/L, 'SRT' is at least B days, and 'aeration volume' is at least C m³/day, the final conclusion is that 'organic matter removal performance' is 'bad'. A diagnosis that reflects many variables thus becomes possible for the plant, and because the way each factor contributes to the result can also be determined, countermeasures are easier to plan.
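A single splitting criterion of the kind that ends up in an IF-THEN rule can be sketched as follows. This is an illustration only: it assumes an entropy-based information gain criterion, which is consistent with the symbol definitions given for Equations (1) and (2) in this document but is not confirmed as the exact variant used, and the function names are hypothetical. A full decision tree would apply the same search recursively to each resulting subset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the split 'value >= threshold' with the highest information gain."""
    base = entropy(labels)
    best = (None, 0.0)
    ordered = sorted(set(values))
    for lo, hi in zip(ordered, ordered[1:]):
        t = (lo + hi) / 2                      # candidate threshold midway between neighbours
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[1]:
            best = (t, gain)
    return best
```

Given one factor's values and the diagnosis labels assigned by clustering, the returned threshold plays the role of A, B, or C in the example rule above.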

Using the diagnostic rules derived by the rule derivation unit (40), the data prediction unit (50) derives a predictive diagnosis for new data.

As shown in FIG. 2, the present invention also provides a process diagnosis method for a wastewater treatment plant, comprising: collecting data on the operation history and water quality history accumulated in the process (S10); processing the operation history and water quality history data collected in step S10 (S20); assigning diagnosis results to the data processed in step S20 (S30); deriving diagnostic rules using a decision tree algorithm based on the data from step S30 (S40); and performing a predictive diagnosis using the diagnostic rules derived in step S40 (S50).

The step of processing the collected operation history and water quality history data (S20) includes sorting the collected data into data sets (S21) and selecting a unit for each data set (S22).

In the step of selecting a unit for each data set (S22), the data sets are sorted by a fixed time interval, using the data measured at that interval together with the averages of data measured at intervals shorter than that interval.

In the step of assigning diagnosis results to the processed data (S30), the diagnosis result for each item of processed data is assigned using the K-means clustering algorithm.

In more detail, the step of assigning diagnosis results to the processed data (S30) comprises selecting the items and results to be diagnosed (S31), forming K groups (S32), computing the average value of each group (S33), and determining the classification reference values (S33). Before the classification reference values are determined, the grouping step (S32) and the averaging step (S33) are repeated until the final average of each group, that is, the center value mentioned above, is obtained; the classification reference values are then determined.

In particular, in the step of deriving diagnostic rules from the processed data using a decision tree algorithm (S40), the rules can be derived by the decision tree algorithm using the index given by Equations (1) and (2) below, based on the classification reference values derived in the preceding step (S30).

Equation (1)

Equation (2)

Here, p_i is the fraction of S belonging to class i, A is a variable, and S_v is the subset of S for which variable A takes the value v.
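The formula bodies for Equations (1) and (2) are not present in this text. The symbol definitions above match the standard entropy and information gain criterion used in ID3-style decision trees, so a likely reconstruction, offered as an assumption rather than as the patent's exact notation, is:

```latex
\begin{align}
\mathrm{Entropy}(S) &= -\sum_{i=1}^{c} p_i \log_2 p_i \tag{1} \\
\mathrm{Gain}(S, A) &= \mathrm{Entropy}(S)
  - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \tag{2}
\end{align}
```

The class count c and the set Values(A) of values taken by A are notation introduced for this sketch. Under this reading, the variable A with the largest gain is chosen as the splitting criterion at each internal node.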

Once the diagnostic rules have been derived in step S40, they should preferably be verified and confirmed. In this step, the rules are checked against the knowledge of an operator who is thoroughly familiar with the plant, or against established reference texts, to verify and confirm that the rules are theoretically sound and that the variables (factors) the rules consult to reach each diagnosis are appropriate.

In the step (S50) of performing a predictive diagnosis with the derived diagnosis rules, newly obtained operating data are collected and rearranged according to the unit set in advance for data processing and are then applied to the diagnosis rules. The resulting predicted diagnosis allows the rules to be used in actual plant operation.

An embodiment based on the configuration (steps) described above is described below.

In this embodiment, to diagnose the effluent state of the wastewater treatment plant, influent and effluent water quality data measured once a day were collected together with operating data such as pH, DO, and ORP in the reactor, the sludge wastage amount, and the internal return rate. The collected data were then processed by rearranging them into one data set per day.
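The rearrangement into one data set per day can be sketched as below. The record layout `(date, variable, value)`, the variable names, and the function name `build_daily_sets` are assumptions made for illustration; the source does not specify a data format.

```python
from collections import defaultdict


def build_daily_sets(records):
    """Group (date, variable, value) measurements into one data set per day."""
    daily = defaultdict(dict)
    for date, variable, value in records:
        daily[date][variable] = value
    # Rearrange chronologically; ISO-formatted dates sort correctly as strings.
    return [{"date": d, **daily[d]} for d in sorted(daily)]
```

Each returned dictionary then corresponds to one daily set that can be passed on to the diagnosis step.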

Next, for assigning diagnosis results to the processed data, it was decided to diagnose whether the concentration of each effluent item is high or low. To apply a statistical technique, K-means clustering analysis was performed with effluent BOD, COD, SS, TN, and TP as input variables. The analysis gave the average value of each variable in each group, and these averages were compared between groups. Based on the higher average value of each variable, the effluent quality distribution was reclassified into two groups: as shown in Table 1 below, one group with high effluent concentrations and another with low effluent concentrations. The higher average value obtained by K-means clustering thus serves as the boundary value, that is, the classification reference value, for deciding whether the effluent quality is high or low: a value above the classification reference value is diagnosed as High, and a value below it as Low.

| Item | BOD | COD | SS | TN | TP |
|---|---|---|---|---|---|
| Average value in group 1 | 5.3 | 14.5 | 3.9 | 12.307 | 0.592 |
| Average value in group 2 | 10.3 | 19.7 | 7.4 | 17.002 | 0.434 |
| Classification reference value | 10.3 | 19.7 | 7.4 | 17.002 | 0.592 |

Table 1. Reference values for classifying the effluent state, determined by K-means clustering
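Applying the reference values of Table 1 to a daily sample can be sketched as follows. The function name `label_effluent` and the handling of a value exactly equal to the reference value are assumptions; the source only says that values above the reference are High and values below it are Low.

```python
# Classification reference values taken from Table 1 (units are not stated in the source).
REFERENCE_VALUES = {"BOD": 10.3, "COD": 19.7, "SS": 7.4, "TN": 17.002, "TP": 0.592}


def label_effluent(sample):
    """Return 'High' or 'Low' for each effluent item present in the sample."""
    # Assumption: a value exactly equal to the reference value is labeled 'Low'.
    return {
        item: "High" if sample[item] > ref else "Low"
        for item, ref in REFERENCE_VALUES.items()
        if item in sample
    }
```

These labels would then act as the target variable when the decision tree is built.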

Then, based on the K-means clustering results, a decision tree algorithm was applied to generate a decision tree for each of the predefined groups.

As mentioned above, the target variables were defined in advance from the results of the K-means clustering analysis. Representative decision tree algorithms reported in the literature include CART, C4.5, ASSISTANT, CHAID, QUEST, and RIPPER. In this embodiment, the most commonly used of these, the CART (Classification and Regression Tree) algorithm, was used, implemented with SPSS ANSWER TREE (ver 3.0). The CART algorithm performs binary splits in the direction that reduces the Gini index the most.
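The idea of choosing the binary split that most reduces the Gini index can be sketched for a single numeric variable as below. This is a simplified illustration, not the SPSS ANSWER TREE implementation; the function names and the use of midpoints between distinct values as candidate thresholds are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_reduction(values, labels, threshold):
    """Decrease in Gini index when splitting at value <= threshold vs. > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def best_binary_split(values, labels):
    """Return the candidate threshold with the largest Gini reduction.

    Requires at least two distinct values; otherwise there is nothing to split.
    """
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(midpoints, key=lambda t: gini_reduction(values, labels, t))
```

A full tree would repeat this search over every variable and recurse into each resulting subset.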

Using 104 data sets obtained from the target sewage treatment plant as input, comprising data on influent water quality and load, operating conditions, and sludge settleability, the CART algorithm generated the decision tree shown in FIG. 3, which classifies the state of the effluent BOD, and the following IF-THEN rules for classifying that state were derived.

The effluent BOD state can be diagnosed as High or Low according to the values of five variables (factors): x1 (sludge wastage amount), x2 (reactor temperature), x3 (internal return rate), x4 (reactor pH), and x5 (BOD volumetric load).

Rules for classification of effluent BOD state:

Rule 1: IF x1 ≤ 2172.5 and x2 ≤ 15.35 and x3 ≤ 1.925, THEN effluent BOD is High

Rule 2: IF x1 ≤ 2172.5 and x2 ≤ 15.35 and x3 > 1.925, THEN effluent BOD is Low

Rule 3: IF x1 ≤ 2172.5 and x2 > 15.35, THEN effluent BOD is Low

Rule 4: IF x1 > 2172.5 and x4 ≤ 6.79 and x5 > 0.206, THEN effluent BOD is High

Rule 5: IF x1 > 2172.5 and x4 ≤ 6.79 and x5 ≤ 0.206, THEN effluent BOD is Low

Rule 6: IF x1 > 2172.5 and x4 > 6.79, THEN effluent BOD is High
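The six rules can be written directly as a small function for predictive diagnosis on new data. The thresholds are those of the rules above; the function name `classify_effluent_bod` and the argument order are hypothetical choices for this sketch.

```python
def classify_effluent_bod(x1, x2, x3, x4, x5):
    """Apply Rules 1-6 to predict the effluent BOD state ('High' or 'Low').

    x1: sludge wastage amount, x2: reactor temperature, x3: internal return rate,
    x4: reactor pH, x5: BOD volumetric load (units as used in the source data).
    """
    if x1 <= 2172.5:
        if x2 <= 15.35:
            return "High" if x3 <= 1.925 else "Low"  # Rules 1 and 2
        return "Low"  # Rule 3
    if x4 <= 6.79:
        return "High" if x5 > 0.206 else "Low"  # Rules 4 and 5
    return "High"  # Rule 6
```

Because the rules form a tree, every combination of inputs reaches exactly one rule, and the path taken shows which factors led to a High diagnosis.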

As this embodiment shows, the diagnosis item and result are set as high/low for the effluent BOD state, and diagnosis results are likewise set as high/low for the complex factors from which that result is derived, namely the sludge wastage amount, reactor temperature, internal return rate, reactor pH, and BOD volumetric load. For each item, K-means clustering yields the classification reference value on which the high/low decision is based, and once the classification reference values are available, the decision tree algorithm is used to derive diagnosis rules such as those in FIG. 3. With these rules, new data on the sludge wastage amount, reactor temperature, internal return rate, reactor pH, and BOD volumetric load are sufficient to predict whether the effluent BOD state is high or low. If the effluent BOD is high, the factors that influenced it can be inferred, and this inference makes the process easier to control.

Claims

A process diagnosis system for a wastewater treatment plant, comprising:
a data collection unit for collecting data on the operation history and water quality history accumulated in the process;
a data processing unit for processing the operation history and water quality history data collected by the data collection unit;
a data diagnosis unit for assigning a diagnosis result to the data processed by the data processing unit;
a rule derivation unit for deriving diagnosis rules using a decision tree algorithm, based on the diagnosis result of each item of processed data assigned by the data diagnosis unit; and
a data prediction unit for deriving a predicted diagnosis for new data using the diagnosis rules derived by the rule derivation unit.

The system of claim 1,
wherein the operation history includes one or more of chemical consumption, waste sludge treatment cost, and aeration cost.

The system of claim 1,
wherein the water quality history includes at least one of the influent, effluent, and bioreactor concentrations of the substances to be removed, the dissolved oxygen concentration, the pH value, the oxidation-reduction potential (ORP), and the sludge settleability (SV30, SVI).

The system of claim 1,
wherein the data processing unit processes the data by arranging the collected data into sets at regular time intervals.

The system of claim 1,
wherein the data diagnosis unit assigns a diagnosis result to each item of processed data using the K-means clustering algorithm.

The system of claim 5,
wherein the data diagnosis unit comprises a grouping unit for grouping the processed data, a data calculation unit for calculating the average value of the processed data in each group formed by the grouping unit, and a reference value derivation unit for deriving a classification reference value from the average values calculated by the data calculation unit.

A process diagnosis method for a wastewater treatment plant, comprising:
collecting data on the operation history and water quality history accumulated in the process;
processing the collected operation history and water quality history data;
assigning a diagnosis result to the processed data;
deriving diagnosis rules using a decision tree algorithm, based on the processed data to which the diagnosis results have been assigned; and
performing a predictive diagnosis using the derived diagnosis rules.

The method of claim 7,
wherein processing the collected operation history and water quality history data comprises classifying the collected data into data sets and selecting a unit for each data set.

The method of claim 8,
wherein selecting a unit for each data set comprises dividing the data sets at a predetermined time interval.

The method of claim 9,
wherein, in assigning a diagnosis result to the processed data, the diagnosis result is assigned to each item of processed data using the K-means clustering algorithm.

The method of claim 10,
wherein assigning a diagnosis result to the processed data comprises selecting the items and results to be diagnosed, grouping the data into K groups, obtaining the average value of each group, and determining a classification reference value.

The method of claim 10,
wherein, on the basis of the classification reference value of each item of processed data derived in the above step, the diagnosis rules are derived using the indices given by Equation (1) and Equation (2)
(where p_i is the fraction of S belonging to class i, A is a variable, and S_v is the subset of S for which variable A takes the value v).