WO2012002713A2

WO2012002713A2 - System and method for diagnosing the processes of a sewage and wastewater treatment plant

Info

Publication number: WO2012002713A2
Application number: PCT/KR2011/004728
Authority: WO
Inventors: 김창원; 최명원; 문태섭; 김예진; 김효수
Original assignee: 부산대학교 산학협력단
Priority date: 2010-06-29
Filing date: 2011-06-29
Publication date: 2012-01-05
Also published as: KR20120001116A; KR101237444B1; WO2012002713A9

Abstract

The present invention relates to a system and method for diagnosing the processes of a sewage and wastewater treatment plant. More particularly, the system and method obtain, in both a quantitative and a qualitative manner, the diagnosis result of a process in consideration of complex elements in the plant, thereby enabling an accurate and overall diagnosis. They also perform that diagnosis automatically in accordance with a predetermined rule, so that the result is based on the information contained in the data irrespective of human factors, thereby reducing errors caused by the subjective judgments of operators, which vary from operator to operator.

Description

Process diagnosis system and method of sewage treatment plant

The present invention relates to a process diagnosis system and method for a sewage and wastewater treatment plant. More specifically, it relates to a system and method in which the diagnosis result of a process, taking complex factors into account, can be derived both quantitatively and qualitatively for accurate and comprehensive diagnosis, and in which that diagnosis can be derived automatically based on predetermined rules. Errors in the diagnosis result caused by the subjective judgment of operators, which can vary from operator to operator, are thereby reduced.

A biological sewage and wastewater treatment plant removes the organic matter, nitrogen, and phosphorus contained in the influent by means of activated sludge. The process incurs large costs: aeration to oxidize organic matter, sludge disposal to retain a certain amount of activated sludge, and the supply of various chemicals. In addition, because effluent quality must meet legal discharge standards, it is important to maintain the removal performance of the activated sludge. In such a plant, performance is maintained and cost is optimized by operators who have accumulated operating know-how: the operator assesses the status of the process every day and takes measures to maintain the desired performance and optimize operating cost. This series of tasks, determining the state of the process and drawing conclusions, is called diagnosis.

A biological sewage treatment plant is a complex process in which activated sludge performance and influent water quality fluctuate constantly. Observing changes in activated sludge performance, which stem from its biological characteristics, depends heavily on both quantitative measurement tools and the qualitative observation of human operators.

That is, effective diagnosis of a sewage treatment process requires measured factors such as daily influent water quality, effluent water quality representing treatment performance, indices of sludge sedimentation capacity, and the dissolved oxygen concentration, pH, and ORP that are continuously monitored in the biological reactors, as well as information on factors such as sludge production, chemical consumption, and aeration. The judgment of an experienced operator, based on this information, must then be reflected in the diagnosis of the process.

At present, however, diagnosis relies on general statistical process control techniques: correlation analysis and basic descriptive statistics (mean, median, and standard deviation) are used to analyze whether the value of each factor is high or low. A range is set using control chart techniques, and simple logic checks whether a value falls within that range. Diagnosis therefore goes no further than judging each factor's value as high or low (e.g., SRT is 'short', 'normal', or 'long').

On the other hand, reaching a comprehensive diagnosis of the current state of the process, that is, whether treatment performance is 'good', 'normal', or 'bad', requires several factors and a series of complex rules to identify them. Conventional diagnostic methods build such rules by interviewing experienced operators, which creates a dependence on human resources.

The present invention is intended to solve the above problems by providing a process diagnosis system and method for a sewage and wastewater treatment plant that enable comprehensive and complex process diagnosis, and in which the diagnosis is derived automatically from rules so that an objective diagnosis can be made without depending on human resources.

As a means for achieving the above object,

The process diagnosis system of the sewage treatment plant of the present invention includes: a data collection unit for collecting data on the operation history and water quality history accumulated in a process; a data processing unit for processing the operation history and water quality history data collected by the data collection unit; a data diagnosis unit for assigning a diagnosis result to the data processed by the data processing unit; a rule derivation unit for deriving a diagnosis rule using a decision tree algorithm based on each diagnosis result assigned to the processed data by the data diagnosis unit; and a data prediction unit for deriving a predictive diagnosis of new data by the diagnosis rule derived by the rule derivation unit.

Here, the operation history refers to data necessary for diagnosing economical operation in operating a wastewater treatment plant, including one or more of chemical consumption, waste sludge treatment cost, and aeration cost.

In addition, the water quality history refers to data measured by sensors, automatic analyzers, and the like, or measured experimentally, and includes at least one of influent, effluent, the concentration of the substance to be removed in the bioreactor, dissolved oxygen concentration, pH, redox potential (ORP), and sludge sedimentation capacity (SV30, SVI).

Processing the data in the data processing unit consists of organizing the collected data into sets at regular time intervals and deriving the average value of each data set collected over a predetermined time interval.

The data diagnosis unit may assign a diagnosis result to the processed data using the K-means clustering algorithm.

In more detail, the data diagnosis unit comprises a grouping unit for grouping the processed data, a data calculating unit for calculating the average value of each group of data formed by the grouping unit, and a reference value derivation unit for deriving a classification reference value from the average values calculated by the data calculating unit.

Meanwhile, the process diagnosis method of the sewage treatment plant of the present invention comprises the steps of: collecting data on the operation history and water quality history accumulated in the process; processing the collected operation history and water quality history data; assigning a diagnosis result to the processed data; deriving a diagnosis rule using a decision tree algorithm based on the processed data to which the diagnosis result has been assigned; and performing a predictive diagnosis using the derived diagnosis rule.

The processing of the collected data related to the operating history and the water quality history may include classifying the collected data into a data set and selecting a unit for each data set.

Selecting a unit for each data set comprises classifying the data sets measured at a predetermined time interval.

In the step of assigning a diagnosis result to the processed data, a diagnosis result is assigned to each processed data item using the K-means clustering algorithm.

The step of assigning a diagnosis result to the processed data comprises selecting the items and results to be diagnosed, grouping the data into K groups, obtaining the average value of each group, and determining a classification reference value.

Based on the classification reference value derived for each processed data item in the above step, a diagnosis rule is derived using the index given by the following formulas

[formulas not reproduced in the source text]

(where pi is the fraction of S belonging to class i, A is a variable, and Sv is the subset of S for which variable A has the value v).

In the process diagnosis system and method of the present invention, the diagnosis result of a process, taking complex factors into account, can be derived both quantitatively and qualitatively, so accurate and comprehensive diagnosis is possible.

In addition, because the diagnosis of a process considering complex factors can be derived automatically based on predetermined rules, each diagnosis yields a result similar to that of a diagnosis performed directly by an experienced operator. Moreover, because the diagnosis result is derived from the information in the data, errors caused by the subjective judgment of operators, which may vary from operator to operator, are reduced, providing objective decision support.

In addition, because the diagnosis result is determined by a decision tree algorithm, the result of each diagnosis item that led to the final diagnosis can be traced back, which makes the process easy to control.

FIG. 1 is a schematic configuration diagram of the process diagnosis system of the wastewater treatment plant of the present invention.

FIG. 2 is a block diagram showing the process diagnosis method of the wastewater treatment plant of the present invention.

FIG. 3 shows an example of a diagnostic rule derived by the present invention.

In order to diagnose the effluent condition of the sewage treatment plant, influent and effluent water quality data measured once a day and operating data such as reactor pH, DO, ORP, sludge waste amount, and internal return rate were collected and processed. The collected data were rearranged into daily data sets.

Then, to assign diagnosis results to the processed data, the effluent BOD, COD, SS, TN, and TP were selected as input variables, and K-means clustering analysis was performed to diagnose whether the concentration of each effluent item was high or low.

Hereinafter, the present invention will be described in more detail with reference to the drawings and examples. The following descriptions concern specific examples of the present invention and, even where they contain assertive or limiting expressions, are not intended to limit the scope of the rights set forth in the claims.

FIG. 1 is a schematic configuration diagram of the process diagnosis system of the wastewater treatment plant of the present invention, FIG. 2 is a block diagram showing the process diagnosis method of the wastewater treatment plant of the present invention, and FIG. 3 shows an example of a diagnostic rule derived by the present invention.

The process diagnosis system of the wastewater treatment plant of the present invention, as shown in FIG. 1, comprises the data collection unit 10, the data processing unit 20, the data diagnosis unit 30, the rule derivation unit 40, and the data prediction unit 50. An existing simple-rule diagnosis system only produces item-by-item results such as 'effluent BOD is high', 'effluent T-N is normal', and 'effluent T-P is low'. In contrast, the present invention, based on the above configuration, derives rules for complex and comprehensive diagnosis items of the process and, according to those diagnosis rules, produces comprehensive diagnosis results such as 'treatment performance of the process is good' or 'energy consumption is efficient'. Each time a process decision is made, a diagnosis result similar to that of a diagnosis performed by an experienced operator is derived from the information in the data, which reduces errors caused by the operator's subjective judgment and provides a system capable of objective decision support.

Hereinafter, the configuration mentioned above will be described.

The data collection unit 10 collects data on the operation history and water quality history accumulated in the process. Here, the operation history data refers to the factors and data needed to diagnose the economical operation of the process, such as chemical consumption, waste sludge treatment cost, and aeration cost. The water quality history data refers to data, measured once a day or at regular intervals by sensors, automatic analyzers, or experiments, that show the influent and effluent, the concentration of the substance to be removed in the bioreactor, and the state of the activated sludge in the bioreactor, such as dissolved oxygen concentration, pH, redox potential (ORP), and sludge sedimentation capacity (SV30, SVI).

The data collected by the data collection unit 10 is processed by the data processing unit 20. The data processing unit 20 classifies the collected data into data sets and then selects a unit for each data set. Selecting a unit means that, once the time unit for performing the diagnostic task has been chosen (for example, daily or weekly), the data collected at predetermined intervals are grouped into the set to be used for diagnosis in each such time unit. In other words, processing the data means organizing the collected data into sets at a predetermined time interval. For example, if the diagnostic task is performed once a day, the processed data consist of the influent and effluent water quality measurements and the daily process operation history collected over the preceding 24 hours, together with the daily average values of the dissolved oxygen concentration, pH, and ORP in the reactor, which are measured hourly and therefore exist as 24 values per day.
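The daily averaging of hourly reactor measurements described above can be sketched as follows. This is a minimal illustration, assuming readings arrive as (timestamp, value) pairs and that the diagnostic unit is a calendar day; the function name `daily_means` and the sample values are hypothetical and do not come from the source.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def daily_means(readings):
    """Group (timestamp, value) readings by calendar day and average each day."""
    by_day = defaultdict(list)
    for ts, value in readings:
        by_day[ts.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}

# Hypothetical dissolved-oxygen readings (mg/L) spanning two days.
readings = [
    (datetime(2010, 6, 1, 0), 2.0),
    (datetime(2010, 6, 1, 12), 3.0),
    (datetime(2010, 6, 2, 0), 1.0),
    (datetime(2010, 6, 2, 12), 2.0),
]
do_daily = daily_means(readings)
```

The same grouping key could be a week or any other interval if the diagnostic task is run on a different time unit.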

Each processed data item is assigned a diagnosis result by the data diagnosis unit 30. The data diagnosis unit 30 assigns the item to be diagnosed and its diagnosis result to each processed data item, and various configurations (methods) may be used to do so. In the present invention, the diagnosis result is assigned to the processed data by statistical data clustering using the K-means clustering algorithm.

The K-means algorithm mentioned above divides an arbitrary data set into K groups (clusters). Given a set of Q sample data, K samples are selected at random from the Q samples and set as the centers of the K groups, and each sample is assigned to the group whose center is nearest. After all data have been assigned, a new center is calculated for each of the K groups by averaging the data it contains, the Q samples are reassigned to the new centers, and the process is repeated. The iteration continues until nothing changes any more, which fixes the center of each group.
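The K-means procedure described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the implementation used in the source: the function name `kmeans`, the fixed random seed, the squared Euclidean distance, and the sample values are all assumptions.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # K randomly chosen samples as initial centers
    for _ in range(max_iter):
        # Assignment step: each point joins the group with the nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Update step: each new center is the mean of its group's members
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # centers unchanged: converged
            break
        centers = new_centers
    return centers, groups

# Hypothetical daily effluent values, one feature per sample.
samples = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
centers, groups = kmeans(samples, k=2)
```

Because the initial centers are random, the order of the returned groups is arbitrary; only the set of final centers is meaningful.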

Various examples of diagnosis results are possible. As one example, the items to be diagnosed and their diagnosis results may be paired as "influent load" with "high / normal / low", "effluent water quality" with "good / moderate / bad", and "process energy consumption degree" with "efficient / moderate / inefficient". Each of these three items is attached to the processed data, and the diagnosis result for each item ("high / normal / low", etc.) is assigned (classified) through the data clustering method. The diagnosis result may also reflect the experience of an experienced operator.

To perform the K-means algorithm described above, the data diagnosis unit 30 may include a grouping unit 31 for grouping the processed data, a data calculating unit 32 for calculating the average value of each group of processed data formed by the grouping unit 31, and a reference value deriving unit 33 for deriving a classification reference value from the average values calculated by the data calculating unit 32. That is, the grouping unit 31 divides the processed data into K groups, and the data calculating unit 32 computes the average of the data in each group to obtain a new center value. The grouping unit 31 and the data calculating unit 32 operate repeatedly to determine the final center values of the K groups. Once the final center values are obtained, the reference value deriving unit 33 derives the classification reference values. The classification reference values thus derived serve as the basis for the diagnosis items and diagnosis results described below, for the individual factors that lead to them, and for the rules governing the diagnosis results of those factors.

Once a diagnosis result has been assigned to each processed data item, the rule derivation unit 40 derives a diagnosis rule using a decision tree algorithm. The decision tree algorithm is one of the data mining methodologies, a collective term for methods that extract knowledge or rules from accumulated data, and it belongs to the family of inductive learning methods: it builds a tree from collected training data by recursive partitioning. The resulting decision tree consists of internal nodes, which hold attribute split criteria, and leaves, which represent the final classification. Compared with other techniques it has excellent explanatory power, presenting the information extracted by the analysis as a tree model or as IF-THEN rules that users can easily understand, and it is widely used as a decision support tool for improving production yield or quality.

The decision tree algorithm is introduced in the present invention for the following reason. For a diagnosis item of the wastewater treatment process, such as 'organic matter removal performance', 'nitrogen removal performance', 'sludge sedimentation ability', or 'aeration energy consumption', a diagnosis result such as 'good' / 'normal' / 'bad' is reached through a series of complex rules composed of various factors. Each of those factors has its own diagnosis result, and the final diagnosis result should reflect the combined action of these individual results.

For example, if the individual factors and their diagnosis results are an 'effluent BOD concentration' of A mg/L or more, an 'SRT' of B days or more, and an 'aeration amount' of C m3/day or more, the final diagnosis item and result can be concluded to be: 'organic matter removal performance' is 'bad'. In other words, the diagnosis can reflect the combined action of many factors in the sewage treatment plant, and it also reveals how those factors act as causes of the diagnosis result, which makes it easier to establish future countermeasures.

The data prediction unit 50 derives a predictive diagnosis for new data using the diagnosis rule derived by the rule derivation unit 40.

Meanwhile, the present invention provides a process diagnosis method for a sewage treatment plant, as shown in FIG. 2, comprising: collecting data on the operation history and water quality history accumulated in a process (S10); processing the operation history and water quality history data collected in step S10 (S20); assigning a diagnosis result to the data processed in step S20 (S30); deriving a diagnosis rule using a decision tree algorithm based on the data to which a diagnosis result was assigned in step S30 (S40); and performing a predictive diagnosis using the diagnosis rule derived in step S40 (S50).

The step of processing the collected operation history and water quality history data (S20) comprises classifying the collected data into data sets (S21) and selecting a unit for each data set (S22).

In the step of selecting a unit for each data set (S22), the data sets are classified at predetermined time intervals; this comprises classifying the data measured at the predetermined time interval together with the average values of data measured at intervals shorter than the predetermined time interval.

In the step (S30) of assigning a diagnosis result to the above-mentioned processed data, the diagnosis result is assigned to the processed data by using a K-means clustering algorithm.

In more detail, the step of assigning a diagnosis result to the processed data (S30) comprises selecting an item and a result to be diagnosed (S31), grouping the data into K groups (S32), obtaining the average value of each group, and determining the classification reference value (S33). Before the classification reference value is determined (S33), the grouping (S32) and the averaging of each group are repeated to determine the final average value of each group, that is, the center value described above, after which the classification reference value is determined (S33).

In particular, in the step of deriving a diagnosis rule using a decision tree algorithm based on the processed data (S40), the diagnosis rule can be derived using the index given by the following Formulas (1) and (2), based on the classification reference value derived in the previous step (S30).

Formula (1)

Formula (2)

Where pi is the fraction of S belonging to class i, A is a variable, and Sv is a subset of S when variable A has the value v.
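Formulas (1) and (2) are not reproduced in this text. The symbol definitions above match the standard entropy and information gain measures used in decision tree induction, so, as an assumption rather than a statement of the source's exact expressions, the two formulas may be read as follows. The notation Values(A), for the set of values taken by variable A, is introduced here for illustration.

```latex
\mathrm{Entropy}(S) = -\sum_{i} p_i \log_2 p_i
\qquad
\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)
```

Under this reading, the variable A with the largest gain is the one chosen to split the data set S at a given node.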

Once the diagnosis rule has been derived in step S40, it is preferable to verify and confirm it. In this verification and confirmation step, the derived rule is compared against the knowledge of an operator with abundant prior knowledge of the plant, or against established theory, to confirm that the rule is theoretically valid and that the variables (factors) it uses are appropriate references for deriving the diagnosis result in question.

In the step of performing the predictive diagnosis using the derived diagnosis rule (S50), when new operating data are obtained, they are collected and rearranged in the unit set in advance during processing and then applied to the diagnosis rule to perform the predictive diagnosis and derive the prediction result. In this way the diagnosis rule is used in actual field operation.

Hereinafter, one embodiment will be described based on the configuration (step) described above.

In this embodiment, influent and effluent water quality data measured once a day and operating data such as pH, DO, ORP, sludge waste amount, and internal return rate were collected and processed to diagnose the effluent condition of the wastewater treatment plant. The collected data were rearranged into daily data sets.

Then, to assign diagnosis results to the processed data, the effluent BOD, COD, SS, TN, and TP were selected as input variables, and K-means clustering analysis was performed to diagnose whether the concentration of each effluent item was high or low. The mean value of each variable in each group was obtained and compared between groups, and for each variable the effluent data were reclassified into two groups using the higher of the two group means. That is, as shown in Table 1 below, the data were reclassified into one group with high effluent concentration and another with low effluent concentration. The higher group mean derived by K-means clustering thus serves as the boundary value, i.e., the classification reference value, for deciding whether an effluent value is high or low: a value above the classification reference value is diagnosed as high, and a value below it is diagnosed as low.

Table 1

Item	BOD	COD	SS	TN	TP
Average value in group 1	5.3	14.5	3.9	12.307	0.592
Average value in group 2	10.3	19.7	7.4	17.002	0.434
Classification reference value	10.3	19.7	7.4	17.002	0.592

Classification criteria for effluent states determined by K-means clustering
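The way the classification reference values in Table 1 follow from the group means can be sketched as below. The group means are copied from Table 1; the names `group_means`, `reference`, and `diagnose` are hypothetical, and treating a value exactly equal to the reference value as low is an assumption, since the source does not address that case.

```python
# Group means copied from Table 1 (group 1, group 2) for each effluent item.
group_means = {
    "BOD": (5.3, 10.3),
    "COD": (14.5, 19.7),
    "SS": (3.9, 7.4),
    "TN": (12.307, 17.002),
    "TP": (0.592, 0.434),
}

# The classification reference value is the higher of the two group means.
reference = {item: max(means) for item, means in group_means.items()}

def diagnose(item, value):
    """Return 'high' if value exceeds the item's reference value, else 'low'."""
    return "high" if value > reference[item] else "low"
```

Note that for TP the higher mean belongs to group 1, which is why the reference value is taken per variable rather than from a single group.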

Subsequently, a decision tree algorithm was applied to generate decision trees for each predefined group based on the results of K-means clustering.

As mentioned earlier, the target variables were predefined using the results of the K-means clustering analysis. Representative decision tree algorithms reported in the literature include CART, C4.5, ASSISTANT, CHAID, QUEST, and RIPPER. This embodiment used the most commonly used of these, the CART (Classification and Regression Tree) algorithm, implemented with SPSS ANSWER TREE (ver 3.0). The CART algorithm performs binary splits in the direction that reduces the Gini index.
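The idea of a binary split that reduces the Gini index can be sketched as follows. This is a simplified single-variable illustration, not the SPSS implementation used in the embodiment: the function names, the midpoint candidate thresholds, and the sample values are assumptions, and the input is assumed to contain at least two distinct values.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(values, labels, threshold):
    """Weighted Gini impurity after the binary split value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return the lowest-impurity one."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return min(candidates, key=lambda t: split_gini(values, labels, t))

# Hypothetical reactor temperatures and effluent BOD states.
temps = [12.0, 14.0, 15.0, 16.0, 18.0, 20.0]
states = ["High", "High", "High", "Low", "Low", "Low"]
threshold = best_threshold(temps, states)
```

With these made-up values the search returns 15.5, the midpoint that separates the two states completely. A full CART implementation repeats this search over every variable and recursively over each resulting subset.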

To build a decision tree classifying the BOD state of the effluent with the CART algorithm, 104 data sets from the sewage treatment plant, containing data on influent water quality, load, operating conditions, and sludge settling, were used as input. The IF-THEN rules derived for classifying these states are as follows.

The effluent BOD state can be diagnosed as high or low depending on the values of five variables (factors): x1 (sludge waste amount), x2 (reactor temperature), x3 (internal return rate), x4 (reactor pH), and x5 (BOD volume load).

Rules for classification of effluent BOD state:

Rule 1: IF x1 ≤ 2172.5 and x2 ≤ 15.35 and x3 ≤ 1.925, THEN effluent BOD is High

Rule 2: IF x1 ≤ 2172.5 and x2 ≤ 15.35 and x3 > 1.925, THEN effluent BOD is Low

Rule 3: IF x1 ≤ 2172.5 and x2 > 15.35, THEN effluent BOD is Low

Rule 4: IF x1 > 2172.5 and x4 ≤ 6.79 and x5 > 0.206, THEN effluent BOD is High

Rule 5: IF x1 > 2172.5 and x4 ≤ 6.79 and x5 ≤ 0.206, THEN effluent BOD is Low

Rule 6: IF x1 > 2172.5 and x4 > 6.79, THEN effluent BOD is High
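The six rules above can be applied to a new data record as a nested conditional. The thresholds are taken directly from the rules; the function name `classify_effluent_bod` and the input values in the usage lines are hypothetical.

```python
def classify_effluent_bod(x1, x2, x3, x4, x5):
    """Apply Rules 1-6 for the effluent BOD state; returns 'High' or 'Low'."""
    if x1 <= 2172.5:
        if x2 <= 15.35:
            # Rules 1 and 2: decided by x3
            return "High" if x3 <= 1.925 else "Low"
        return "Low"  # Rule 3
    if x4 <= 6.79:
        # Rules 4 and 5: decided by x5
        return "High" if x5 > 0.206 else "Low"
    return "High"  # Rule 6

# Hypothetical new records (x1, x2, x3, x4, x5).
state_a = classify_effluent_bod(2000.0, 14.0, 1.5, 7.0, 0.1)   # matches Rule 1
state_b = classify_effluent_bod(2500.0, 14.0, 1.5, 6.5, 0.1)   # matches Rule 5
```

Because every path through the conditional corresponds to exactly one rule, the rule that produced a given result, and therefore the factors behind it, can be read off directly.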

As shown in the above embodiment, the diagnosis item is the effluent BOD state and its diagnosis result is high / low, and the complex factors from which this result is derived are the sludge waste amount, reactor temperature, internal return rate, reactor pH, and BOD volume load. The classification reference value that serves as the basis for the high / low decision is derived for each item by K-means clustering, and the diagnosis rule is derived using the decision tree algorithm as shown in FIG. 3. When new data on sludge waste amount, reactor temperature, internal return rate, reactor pH, and BOD volume load are obtained, the effluent BOD state is predicted as high or low. If the effluent BOD state is predicted to be high, the rule makes it possible to infer which factors affected it, and the process can easily be controlled on the basis of that inference.

The present invention can be widely applied in the field of sewage treatment processes at sewage treatment plants, where it enables accurate and comprehensive derivation of process diagnosis results that take complex factors into account.

Claims

A process diagnosis system of a sewage treatment plant, comprising:

a data collection unit for collecting data on the operation history and water quality history accumulated in the process;

a data processing unit for processing the operation history and water quality history data collected by the data collection unit;

a data diagnosis unit for assigning a diagnosis result to the data processed by the data processing unit;

a rule derivation unit for deriving a diagnosis rule using a decision tree algorithm based on each diagnosis result assigned to the processed data by the data diagnosis unit; and

a data prediction unit for deriving a predictive diagnosis for new data by the diagnosis rule derived by the rule derivation unit.
The system of claim 1,

wherein the operation history includes one or more of chemical consumption, waste sludge treatment cost, and aeration cost.
The system of claim 1,

wherein the water quality history includes at least one of influent, effluent, the concentration of the substance to be removed in the bioreactor, dissolved oxygen concentration, pH, redox potential (ORP), and sludge sedimentation capacity (SV30, SVI).
The system of claim 1,

wherein processing the data in the data processing unit consists of organizing the collected data into sets at regular intervals.
The system of claim 1,

wherein the data diagnosis unit assigns a diagnosis result to each processed data item using the K-means clustering algorithm.
The system of claim 5,

wherein the data diagnosis unit comprises a grouping unit for grouping the processed data, a data calculating unit for calculating the average value of each group of processed data formed by the grouping unit, and a reference value derivation unit for deriving a classification reference value from the average values calculated by the data calculating unit.
A process diagnosis method of a sewage treatment plant, comprising the steps of:

collecting data on the operation history and water quality history accumulated in the process;

processing the operation history and water quality history data collected in the preceding step;

assigning a diagnosis result to the data processed in the preceding step;

deriving a diagnosis rule using a decision tree algorithm based on the processed data to which the diagnosis result is assigned; and

performing a predictive diagnosis using the diagnosis rule derived in the preceding step.
The method of claim 7,

wherein the step of processing the collected operation history and water quality history data comprises classifying the collected data into data sets and selecting a unit for each data set.
The method of claim 8,

wherein selecting a unit for each data set comprises classifying the data sets at a predetermined time interval.
The method of claim 9,

wherein, in the step of assigning a diagnosis result to the processed data, the diagnosis result is assigned to each processed data item using the K-means clustering algorithm.
The method of claim 10,

wherein the step of assigning a diagnosis result to the processed data comprises selecting the items and results to be diagnosed, grouping the data into K groups, obtaining the average value of each group, and determining the classification reference value.
The method of claim 10,

wherein, based on the classification reference value derived for each processed data item in the above step, a diagnosis rule is derived using the index given by the following formulas

[formulas not reproduced in the source text]

(where pi is the fraction of S belonging to class i, A is a variable, and Sv is the subset of S for which variable A has the value v).