KR101871940B1

KR101871940B1 - Method and system for establishing predictive model of plant abnormality

Info

Publication number: KR101871940B1
Application number: KR1020140056574A
Authority: KR
Inventors: 조철형; 방현진; 김경택; 김덕수; 이기천; 조영송
Original assignee: 한화에어로스페이스 주식회사; 한양대학교 산학협력단
Priority date: 2014-05-12
Filing date: 2014-05-12
Publication date: 2018-06-27
Also published as: US10460531B2; KR20150129507A; US20150323425A1

Abstract

The present invention discloses a method for diagnosing, in real time, equipment abnormalities that have a nonlinear relationship with equipment data. To this end, equipment data that includes past equipment abnormalities is received, the variables for building an equipment abnormality prediction model are generated with a genetic algorithm, and the model is then built and validated. Variable generation, model building, and validation are repeated until a preset reference fitness is satisfied, and the finally determined equipment abnormality prediction model can diagnose equipment abnormalities from equipment data in real time.

Description

[0001] The present invention relates to a method and system for building an equipment abnormality prediction model and, more specifically, to a method and system that derive a relationship model between equipment data and equipment abnormalities and apply it to diagnose equipment abnormalities from equipment data in real time.

Equipment data means all data produced while the equipment is operated and managed. An equipment abnormality means a defect or malfunction caused by the equipment itself, excluding external factors that arise while the equipment is operated.

Conventionally, equipment data and equipment abnormalities with a clearly linear relationship were defined, and that relationship was found and managed with various statistical methods. However, the relationship between equipment data and equipment abnormalities is not necessarily linear. When it is nonlinear, such statistical methods (multivariate analysis, SPC, PCA) have difficulty finding the relationship, so equipment abnormalities are hard to predict and it is hard to respond when a change point occurs. In addition, typical equipment data shows neither normality nor homoscedasticity, so various nonparametric methods must be sought, which is time-consuming and makes it hard to raise reliability.

The technical problem addressed by the present invention is to provide a relationship model that can predict equipment abnormalities not only from equipment data that is linearly related to the abnormalities but also from equipment data that is nonlinearly related to them, and to use this model to predict equipment abnormalities in real time.

A first embodiment of the present invention for solving the above technical problem includes: a data management step of receiving and synchronizing equipment data; a model construction preparation step of defining an initial chromosome; a data division step of dividing the equipment data into training data and test data; a model construction step of building, from the initial chromosome, a prediction model that reflects all of the training data; a model fitness calculation step of inputting the test data into the prediction model to calculate a model fitness; a model determination step of determining the prediction model to be the equipment abnormality prediction model when the model fitness is higher than or equal to a preset reference fitness; and a model construction repetition step of, when the model fitness is lower than the preset reference fitness, redefining the current chromosome as a previous-generation chromosome, building a new prediction model from a next-generation chromosome produced by crossover and mutation of the previous-generation chromosome, and repeating the process from the model fitness calculation step.

A second embodiment of the present invention for solving the above technical problem includes: a data management unit that synchronizes received equipment data; an initial chromosome definition unit that defines an initial chromosome; a data division unit that divides the equipment data into training data and test data; a prediction model construction unit that builds, from the initial chromosome, a prediction model that reflects all of the training data; a model fitness generation unit that inputs the test data into the prediction model to calculate a model fitness; a fitness comparison unit that compares the model fitness with a preset reference fitness; and a next-generation chromosome definition unit that determines the prediction model to be the equipment abnormality prediction model when the model fitness is higher than or equal to the reference fitness and, otherwise, redefines the current chromosome as a previous-generation chromosome, builds a new prediction model from a next-generation chromosome produced by crossover and mutation of the previous-generation chromosome, and repeats the operation from that of the model fitness generation unit until the equipment abnormality prediction model is determined.

With the model derived by the present invention, equipment abnormalities can be predicted from equipment data with high probability whether the relationship between the equipment data and the abnormalities is linear or nonlinear, which helps in managing the equipment.

Specifically, by reading the trend of the equipment data and identifying an equipment abnormality early, larger equipment defects or malfunctions can be prevented and accidents can be averted.

FIG. 1 shows a conventional method of using statistical methods to find the relationship between equipment data and equipment abnormalities whose linearity is clear.
FIG. 2 is a block diagram of the equipment abnormality prediction model construction system according to the present invention.
FIG. 3 is a flowchart of the method for building the equipment abnormality prediction model according to the present invention.
FIG. 4 is a flowchart showing in detail the process that takes place in the data management step.
FIG. 5 is a flowchart showing the process of building a support vector machine model as the equipment abnormality prediction model, which is one embodiment of the present invention, and in particular the operations of the model construction preparation unit and the model construction unit, excluding the data management unit.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Note that the same components are denoted by the same reference numerals and symbols wherever possible, even when they appear in different drawings.

In describing the present invention, detailed descriptions of related known functions or configurations are omitted or abbreviated where they would unnecessarily obscure the gist of the invention. When a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 shows a conventional method of using statistical methods to find the relationship between equipment data and equipment abnormalities whose linearity is clear.

It schematically shows that equipment abnormalities occurred because of the equipment data points FAULT I to IV, which deviated from the normal control line.

A classification model is used to find the statistical correlation between two sets of data. A classification model can be built by a machine learning algorithm, that is, a process in which given data is input to a computer, learning is performed on the basis of a particular algorithm to establish a discrimination criterion, and the class of newly given data is then predicted. Machine learning algorithms include k-nearest neighbors (KNN), the perceptron, the radial basis function network, the genetic algorithm (GA), and the support vector machine (SVM), and the present invention proposes building the equipment abnormality prediction model with at least one of these algorithms. The equipment abnormality prediction model of the present invention is a model that receives equipment data and can diagnose equipment abnormalities; in particular, one embodiment of the present invention uses a genetic algorithm and a support vector machine together.

A genetic algorithm is a kind of machine learning algorithm, inspired by Darwin's theory of evolution, that evolves on its own in response to its environment, and it is characterized by presupposing one or more iterations.

A genetic algorithm proceeds by creating a plurality of chromosomes according to one of several selection methods, determining how good each chromosome is, keeping only the good chromosomes and removing the rest, and repeating this evolution until a chromosome that exceeds a given criterion emerges. This is described in detail below with reference to FIG. 5.

A support vector machine is a method in which a computer learns a discrimination rule that appropriately separates two kinds of data and then makes predictions for new data. On its own, a support vector machine model could not achieve satisfactory performance on the practical problem of nonlinear classification, but applying a mapping that uses a function called a kernel makes effective prediction possible. Mapping through a kernel means moving a nonlinear problem, which is hard to separate in the input space where the data actually lies, into a higher-dimensional space called the feature space and performing the linear discrimination of the support vector machine in that new space, which has the same effect as solving the complex nonlinear discrimination problem in the original input space.
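
The effect of moving to a feature space can be illustrated with a very small example. The following is a minimal sketch, not part of the specification: the map `feature_map` and the rule `linear_decision` are hypothetical, and the map is written out explicitly here, whereas a kernel function normally evaluates inner products in the feature space without constructing it.

```python
# Illustrative only: XOR-style data that no straight line separates in the
# 2-D input space becomes linearly separable after an explicit feature map.

def feature_map(x1, x2):
    """Hypothetical map from the 2-D input space to a 3-D feature space."""
    return (x1, x2, x1 * x2)

def linear_decision(z):
    """Linear rule in the feature space: the sign of the third coordinate."""
    return 1 if z[2] > 0 else -1

# Inputs take values in {-1, +1}; the label is +1 when both signs agree.
points = [((-1, -1), 1), ((1, 1), 1), ((-1, 1), -1), ((1, -1), -1)]

predictions = [linear_decision(feature_map(*x)) for x, _ in points]
labels = [y for _, y in points]
print(predictions == labels)
```

A single threshold on the added coordinate separates the two classes, which is not possible with any straight line in the original two coordinates.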

Using a support vector machine model requires two things: the input parameters used to determine the input vector, and the kernel function parameters used by the kernel function. These are described in detail below with reference to FIG. 5. Because the support vector machine itself is a known technique, general explanation not directly related to the present invention is omitted for brevity.

FIG. 2 is a block diagram of the equipment abnormality prediction model construction system according to the present invention.

Referring to FIG. 2, the equipment abnormality prediction model construction system according to the present invention includes a data management unit 200, a model construction preparation unit 210, and a model construction unit 220.

The data management unit 200 manages equipment data and includes a data input unit 201 and a data completion unit 202. The data input unit 201 reads and stores a production data file and a trace data file, both in text file form, and then synchronizes them. The data completion unit 202 sets a response variable that represents equipment abnormality using the accumulated loss rate, and completes the equipment data by including the response variable in the production data. The completed equipment data therefore includes not only data from when the equipment operated normally in the past but also data from when equipment abnormalities occurred. The completed, synchronized equipment data is passed to the model construction preparation unit 210.

The model construction preparation unit 210 uses a genetic algorithm to find the variables for building the optimal equipment abnormality prediction model, and includes an initial chromosome definition unit 211, a fitness comparison unit 212, and a next-generation chromosome definition unit 213.

The initial chromosome definition unit 211 defines the initial chromosome, that is, the variables used to build the 'prediction model' that is created experimentally in order to determine the 'equipment abnormality prediction model'. The initial chromosome may contain one or more construction variables of the prediction model. For example, in the embodiment of the present invention in which the prediction model is a support vector machine model, the initial chromosome contains a combination of input variables and a kernel function parameter. The fitness comparison unit 212 and the next-generation chromosome definition unit 213 are described below.

The initial chromosome defined by the initial chromosome definition unit 211 is passed to the model construction unit 220 together with the equipment data completed by the data completion unit 202.

The model construction unit 220 generates and validates the prediction model, and includes a data division unit 221, a prediction model construction unit 222, and a model fitness generation unit 223.

The data division unit 221 receives the equipment data completed by the data completion unit 202 and divides it into training data and test data.

The prediction model construction unit 222 builds a prediction model using the initial chromosome or a next-generation chromosome received from the model construction preparation unit 210, and trains it so that it reflects all of the training data from the data division unit 221. The next-generation chromosome is described below in connection with the next-generation chromosome definition unit 213.

The model fitness generation unit 223 applies the test data to the 'prediction model' received from the prediction model construction unit 222 to generate the model fitness.

The fitness comparison unit 212 of the model construction preparation unit compares the model fitness generated by the model fitness generation unit 223 with the preset reference fitness.

When the fitness comparison unit 212 finds that the model fitness is lower than the reference fitness, the next-generation chromosome definition unit 213 of the model construction preparation unit takes the currently defined chromosome as the previous-generation chromosome and defines a new next-generation chromosome by applying crossover and mutation to it at preset rates. When the model fitness is higher than or equal to the reference fitness, the current prediction model is determined to be the equipment abnormality prediction model.

FIG. 3 is a flowchart of the method for building the equipment abnormality prediction model according to the present invention. Each step of FIG. 3 is performed by the equipment abnormality prediction model construction system of FIG. 2.

In step S300, equipment data is received and synchronized so that it can be used. This is described in detail below with reference to FIG. 4.

In step S400, an initial chromosome composed of the variables for building a prediction model is defined. The processing other than defining the initial chromosome is described below under step S700.

In step S500, a prediction model is built from the initial chromosome. Specifically, before the prediction model is built, the experimental data must be divided into training data and test data; the prediction model built from the initial chromosome is then trained on the training data portion of the divided experimental data. The training process is described in detail below with reference to FIG. 5.

In step S600, test data is input to the trained prediction model to derive the equipment abnormalities corresponding to the test data, and the model fitness is calculated from the result.

In step S700, the model fitness from step S600 is compared with a preset reference fitness. The reference fitness is the measure used to stop the iterative process of building prediction models, and it consists of a false positive value and a coverage value.

False positive refers to the proportion of cases that come out positive although they are not actually positive. The first embodiment of the present invention applies a value of 90%, and this value may be changed depending on the embodiment.

Coverage refers to the proportion of cases in which the prediction model judges that an equipment abnormality exists. The first embodiment of the present invention applies a value of 30%, and this value may be changed depending on the embodiment.

For example, under the first embodiment of the present invention, if 100 equipment abnormalities occur, the coverage criterion requires the equipment abnormality prediction model to report an equipment abnormality to the equipment manager at least 30 times, and when the equipment manager directly checks the equipment in response to those reports, the false positive criterion requires that an equipment abnormality actually be found in at least 27 of those cases.
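
The arithmetic of this example can be written out as follows. This is a minimal sketch under one reading of the worked example, in which the 30% value bounds the number of reports relative to the abnormalities that occurred and the 90% value bounds the share of reports that are confirmed on inspection. The function and parameter names (`minimum_counts`, `coverage_pct`, `confirmed_pct`) are hypothetical and do not appear in the specification.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)

def minimum_counts(actual_abnormalities, coverage_pct=30, confirmed_pct=90):
    """Return (minimum reports, minimum confirmed reports).

    coverage_pct:  assumed meaning, reports as a percentage of the
                   abnormalities that occurred (30 in the first embodiment).
    confirmed_pct: assumed meaning, percentage of reports that inspection
                   must confirm (90 in the first embodiment).
    """
    min_reports = ceil_div(actual_abnormalities * coverage_pct, 100)
    min_confirmed = ceil_div(min_reports * confirmed_pct, 100)
    return min_reports, min_confirmed

print(minimum_counts(100))  # (30, 27)
```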

If the model fitness is higher than or equal to the reference fitness, the process proceeds to the model determination step; otherwise it returns to the model construction preparation step.

When the process returns to the model construction preparation step, the current chromosome is defined as the previous-generation chromosome, and crossover and mutation are applied to it to generate a next-generation chromosome. The next-generation chromosome replaces the previous-generation chromosome as the current chromosome; in other words, it becomes the set of variables from which a new prediction model is built.
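
Crossover and mutation on bit-string chromosomes can be sketched as below. The specification states only that they are applied at preset rates; single-point crossover, independent bit-flip mutation, random pairing, and all names and default values here are assumptions made for illustration.

```python
import random

def crossover(parent_a, parent_b, point):
    """Single-point crossover of two equal-length bit strings."""
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )

def next_generation(previous, crossover_point=4, mutation_rate=0.05, seed=0):
    """Derive next-generation chromosomes from the previous generation.

    Chromosomes are paired at random; with an odd count the last one is
    dropped (a simplification of this sketch).
    """
    rng = random.Random(seed)
    shuffled = previous[:]
    rng.shuffle(shuffled)
    children = []
    for a, b in zip(shuffled[0::2], shuffled[1::2]):
        children.extend(crossover(a, b, crossover_point))
    return [mutate(child, mutation_rate, rng) for child in children]

print(crossover("11110100", "10011101", 4))  # ('11111101', '10010100')
```

With a crossover point of 4 and 8-bit chromosomes, the two halves of each chromosome are exchanged as whole units.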

In step S800, the prediction model whose model fitness is higher than or equal to the reference fitness is determined to be the equipment abnormality prediction model. The determined model can diagnose equipment abnormalities at or above the reference fitness for all of the equipment data input in step S300.
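
The flow of steps S400 to S800 can be summarized in a short loop. This is a minimal sketch, not the claimed implementation: model building, fitness calculation, and chromosome evolution are passed in as stand-in callables, the fitness is treated as a single number rather than the pair of values described above, and the `max_generations` cap is an added assumption so that the loop always terminates.

```python
def build_abnormality_model(initial_chromosomes, build_model, fitness_of,
                            evolve, reference_fitness, max_generations=100):
    """Iterate until a model reaches the reference fitness.

    build_model, fitness_of and evolve are hypothetical callables standing in
    for model training (S500), fitness calculation on test data (S600) and
    crossover/mutation of the current chromosomes.
    """
    chromosomes = list(initial_chromosomes)              # S400
    for _ in range(max_generations):
        for chromosome in chromosomes:
            model = build_model(chromosome)              # S500
            if fitness_of(model) >= reference_fitness:   # S600, S700
                return model, chromosome                 # S800
        chromosomes = evolve(chromosomes)                # next generation
    return None, None

# Toy usage: chromosomes are integers and the "model" is the integer itself.
result = build_abnormality_model(
    [1, 2],
    build_model=lambda c: c,
    fitness_of=lambda m: m,
    evolve=lambda cs: [c + 1 for c in cs],
    reference_fitness=5,
)
print(result)  # (5, 5)
```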

FIG. 4 is a flowchart showing in detail the process that takes place in the data management step.

In step S310, the production data file and the trace data file, both in text file form, are read, stored, and synchronized. Production data is the data produced by the equipment, and trace data is data that is linked to the production data and is used to indicate its values.

In step S320, the response variable is set using the accumulated loss rate, which quantifies the product that is discarded. The accumulated loss rate is obtained by accumulating, over repeated runs of the process, the instantaneous loss rate, that is, the proportion of product discarded when an equipment abnormality occurs. The higher it is, the larger the resulting response variable, and a high response variable indicates frequent equipment abnormalities. The response variable could be measured by the defect rate, mounting errors, the accumulated loss rate, and so on; the present invention uses the accumulated loss rate because it shows the trend most clearly.
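
The specification does not give a formula for the accumulated loss rate. As one possible reading, the sketch below takes the instantaneous loss rate of each process run as discarded units divided by produced units and accumulates it as a running sum. The function names and the per-run input format are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def instantaneous_loss_rates(discarded, produced):
    """Loss rate of each run: discarded units / produced units.

    Assumes every run produced at least one unit.
    """
    return [Fraction(d, p) for d, p in zip(discarded, produced)]

def accumulated_loss_rate(discarded, produced):
    """Running sum of the instantaneous loss rates over successive runs."""
    return list(accumulate(instantaneous_loss_rates(discarded, produced)))

# Three runs of 10 units each, with 1, 0 and 2 units discarded.
print(accumulated_loss_rate([1, 0, 2], [10, 10, 10]))
```

Exact fractions are used only to keep the illustration free of floating-point rounding.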

In step S330, the completed, synchronized equipment data is passed to the model construction preparation unit. This data includes the production data, the trace data, and the response variable determined from the accumulated loss rate.

Because the completed, synchronized equipment data includes not only data from normal operation in the past but also data from when equipment abnormalities occurred, the equipment abnormality prediction model finally built from this data can always diagnose an equipment abnormality when it receives equipment data identical to data that caused an abnormality in the past, regardless of the relationship between the equipment data and the abnormality.

FIG. 5 is a flowchart showing the process of building a support vector machine model as the equipment abnormality prediction model, which is one embodiment of the present invention, and in particular the operations of the model construction preparation unit and the model construction unit, excluding the data management unit.

In step S401, a combination of input variables and a kernel function parameter for creating the support vector machine are selected. This is the first step of the genetic algorithm, and these two elements make up the chromosome described below in step S403.

The combination of input variables is a value representing the magnitude and direction of the input vector applied to the support vector machine. The kernel function is a function that lets the support vector machine model perform effectively even when the two sets of data are nonlinearly related. Kernel functions include the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. Each kernel has its own parameters that aid optimization, and because there is no method that directly and automatically finds the best parameter choice, the parameters are found through the iterations of the genetic algorithm.
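
For reference, the three kernels named above are commonly written as in the sketch below. The parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions chosen for illustration; they are examples of the kernel function parameters that the genetic algorithm would search over, not values taken from the specification.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    """k(u, v) = u . v (no tunable parameter)."""
    return dot(u, v)

def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """k(u, v) = (u . v + coef0) ** degree."""
    return (dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    """k(u, v) = exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

print(linear_kernel((1, 2), (3, 4)))      # 11
print(polynomial_kernel((1, 2), (3, 4)))  # 144.0
print(rbf_kernel((1, 2), (1, 2)))         # 1.0
```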

Selection in a genetic algorithm is the step of determining the plurality of initial solutions that are input in order to obtain the desired final solution, and there are several methods such as roulette wheel selection, tournament selection, and rank-based selection. When the initial solutions are chosen well, the iterations characteristic of the genetic algorithm are minimized and the final solution can be found quickly. For example, the initial solutions may be selected as (15, 04), (13, 07), (11, 10), and (09, 13), where each pair is assumed to list the combination of input variables first and the kernel function parameter second.
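
Of the selection methods named above, roulette wheel selection is the simplest to sketch: each candidate is drawn with probability proportional to a score. The specification does not say how candidates are scored at this stage, so the `fitnesses` values below are purely hypothetical, as are the function name and the fixed seed.

```python
import random

def roulette_wheel_select(candidates, fitnesses, count, seed=0):
    """Draw `count` candidates, each with probability proportional to fitness.

    Sampling is with replacement, so a strong candidate can be drawn twice.
    """
    rng = random.Random(seed)
    return rng.choices(candidates, weights=fitnesses, k=count)

# Candidate (input-variable combination, kernel parameter) pairs with
# made-up scores; a higher score means a larger slice of the wheel.
candidates = [(15, 4), (13, 7), (11, 10), (9, 13)]
fitnesses = [0.4, 0.3, 0.2, 0.1]
selected = roulette_wheel_select(candidates, fitnesses, count=4)
print(selected)
```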

In step S402, the combination of input variables and the kernel function parameter are converted to binary numbers. Using the solutions selected in step S401, the result is (1111, 0100), (1101, 0111), (1011, 1010), and (1001, 1101).

In step S403, the combination of input variables and the kernel function parameter are defined as the initial chromosome (First Chromosome). A chromosome is the input unit that drives the genetic algorithm; the initially determined chromosome evolves repeatedly to yield the final chromosome (Last Chromosome), which consists of the final solution. Continuing the example, the initial chromosomes are 11110100, 11010111, 10111010, and 10011101.
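
The encoding in steps S402 and S403 can be reproduced directly. The sketch below assumes a fixed width of 4 bits per value, which is inferred from the example values rather than stated in the specification; the function names `encode` and `decode` are hypothetical.

```python
def encode(input_combination, kernel_parameter, bits=4):
    """Concatenate the two values as fixed-width binary strings."""
    return (format(input_combination, f"0{bits}b")
            + format(kernel_parameter, f"0{bits}b"))

def decode(chromosome, bits=4):
    """Recover (input combination, kernel parameter) from a chromosome."""
    return int(chromosome[:bits], 2), int(chromosome[bits:], 2)

solutions = [(15, 4), (13, 7), (11, 10), (9, 13)]
chromosomes = [encode(a, b) for a, b in solutions]
print(chromosomes)  # ['11110100', '11010111', '10111010', '10011101']
```

Decoding is the inverse operation and is needed whenever a chromosome produced by crossover or mutation has to be turned back into model parameters.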

In step S404, the initial chromosome and the equipment data received from the data management unit of FIG. 2 are passed to the model construction unit.

In step S405, the equipment data that came through the data management unit and was received from the model construction preparation unit is divided into training data and test data. The split ratio is not fixed, but for the iterations of the genetic algorithm to be valid, neither the training data nor the test data may be empty, and each must be equipment data that includes equipment abnormalities. Because dividing the equipment data into training data and test data does not affect the other steps, it may be performed by the model construction preparation unit as well as by the model construction unit.
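
The two constraints on the split (neither part empty, each part containing abnormalities) can be enforced by splitting abnormal and normal records separately. This is a minimal sketch: the record format, the `is_abnormal` predicate, the 30% test fraction, and the fixed seed are all assumptions, since the specification leaves the split ratio open.

```python
import random

def split_with_abnormalities(records, is_abnormal, test_fraction=0.3, seed=0):
    """Split records so that both parts contain at least one abnormal record."""
    rng = random.Random(seed)
    abnormal = [r for r in records if is_abnormal(r)]
    normal = [r for r in records if not is_abnormal(r)]
    if len(abnormal) < 2:
        raise ValueError("need at least two abnormal records, one per part")
    rng.shuffle(abnormal)
    rng.shuffle(normal)
    # At least one abnormal record goes to the test part and at least one stays.
    n_abnormal_test = min(max(1, round(len(abnormal) * test_fraction)),
                          len(abnormal) - 1)
    n_normal_test = round(len(normal) * test_fraction)
    test = abnormal[:n_abnormal_test] + normal[:n_normal_test]
    train = abnormal[n_abnormal_test:] + normal[n_normal_test:]
    return train, test

# Hypothetical records: (record id, abnormality flag).
records = [(i, i % 5 == 0) for i in range(20)]
train, test = split_with_abnormalities(records, lambda r: r[1])
print(len(train), len(test))
```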

In step S406, a support vector machine model that fully reflects the contents of the learning data is built from the current chromosome.

Here, the current chromosome may be either the first chromosome defined in step S404 or a next-generation chromosome defined in step S409, which is described later.

Fully reflecting the contents of the learning data means, specifically, that the support vector machine model built from the current chromosome must predict equipment abnormalities completely at least for the learning data, with the test data excluded.

Because the learning data includes not only data on normal equipment operation but also the corresponding equipment abnormality data, it is possible to judge whether the support vector machine model built from the current chromosome works properly at least on the learning data. To distinguish the two, the support vector machine model built from the combination of input variables alone may be called the first prediction model, and the support vector machine model obtained by training the first prediction model on the learning data, so that it performs at or above the reference fitness on the learning data, may be called the second prediction model.

In step S407, the test data is input to the built support vector machine model to calculate a model fitness, and the model fitness is passed back to the model building preparation unit together with the support vector machine model.

Because the support vector machine model is optimized only for the learning data, there is no guarantee that it will predict equipment abnormalities completely for the test data. The test data is therefore input to the current support vector machine model to produce predicted equipment abnormality data. The model fitness is then calculated by comparing the equipment abnormality data that the model predicts for the test data with the actual past equipment abnormality data contained in the test data.

The model fitness is a measure of how well the built support vector machine model can diagnose equipment abnormalities, and it is calculated for comparison with a preset reference fitness. A high model fitness means that the model predicts equipment abnormalities to the standard set by the equipment manager not only for the learning data but also for the test data.

Finally, the model fitness calculated for comparison with the reference fitness and the support vector machine model generated by the model building unit are passed back to the model building preparation unit.

In step S408, the preset reference fitness is compared with the model fitness calculated in step S407.

In step S409, whether to continue building the prediction model is decided on the basis of the comparison between the reference fitness and the model fitness.

If the model fitness fails to satisfy either of the two criteria, a false positive of 90% or more and a coverage of 30% or more, a next-generation chromosome is defined and the process returns to step S406.
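
The decision made in steps S408 and S409 can be sketched as a simple threshold check. The function and parameter names are hypothetical; the 90% and 30% thresholds are the ones quoted above, and how the false positive and coverage values themselves are computed is outside this sketch.

```python
def meets_reference(false_positive, coverage,
                    ref_false_positive=0.90, ref_coverage=0.30):
    """Return True only if both criteria reach their reference values.

    Values are fractions in [0, 1]; the defaults mirror the 90% and 30%
    thresholds quoted in the text.
    """
    return false_positive >= ref_false_positive and coverage >= ref_coverage


# Failing either criterion means a next-generation chromosome is needed.
print(meets_reference(0.95, 0.35))
print(meets_reference(0.95, 0.20))
```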

A next-generation chromosome is a chromosome obtained by taking the chromosome used to build the current support vector machine model as the previous-generation chromosome and applying crossover and mutation to it.

For crossover, two-point crossover, three-point crossover, and the like may be used, and the mutation probability may be set suitably low so that it does not hinder convergence to the final solution.

For example, if one pair of first chromosomes is 11110100 and 11010111, as in the example of step S403, and two-point crossover occurs between the 2nd and 3rd positions and between the 7th and 8th positions, the next-generation chromosome becomes 11010110. If a mutation then occurs at the 8th position, the chromosome finally fixed as the next-generation chromosome is 11010111.
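
The worked example above can be reproduced with a short sketch. The function names are hypothetical, and the cut points are expressed as the number of bits to the left of each cut, which is one assumed way to read the crossover positions in the example.

```python
def two_point_crossover(parent_a, parent_b, cut1, cut2):
    """Swap the segment lying between the two cut points.

    cut1 and cut2 give the number of bits to the left of each cut.
    """
    if len(parent_a) != len(parent_b) or not 0 <= cut1 <= cut2 <= len(parent_a):
        raise ValueError("parents must match in length and cuts must be ordered")
    child_a = parent_a[:cut1] + parent_b[cut1:cut2] + parent_a[cut2:]
    child_b = parent_b[:cut1] + parent_a[cut1:cut2] + parent_b[cut2:]
    return child_a, child_b


def flip_bit(chromosome, position):
    """Mutate one bit; position is 1-based, as in the text."""
    i = position - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]


# Cuts after the 2nd and 7th bits, then a mutation at the 8th bit.
child, other_child = two_point_crossover("11110100", "11010111", 2, 7)
mutated = flip_bit(child, 8)
print(child, mutated)  # 11010110 11010111
```

The second child of the crossover is also returned, since the text only follows one of the two offspring.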

The fixed next-generation chromosome becomes the current chromosome, a new support vector machine model is built in step S406, and the subsequent steps are repeated until a model fitness at or above the reference fitness is obtained. If the model fitness of the resulting prediction model remains well below the reference fitness even after the genetic algorithm has been repeated sufficiently, the process may return to step S405 instead of step S406 and change the division ratio or composition of the equipment data.
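
The repetition of steps S406 to S409 can be outlined as a loop. This is a sketch under stated assumptions, not the patented procedure itself: model building, fitness evaluation, and next-generation creation are passed in as hypothetical callables, a generation cap stands in for "repeated sufficiently", and the toy callables at the end exist only to show the control flow.

```python
def search_prediction_model(chromosomes, build_and_score, next_generation,
                            is_fit, max_generations=100):
    """Repeat build -> score -> compare until a model meets the reference.

    build_and_score(chromosome) -> (model, fitness)   # steps S406 and S407
    is_fit(fitness) -> bool                           # steps S408 and S409
    next_generation(chromosomes) -> new chromosomes   # crossover and mutation
    Returns (model, chromosome, generation), or None if the cap is reached,
    in which case the caller may re-split the data (back to step S405).
    """
    for generation in range(max_generations):
        for chromosome in chromosomes:
            model, fitness = build_and_score(chromosome)
            if is_fit(fitness):
                return model, chromosome, generation
        chromosomes = next_generation(chromosomes)
    return None


# Toy stand-ins: "fitness" is the share of 1 bits, and each generation
# turns one more 0 into a 1.
result = search_prediction_model(
    ["1100"],
    build_and_score=lambda c: (c, c.count("1") / len(c)),
    next_generation=lambda pop: [c.replace("0", "1", 1) for c in pop],
    is_fit=lambda f: f >= 1.0,
)
print(result)
```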

If the model fitness satisfies both a false positive of 90% or more and a coverage of 30% or more, the current support vector machine model is determined to be the equipment abnormality prediction model. In terms of the concepts introduced for step S406, when the model fitness of the second prediction model is higher than the reference fitness, that second prediction model is determined to be the equipment abnormality prediction model.

With a model built according to the present invention, equipment abnormalities can be predicted from equipment data in real time with high probability even when the equipment data (production information, event information, sensor information, and so on) has a nonlinear relationship with the equipment abnormality, which helps in managing the equipment. The model is particularly well suited to diagnosing, in real time, mounting abnormalities in surface mounting technology (SMT) equipment, where the cumulative discard rate shows a clear trend.

The present invention is not limited by the embodiments described above or by the accompanying drawings, and it may therefore be implemented with machine learning algorithms other than the support vector machine described above. The scope of rights is intended to be limited by the appended claims, and it will be apparent to those of ordinary skill in the art that various substitutions, modifications, and changes may be made without departing from the technical idea of the invention set forth in the claims.

200: Data management unit
201: Data input unit
202: Data completion unit
210: Model building preparation unit
211: First chromosome definition unit
212: Fitness comparison unit
213: Next-generation chromosome definition unit
220: Model building unit
221: Data division unit
222: Prediction model building unit
223: Model fitness generation unit

Claims

A method of building an equipment abnormality prediction model, the method comprising:
A data management step of receiving and synchronizing equipment data of equipment that produces a product in each process;
A model building preparation step of defining a first chromosome;
A step of dividing the synchronized equipment data into learning data and test data;
A model building step of building a prediction model that fully reflects the learning data from the first chromosome;
A model fitness calculating step of calculating a model fitness by inputting the test data into the prediction model;
A model determining step of determining the prediction model to be an equipment abnormality prediction model when the model fitness is equal to or higher than a preset reference fitness; and
A step of, when the model fitness is lower than the preset reference fitness, newly defining the current chromosome as a previous-generation chromosome, building a new prediction model from a next-generation chromosome obtained by applying crossover and mutation to the previous-generation chromosome, and repeating the steps from the model fitness calculating step through the model determining step,
wherein the data management step includes, in the synchronized equipment data, a cumulative discard rate calculated from the amount of product discarded when product made through a process in which an equipment abnormality occurred is discarded,
wherein the equipment abnormality prediction model predicts the time at which the equipment will operate abnormally on the basis of the cumulative discard rate included in newly input equipment data, and
wherein the reference fitness comprises a false positive and a coverage.

The method of claim 1,
wherein a support vector machine (SVM) model is used as the prediction model.

3. The method of claim 2,
wherein the first chromosome, the previous-generation chromosome, and the next-generation chromosome each comprise
a combination of input variables expressed as a binary number and a kernel function parameter.

4. The method according to any one of claims 1 to 3,
wherein the equipment abnormality and the equipment data have a nonlinear relationship with each other.

The method according to claim 1,
wherein the equipment abnormality prediction model diagnoses equipment abnormalities in real time from the equipment data input to it.

A system for building an equipment abnormality prediction model, the system comprising:
A data management unit for receiving and synchronizing equipment data of equipment that produces a product in each process;
A first chromosome definition unit for defining a first chromosome;
A data division unit for dividing the synchronized equipment data into learning data and test data;
A prediction model building unit for building a prediction model that fully reflects the learning data from the first chromosome;
A model fitness generation unit for calculating a model fitness by inputting the test data into the prediction model;
A fitness comparison unit for comparing the model fitness with a preset reference fitness; and
A next-generation chromosome definition unit that determines the prediction model to be the equipment abnormality prediction model when the model fitness is equal to or higher than the reference fitness and that otherwise newly defines the current chromosome as a previous-generation chromosome, builds a new prediction model from a next-generation chromosome obtained by applying crossover and mutation to the previous-generation chromosome, and repeats operation from the model fitness generation unit through determination of the equipment abnormality prediction model,
wherein the data management unit includes, in the synchronized equipment data, a cumulative discard rate calculated from the amount of product discarded when product made through a process in which an equipment abnormality occurred is discarded,
wherein the equipment abnormality prediction model predicts the time at which the equipment will operate abnormally on the basis of the cumulative discard rate included in newly input equipment data, and
wherein the reference fitness comprises a false positive and a coverage.