KR102623390B1

KR102623390B1 - Method, apparatus and program for maintaining accuracy of equipment anomaly detection model

Info

Publication number: KR102623390B1
Application number: KR1020230084805A
Authority: KR
Inventors: 하승재
Original assignee: 주식회사 에이아이비즈
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2024-01-11

Abstract

A method for maintaining the accuracy of an equipment anomaly detection model according to various embodiments of the present invention is disclosed. The method may include: monitoring sensor data of each of one or more pieces of equipment using an anomaly detection model trained to detect equipment anomalies; when the anomaly detection model detects abnormal data in specific sensor data obtained from specific equipment, obtaining preventive maintenance information related to the specific equipment and performing a consistency check on the specific equipment; recognizing, based on the consistency check, whether the abnormal data detected by the anomaly detection model was detected because of preventive maintenance or because of a defect in the specific equipment; and, when the abnormal data is recognized as having been detected because of preventive maintenance, updating the anomaly detection model based on the abnormal data.

Description

Method, apparatus and program for maintaining the accuracy of equipment anomaly detection model {METHOD, APPARATUS AND PROGRAM FOR MAINTAINING ACCURACY OF EQUIPMENT ANOMALY DETECTION MODEL}

The present invention relates to a method, apparatus, and program for maintaining the accuracy of an equipment anomaly detection model, and more specifically to a method, apparatus, and program for maintaining the accuracy of the detection model by updating and re-initializing the anomaly detection model.

In general, a factory with a mass production system divides the manufacturing process into several stages to increase production efficiency, and operates automated equipment suited to each stage.

With automated equipment, repeatedly performing the same process can lead to the process being performed abnormally because of errors in the equipment itself or the influence of the surrounding environment. In a manufacturing system where the processes are continuous and organically linked, an abnormality in the equipment of one process can cause defects in the entire process and in the products. Periodically detecting abnormalities in the factory equipment operated for each process is therefore very important for maintaining the factory's mass production system.

Abnormalities in factory equipment can be detected by having skilled workers frequently check the operating status of the equipment through various sensor process data. However, even a highly skilled worker cannot review all of the vast sensor process data arriving in real time and accurately detect anomalous (or abnormal) data, and the detection process can take a long time. In addition, as factory equipment becomes more complex, for example through the introduction of FA (factory automation) systems, the knowledge and know-how required of workers grow considerably, and inexperienced workers may find it difficult to identify what caused an abnormal state.

Meanwhile, as sensor process data accumulates, whether held temporarily or stored permanently in a database, research is being conducted on the automated processing of monitoring data from industrial equipment in various fields. In particular, as advances in computer technology increase the amount of information that can be processed, artificial intelligence is evolving rapidly, and technologies that use artificial intelligence to detect abnormalities in process data are being researched and developed.

Korean Patent Application Publication No. 10-2021-0128713 discloses a method of analyzing and predicting process quality by using deep learning to analyze product process data collected in real time.

The present invention was devised in response to the background art described above and aims to provide a method, apparatus, and program for maintaining the accuracy of an equipment anomaly detection model.

The technical problems addressed by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the description below.

According to an embodiment of the present invention for solving the problems described above, a method for maintaining the accuracy of an equipment anomaly detection model is disclosed. The method may include: monitoring sensor data of each of one or more pieces of equipment using an anomaly detection model trained to detect equipment anomalies; when the anomaly detection model detects abnormal data in specific sensor data obtained from specific equipment, obtaining preventive maintenance information related to the specific equipment and performing a consistency check on the specific equipment; recognizing, based on the consistency check, whether the abnormal data detected by the anomaly detection model was detected because of preventive maintenance or because of a defect in the specific equipment; and, when the abnormal data is recognized as having been detected because of preventive maintenance, updating the anomaly detection model based on the abnormal data.

In an alternative embodiment, the step of obtaining preventive maintenance information related to the specific equipment and performing a consistency check on the specific equipment, when the anomaly detection model detects abnormal data in specific sensor data obtained from the specific equipment, may include: recognizing, based on the preventive maintenance information, whether preventive maintenance has been performed on the specific equipment; and performing the consistency check when it is recognized that preventive maintenance has been performed on the specific equipment.

In an alternative embodiment, performing the consistency check may include: acquiring sensor data from each of the specific equipment on which preventive maintenance has been performed and one or more pieces of equipment operated with the same recipe as the specific equipment; sampling the sensor data and aligning the sensor data acquired from each piece of equipment; determining a minimum value, a maximum value, and a median value for each piece of equipment from the sampled sensor data; measuring the similarity of the sensor data corresponding to each piece of equipment based on at least one of the minimum value, the maximum value, and the median value; and determining, based on the similarity, whether the specific equipment on which the preventive maintenance has been performed is normal.

In an alternative embodiment, the step of recognizing, based on the consistency check, whether the abnormal data detected by the anomaly detection model was detected because of preventive maintenance or because of a defect in the specific equipment may include: recognizing that the abnormal data was detected because of preventive maintenance when the consistency check determines that the specific equipment is normal; and recognizing that the abnormal data was detected because of a defect in the specific equipment when the consistency check determines that the specific equipment is not normal.

In an alternative embodiment, updating the anomaly detection model may include: scaling the abnormal data to obtain normalized data; and performing transfer learning on a fully connected layer using the normalized data while keeping the feature extraction layer of the anomaly detection model unchanged.
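As a non-limiting illustration, the scaling and transfer-learning steps above might be sketched as follows. The sketch assumes min-max scaling, a toy single-input model, and a reconstruction-style training objective, none of which are specified in this passage; the names `min_max_scale`, `TinyModel`, and `fine_tune_head` are hypothetical.

```python
def min_max_scale(values):
    """Scale values into [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


class TinyModel:
    """Toy model: a frozen feature extraction layer plus a trainable
    fully connected layer. Hypothetical stand-in for the real network."""

    def __init__(self, feat_w, fc_w, fc_b):
        self.feat_w = list(feat_w)  # feature extraction weights (kept fixed)
        self.fc_w = list(fc_w)      # fully connected weights (trainable)
        self.fc_b = fc_b            # fully connected bias (trainable)

    def features(self, x):
        # One ReLU unit per feature weight.
        return [max(0.0, w * x) for w in self.feat_w]

    def predict(self, x):
        f = self.features(x)
        return sum(w * fi for w, fi in zip(self.fc_w, f)) + self.fc_b


def mean_squared_error(model, xs):
    return sum((model.predict(x) - x) ** 2 for x in xs) / len(xs)


def fine_tune_head(model, xs, lr=0.1, epochs=200):
    """Fit predict(x) to x with SGD, updating only the fully connected layer."""
    for _ in range(epochs):
        for x in xs:
            f = model.features(x)
            err = model.predict(x) - x
            model.fc_w = [w - lr * err * fi for w, fi in zip(model.fc_w, f)]
            model.fc_b -= lr * err
            # model.feat_w is intentionally never modified here.
    return model


model = TinyModel(feat_w=[1.0, 0.5], fc_w=[0.0, 0.0], fc_b=0.0)
frozen_before = list(model.feat_w)
normalized = min_max_scale([10.0, 12.0, 15.0, 20.0])
loss_before = mean_squared_error(model, normalized)
fine_tune_head(model, normalized)
loss_after = mean_squared_error(model, normalized)
```

Only `fc_w` and `fc_b` change during fine-tuning, which is what makes the update inexpensive compared with retraining the whole model.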

In an alternative embodiment, the step of performing transfer learning on the fully connected layer using the normalized data while keeping the feature extraction layer of the anomaly detection model unchanged may include: comparing a first maximum value, which the anomaly detection model uses to detect anomalies, with a second maximum value of the normalized data; and, when the second maximum value is greater than the first maximum value, changing the first maximum value used to detect anomalies to the second maximum value.
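As a non-limiting illustration, the comparison of the two maximum values might be sketched as below. The function name `update_detection_max` is hypothetical, and the sketch assumes the first maximum value is held as a single number.

```python
def update_detection_max(first_max, normalized_data):
    """Return the detection maximum after comparing it with the data maximum.

    first_max: maximum value currently used by the model to detect anomalies.
    normalized_data: normalized abnormal data attributed to preventive maintenance.
    """
    second_max = max(normalized_data)
    # Widen the detection range only when the new data exceeds it.
    if second_max > first_max:
        return second_max
    return first_max


widened = update_detection_max(0.8, [0.2, 0.95, 0.4])
unchanged = update_detection_max(0.8, [0.1, 0.5])
```

The maximum never shrinks under this rule, so repeated updates can only widen the range of data the model treats as normal.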

In an alternative embodiment, the anomaly detection model is updated so that the range of data it uses for detecting anomalies increases, and, to maintain its accuracy, the anomaly detection model may be reset to the initially trained model at every preset period.
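As a non-limiting illustration, the periodic reset might be sketched as below. The sketch assumes the preset period is counted in monitoring cycles and that the model state can be deep-copied; the class name `ResettableModel` and its members are hypothetical.

```python
import copy


class ResettableModel:
    """Keeps a snapshot of the initially trained state and restores it
    every `reset_period` monitoring cycles (assumed unit of the period)."""

    def __init__(self, initial_state, reset_period):
        if reset_period < 1:
            raise ValueError("reset_period must be at least 1")
        self._initial = copy.deepcopy(initial_state)
        self.state = copy.deepcopy(initial_state)
        self.reset_period = reset_period
        self._cycles = 0

    def tick(self):
        """Advance one cycle; return True if the model was reset."""
        self._cycles += 1
        if self._cycles >= self.reset_period:
            self.state = copy.deepcopy(self._initial)
            self._cycles = 0
            return True
        return False


model = ResettableModel({"detection_max": 0.8}, reset_period=3)
model.state["detection_max"] = 0.95   # range widened by an update
results = [model.tick(), model.tick(), model.tick()]
```

Resetting bounds how far accumulated updates can widen the detection range before the model returns to its initially trained state.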

In an alternative embodiment, the step of monitoring sensor data of each of one or more pieces of equipment using an anomaly detection model trained to detect equipment anomalies may include: obtaining process sensor data and prior knowledge data corresponding to the process sensor data; generating reconstructed process sensor data corresponding to the process sensor data using a deep learning model; calculating a reconstruction rate error based on the process sensor data and the reconstructed process sensor data; and detecting abnormal operation based on a comparison of the reconstruction rate error with a reference threshold. The deep learning model may include: a first sub-model that extracts feature information corresponding to the process sensor data; a second sub-model that extracts interaction relationship information among the process sensor data based on the process sensor data and the prior knowledge data; an attention module that generates feature information by combining the outputs of the first sub-model and the second sub-model; and a dimension restoration model that restores the feature information to generate the reconstructed process sensor data.
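As a non-limiting illustration, the comparison of the reconstruction rate error with the reference threshold might be sketched as below. The sketch assumes the error is a mean squared error between the original and reconstructed data, which this passage does not specify; the function names and the threshold value are hypothetical, and the deep learning model itself is out of scope here.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between original and reconstructed sensor data
    (assumed error definition)."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must not be empty")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def is_abnormal(original, reconstructed, threshold):
    """Flag abnormal operation when the error exceeds the reference threshold."""
    return reconstruction_error(original, reconstructed) > threshold


error = reconstruction_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
flagged = is_abnormal([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], threshold=1.0)
passed = is_abnormal([1.0, 2.0], [1.0, 2.0], threshold=0.1)
```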

According to an embodiment of the present invention for solving the problems described above, an apparatus is disclosed. The apparatus includes: a memory storing one or more instructions; and a processor that executes the one or more instructions stored in the memory, wherein the processor may perform the methods described above by executing the one or more instructions.

According to an embodiment of the present invention for solving the problems described above, a computer program is disclosed that is stored in a computer-readable recording medium and, in combination with a computer as hardware, performs the methods described above.

Other specific details of the present invention are included in the detailed description and the drawings.

According to the present invention, even when sensor data measured from equipment is judged abnormal, a consistency check against sensor data measured from equipment operating with the same recipe makes it possible to determine whether the abnormal judgment results from preventive maintenance or from an actual abnormality in the equipment.

In addition, the present invention applies transfer learning to the anomaly detection model that detects equipment anomalies, using the data judged abnormal, which can reduce the time and cost required for data collection and model training and, furthermore, maintain the accuracy of the anomaly detection model.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

Figure 1 is a diagram illustrating a system according to an embodiment of the present invention.
Figure 2 is a hardware configuration diagram of a computing device according to an embodiment of the present invention.
Figure 3 is a flowchart illustrating an example of a method for checking consistency between a plurality of manufacturing process equipment based on sensor data according to an embodiment of the present invention.
Figure 4 is a flowchart illustrating an example of a method for measuring similarity according to an embodiment of the present invention.
Figure 5 is a flowchart illustrating an example of a method for maintaining the accuracy of an equipment abnormality detection model according to an embodiment of the present invention.
Figure 6 is a flowchart illustrating an example of a method for determining whether to perform a consistency check for specific equipment according to an embodiment of the present invention.
Figures 7 to 9 are diagrams illustrating an example of a method for performing alignment on sensor data according to an embodiment of the present invention.
Figure 10 is a schematic diagram showing one or more network functions related to one embodiment of the present invention.

Various embodiments are now described with reference to the drawings. In this specification, various descriptions are presented to provide an understanding of the present invention. It is apparent, however, that these embodiments may be practiced without these specific descriptions.

As used herein, the terms “component,” “module,” “system,” and the like refer to a computer-related entity, hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a procedure running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or a thread of execution. A component may be localized on one computer or distributed between two or more computers. In addition, these components may execute from various computer-readable media having various data structures stored thereon. The components may communicate via local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or data transmitted via the signal to other systems across a network such as the Internet).

In addition, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or.” That is, unless otherwise specified or clear from context, “X uses A or B” is intended to mean any of the natural inclusive permutations. In other words, if X uses A, X uses B, or X uses both A and B, then “X uses A or B” applies in any of these cases. In addition, the term “and/or” as used herein should be understood to refer to and include all possible combinations of one or more of the associated listed items.

In addition, the terms “comprises” and/or “comprising” should be understood to mean that the corresponding feature and/or element is present. However, these terms should be understood as not excluding the presence or addition of one or more other features, elements, and/or groups thereof. In addition, unless otherwise specified or clear from context to be directed to a singular form, the singular in this specification and the claims should generally be construed to mean “one or more.”

Those skilled in the art will further appreciate that the various illustrative logical blocks, configurations, modules, circuits, means, logic, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, configurations, means, logic, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or as software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in various ways for each particular application. However, such implementation decisions should not be interpreted as departing from the scope of the present invention.

The description of the presented embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art. The general principles defined herein may be applied to other embodiments without departing from the scope of the present invention. Thus, the present invention is not limited to the embodiments presented herein and is to be accorded the widest scope consistent with the principles and novel features presented herein.

In this specification, a computer means any kind of hardware device that includes at least one processor and, depending on the embodiment, may also be understood to encompass the software configuration operating on that hardware device. For example, a computer may be understood to include, without limitation, a smartphone, a tablet PC, a desktop, a laptop, and the user clients and applications running on each device.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

Each step described in this specification is described as being performed by a computer, but the entity performing each step is not limited thereto, and depending on the embodiment at least some of the steps may be performed on different devices.

FIG. 1 is a diagram illustrating a system according to an embodiment of the present invention.

Referring to FIG. 1, a system according to an embodiment of the present invention may include a computing device 100, a user terminal 200, and an external server 300. The system shown in FIG. 1 is according to one embodiment; its components are not limited to those shown in FIG. 1 and may be added, changed, or removed as necessary.

In one embodiment, when manufacturing process sensor data is measured using artificial intelligence technology, data exceeding a reference value is immediately judged defective. However, when equipment parts are repaired or replaced, the sensor data can differ markedly from before, so sensor data that should be judged normal is sometimes judged defective. In such cases it is impractical for an engineer to compare the sensor data judged defective against other equipment or chambers one by one, so there has been a need to automatically compare sensor data between pieces of equipment (or chambers) and make a pass/fail judgment when an abnormality occurs.

The computing device 100 according to an embodiment of the present invention may perform a consistency check between a plurality of pieces of manufacturing process equipment based on sensor data.

Specifically, the computing device 100 may acquire sensor data from each of the specific equipment on which preventive maintenance has been performed and one or more pieces of equipment operated with the same recipe as the specific equipment. Here, preventive maintenance may include, without limitation, at least one of replacing parts of the equipment, repairing parts of the equipment, and repairing the equipment itself.

Once the sensor data has been acquired, the computing device 100 may sample the sensor data and align the sensor data acquired from each piece of equipment. Here, alignment is a process of preparing the sensor data and may include adjusting the sensor data acquired from each of the one or more pieces of equipment to the same basis (for example, time or length).
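As a non-limiting illustration, adjusting traces of different lengths to the same length might be sketched as below. The sketch assumes linear interpolation onto a common number of samples, which is only one possible alignment scheme and is not taken from this passage; the function names `resample` and `align` and the equipment labels are hypothetical.

```python
def resample(trace, target_len):
    """Linearly interpolate `trace` onto `target_len` evenly spaced points."""
    if target_len < 2 or len(trace) < 2:
        raise ValueError("need at least two points in and out")
    step = (len(trace) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(trace) - 1)
        frac = pos - lo
        out.append(trace[lo] * (1 - frac) + trace[hi] * frac)
    return out


def align(traces, target_len=None):
    """Bring every equipment trace to a common length.

    traces: dict mapping an equipment label to its list of sensor samples.
    target_len: common length; defaults to the shortest trace (assumption).
    """
    if target_len is None:
        target_len = min(len(t) for t in traces.values())
    return {name: resample(t, target_len) for name, t in traces.items()}


aligned = align({
    "chamber_A": [0, 1, 2, 3, 4],   # five samples
    "chamber_B": [0, 2, 4],         # three samples of the same ramp
})
```

After alignment, every trace has the same number of samples, so per-equipment statistics are computed over comparable data.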

After performing the alignment, the computing device 100 may determine a minimum value, a maximum value, and a median value for each piece of equipment from the sampled sensor data. The computing device 100 may then measure the similarity of the sensor data corresponding to each piece of equipment based on at least one of the minimum value, the maximum value, and the median value for each piece of equipment. Based on the similarity, the computing device 100 may determine whether the specific equipment on which preventive maintenance has been performed is normal.
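As a non-limiting illustration, a similarity measure built on the minimum, maximum, and median values might be sketched as below. The scoring formula and the 0.8 threshold are assumptions made for this sketch, since the passage does not define how the similarity is computed; the function names are hypothetical.

```python
from statistics import median


def summarize(trace):
    """Return (minimum, maximum, median) of one equipment trace."""
    return (min(trace), max(trace), median(trace))


def similarity(target, peers):
    """Score in (0, 1]; 1.0 means the target's summary statistics match
    the average statistics of the peer equipment exactly."""
    t = summarize(target)
    peer_stats = [summarize(p) for p in peers]
    # Reference statistics: mean of each statistic across peers.
    ref = [sum(s[i] for s in peer_stats) / len(peer_stats) for i in range(3)]
    # Normalize by the peers' value range to make the distance unit-free.
    span = max(ref[1] - ref[0], 1e-12)
    dist = sum(abs(t[i] - ref[i]) for i in range(3)) / (3 * span)
    return 1.0 / (1.0 + dist)


def is_normal(target, peers, threshold=0.8):
    """Treat the target equipment as normal when it resembles its peers."""
    return similarity(target, peers) >= threshold


peers = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
same_score = similarity([1.0, 2.0, 3.0], peers)
shifted_score = similarity([5.0, 6.0, 7.0], peers)
```

A trace matching its peers scores 1.0, while a trace shifted well outside the peers' range scores much lower and would be treated as not normal.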

Accordingly, even when sensor data is judged abnormal, the computing device 100 of the present invention can determine, through a consistency check against sensor data measured from equipment operating with the same recipe, whether the abnormal judgment results from preventive maintenance or from an actual abnormality in the equipment, which makes equipment management more convenient.

A detailed description of how the computing device 100 performs the consistency check between a plurality of pieces of manufacturing process equipment based on sensor data is given later with reference to FIGS. 3 and 4.

In one embodiment, anomaly detection solutions that apply artificial intelligence technology in manufacturing processes have judged an anomaly whenever data different from the training data is collected. However, anomalies detected as a result of maintenance for equipment operation, such as the repair or replacement of equipment parts, are not actual defects, so the model's accuracy declines unless the model is trained on the new data. Solving this requires determining whether the prediction of the artificial intelligence model reflects an actual defect and, if it does not, training the model on the new data. However, a field engineer would have to repeat this series of steps by hand for each piece of equipment and each process, and collecting enough data for training takes time, both of which make adoption in the field difficult.

According to an embodiment of the present invention, the computing device 100 can maintain the accuracy of the equipment anomaly detection model.

Specifically, the computing device 100 may monitor sensor data of each of one or more pieces of equipment using an anomaly detection model trained to detect equipment anomalies. In addition, when the anomaly detection model detects abnormal data in specific sensor data obtained from specific equipment, the computing device 100 may obtain preventive maintenance information related to the specific equipment and perform a consistency check on the specific equipment.

Based on the consistency check, the computing device 100 may recognize whether the anomalous data detected by the anomaly detection model resulted from preventive maintenance or from a defect in the specific equipment. When it recognizes that the anomalous data resulted from preventive maintenance, the computing device 100 may update the anomaly detection model based on that data. The update may be performed by transfer learning, but is not limited thereto.
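The control flow described above can be sketched as follows. This is only an illustrative outline, not the patented implementation: the class name `MaintenanceLoop` and the three callables (`detect`, `caused_by_pm`, `update_model`) are hypothetical placeholders for the anomaly detection model, the consistency check, and the model update step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class MaintenanceLoop:
    """Hypothetical orchestration of the detect -> check -> update flow."""
    detect: Callable[[Sequence[float]], bool]             # True if the model flags an anomaly
    caused_by_pm: Callable[[str, Sequence[float]], bool]  # outcome of the consistency check
    update_model: Callable[[Sequence[float]], None]       # e.g. a transfer-learning step
    alerts: List[str] = field(default_factory=list)

    def handle(self, equipment_id: str, sensor_data: Sequence[float]) -> str:
        if not self.detect(sensor_data):
            return "normal"
        if self.caused_by_pm(equipment_id, sensor_data):
            # The anomaly is attributed to preventive maintenance: learn the new data.
            self.update_model(sensor_data)
            return "model_updated"
        # Otherwise treat it as an equipment defect and record an alert.
        self.alerts.append(equipment_id)
        return "defect"
```

Keeping the three steps as injected callables keeps the sketch independent of any particular model or consistency check.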

Therefore, the computing device 100 of the present invention can reduce the time and cost required for data collection and model training through transfer learning, and can maintain the accuracy of the anomaly detection model through continuous updates.

In addition, through a consistency check that uses preventive maintenance information (for example, part replacement) and sensor data across equipment, the present invention can minimize unnecessary intervention by field engineers in judging the predictions of the anomaly detection model.

Furthermore, transfer learning requires less time and cost for data collection and model training than training an initial model, so the accuracy of the anomaly detection model can be maintained automatically without separate work by engineers, which is expected to ease on-site adoption.

How the computing device 100 maintains the accuracy of the equipment anomaly detection model is described in detail later with reference to FIGS. 5 and 6.

In various embodiments, the computing device 100 may provide web-based or application-based services. However, it is not limited thereto.

The computing device 100 may include any type of computer system or computer device, such as a microprocessor, a mainframe computer, a digital processor, a portable device, or a device controller. However, it is not limited thereto.

The hardware configuration of the computing device 100 is described later with reference to FIG. 2.

Meanwhile, the user terminal 200 may be connected to the computing device 100 through the network 400, and may be the terminal of an administrator who manages the manufacturing process equipment and the sensor-data-based consistency check between a plurality of manufacturing process equipment performed by the computing device 100. The user terminal 200 may also be the terminal of an administrator who manages the processes performed on the computing device 100 to maintain the accuracy of the equipment anomaly detection model.

Here, the user terminal 200 may include various types of computer devices. More specifically, the user terminal 200 may refer to terminal devices such as a smartphone, a tablet PC, a desktop, or a laptop.

The user terminal 200 includes a display on at least part of the terminal and may include an operating system for running application-based or extension-based services provided by the computing device 100. For example, the user terminal 200 may be a smartphone, but is not limited thereto. As a wireless communication device that ensures portability and mobility, the user terminal 200 may include any kind of handheld wireless communication device, such as a navigation device, a PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), or Wibro (Wireless Broadband Internet) terminal, a smartpad, or a tablet PC.

The external server 300 may be connected to the computing device 100 through the network 400. It may transmit and receive the various information and data that the computing device 100 needs in order to perform the sensor-data-based method for checking consistency between a plurality of manufacturing process equipment and the method for maintaining the accuracy of the equipment anomaly detection model, and it may store and manage the various information and data generated as the computing device 100 performs each of these methods.

For example, the external server 300 may be a database server that stores information used in the sensor-data-based method for checking consistency between a plurality of manufacturing process equipment and in the method for maintaining the accuracy of the equipment anomaly detection model. As another example, the external server 300 may be a server that provides the information used in those methods.

The network 400 may refer to a connection structure in which nodes such as computing devices, a plurality of terminals, and servers can exchange information with one another. For example, the network 400 includes a local area network (LAN), a wide area network (WAN), the Internet (WWW: World Wide Web), wired and wireless data communication networks, telephone networks, and wired and wireless television networks.

Wireless data communication networks include, but are not limited to, 3G, 4G, 5G, 3GPP (3rd Generation Partnership Project), 5GPP (5th Generation Partnership Project), LTE (Long Term Evolution), WIMAX (Worldwide Interoperability for Microwave Access), Wi-Fi, the Internet, LAN (Local Area Network), Wireless LAN (Wireless Local Area Network), WAN (Wide Area Network), PAN (Personal Area Network), RF (Radio Frequency), Bluetooth networks, NFC (Near-Field Communication) networks, satellite broadcasting networks, analog broadcasting networks, and DMB (Digital Multimedia Broadcasting) networks.

FIG. 2 is a hardware configuration diagram of a computing device according to an embodiment of the present invention.

Referring to FIG. 2, the computing device 100 according to an embodiment of the present invention may include one or more processors 110, a memory 120 into which a computer program 151 executed by the processor 110 is loaded, a bus 130, a communication interface 140, and a storage 150 that stores the computer program 151. FIG. 2 shows only the components related to the embodiment of the present invention. A person of ordinary skill in the art to which the present invention pertains will therefore understand that other general-purpose components may be included in addition to those shown in FIG. 2.

The processor 110 controls the overall operation of each component of the computing device 100. The processor 110 may consist of one or more cores and may include a processor for data analysis and deep learning, such as a central processing unit (CPU), a general-purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of the computing device. Alternatively, it may include any type of processor well known in the technical field of the present invention.

In addition, the processor 110 may perform operations for at least one application or program that executes the methods according to embodiments of the present invention, and the computing device 100 may include one or more processors.

In various embodiments, the processor 110 may further include random access memory (RAM, not shown) and read-only memory (ROM, not shown) that temporarily and/or permanently store signals (or data) processed inside the processor 110. The processor 110 may also be implemented as a system on chip (SoC) that includes at least one of a graphics processing unit, RAM, and ROM.

The memory 120 stores various data, commands, and/or information. The memory 120 may load the computer program 151 from the storage 150 in order to execute the methods and operations according to various embodiments of the present invention. Once the computer program 151 is loaded into the memory 120, the processor 110 can perform those methods and operations by executing the one or more instructions that make up the computer program 151. The memory 120 may be implemented as volatile memory such as RAM, but the technical scope of the present invention is not limited thereto.

The bus 130 provides communication between the components of the computing device 100. The bus 130 may be implemented as various types of buses, such as an address bus, a data bus, and a control bus.

The communication interface 140 supports wired and wireless Internet communication for the computing device 100. The communication interface 140 may also support communication methods other than Internet communication. To this end, the communication interface 140 may include a communication module well known in the technical field of the present invention. In some embodiments, the communication interface 140 may be omitted.

The storage 150 may store the computer program 151 non-temporarily. When a process according to an embodiment of the present invention is performed through the computing device 100, the storage 150 may store the various information needed to perform the analysis according to the disclosed embodiment.

The storage 150 may include non-volatile memory such as ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), or flash memory, a hard disk, a removable disk, or any type of computer-readable recording medium well known in the technical field to which the present invention pertains.

The computer program 151 may include one or more instructions that, when loaded into the memory 120, cause the processor 110 to perform the methods and operations according to various embodiments of the present invention. That is, the processor 110 can perform those methods and operations by executing the one or more instructions.

In one embodiment, the computer program 151 may include one or more instructions for performing various methods associated with the tasks involved in training a neural network model.

The steps of a method or algorithm described in connection with embodiments of the present invention may be implemented directly in hardware, as a software module executed by hardware, or as a combination of the two. The software module may reside in RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any type of computer-readable recording medium well known in the technical field to which the present invention pertains.

The components of the present invention may be implemented as a program (or application) and stored on a medium so as to be executed in combination with a computer, which is hardware. The components of the present invention may be implemented through software programming or software elements. Similarly, embodiments may be implemented in a programming or scripting language such as C, C++, Java, or assembler, and may include various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms that run on one or more processors.

FIG. 3 is a flowchart illustrating an example of a sensor-data-based method for checking consistency between a plurality of manufacturing process equipment according to an embodiment of the present invention. FIG. 4 is a flowchart illustrating an example of a method for measuring similarity according to an embodiment of the present invention.

According to an embodiment of the present invention, the computing device 100 may perform a consistency check between a plurality of manufacturing process equipment based on sensor data.

Referring to FIG. 3, the computing device 100 may acquire sensor data from the specific equipment on which preventive maintenance has been performed and from each of one or more pieces of equipment that run the same recipe as the specific equipment (S110). Here, a recipe may refer to a set of instructions that includes the work steps and conditions needed to carry out a manufacturing process.

Specifically, the computing device 100 may acquire first sensor data measured on the specific equipment after preventive maintenance has been performed. It may also acquire second sensor data measured on the specific equipment before the preventive maintenance was performed. In addition, it may acquire third sensor data measured on each of one or more pieces of equipment that run the same recipe as the specific equipment.

In one embodiment, each of the first, second, and third sensor data may be data measured on the respective equipment while a preset number of wafers is produced with a wafer production recipe.

For example, the first sensor data and the third sensor data may each be data measured during production of the most recent 20 wafers, counting back from the wafer currently produced. The second sensor data may be data measured during production of the last 20 wafers, counting back from the wafer produced just before the preventive maintenance was performed. However, it is not limited thereto.
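As a rough illustration of this windowing, the sketch below splits a chronological list of per-wafer runs into a pre-maintenance window and a post-maintenance window. The function name `split_around_pm` and the `pm_index` parameter are assumptions made for the example; the window size of 20 follows the example in the text.

```python
from typing import List, Sequence, Tuple

Run = Sequence[float]  # sensor trace recorded while one wafer is produced


def split_around_pm(
    runs: Sequence[Run], pm_index: int, n: int = 20
) -> Tuple[List[Run], List[Run]]:
    """Split chronological runs into (second, first) sensor data windows.

    runs[:pm_index] were produced before preventive maintenance and
    runs[pm_index:] after it (assumed layout for this sketch).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    before = list(runs[max(0, pm_index - n):pm_index])  # last n wafers before PM
    after = list(runs[pm_index:][-n:])                   # most recent n wafers after PM
    return before, after
```

If fewer than `n` wafers exist on either side of the maintenance event, the corresponding window is simply shorter.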

The computing device 100 may sample the sensor data and align the sensor data acquired from each piece of equipment (S120).

Specifically, to correct sampling errors in the sensor data, the computing device 100 may extract data from the sensor data at preset time intervals. It may also estimate missing sensor data that were not sampled and fill in the missing values. In addition, it may calculate the average process time of the sensor data and then rescale the sensor data to match the average process time.

In other words, the alignment performed on the sensor data in the present invention may be a form of data preprocessing. It is described later with reference to FIGS. 7 to 9.
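One way to picture this alignment step is the sketch below. It assumes each run is already sampled on a fixed time grid with `None` marking missing samples, fills gaps by linear interpolation, and then resamples every run to the average run length as a stand-in for the average process time. The function names and the choice of linear interpolation are assumptions of the example, not details given in the text.

```python
from typing import List, Optional, Sequence


def fill_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest known samples."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("run contains no samples")
    out: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:          # gap at the start: copy the first known value
            out.append(float(values[right]))
        elif right is None:       # gap at the end: copy the last known value
            out.append(float(values[left]))
        else:
            t = (i - left) / (right - left)
            out.append(values[left] * (1 - t) + values[right] * t)
    return out


def resample(values: Sequence[float], target_len: int) -> List[float]:
    """Linearly interpolate `values` onto `target_len` evenly spaced points."""
    if target_len < 1 or not values:
        raise ValueError("need a non-empty run and a positive target length")
    if len(values) == 1 or target_len == 1:
        return [float(values[0])] * target_len
    step = (len(values) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), len(values) - 2)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


def align(runs: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Fill gaps, then stretch or shrink every run to the average length."""
    filled = [fill_missing(r) for r in runs]
    avg_len = max(1, round(sum(len(r) for r in filled) / len(filled)))
    return [resample(r, avg_len) for r in filled]
```

After `align`, all runs have the same number of samples, which makes pointwise statistics across runs well defined.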

From the sampled sensor data, the computing device 100 may determine a minimum value, a maximum value, and a middle value for each piece of equipment (S130).

Specifically, the computing device 100 may extract, as the minimum value for each piece of equipment, a first product obtained by multiplying that equipment's sensor data values by a first value. It may extract, as the maximum value for each piece of equipment, a second product obtained by multiplying that equipment's sensor data values by a second value greater than the first value.

For example, the computing device 100 may extract the value corresponding to 20% of each equipment's sensor data values as the minimum value and the value corresponding to 80% as the maximum value. However, it is not limited thereto.

In various embodiments, the computing device 100 may discard the top 20% and the bottom 20% of each equipment's sensor data values and then determine the minimum value, maximum value, and middle value for each piece of equipment from the remaining sampled sensor data.

In other words, the computing device 100 may perform scaling so that unstable values do not distort the minimum and maximum values.

Once the minimum value and the maximum value for each piece of equipment have been extracted, the computing device 100 may calculate the middle value for that equipment from them.

For example, the computing device 100 may calculate the middle value by dividing the sum of each equipment's minimum and maximum values by 2. That is, the middle value may be the average of the minimum and maximum values, but is not limited thereto.
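The sketch below shows one possible reading of this step, in which the "20%" and "80%" values are taken to be the 20th and 80th percentiles of an equipment's samples and the middle value is their average. The percentile interpretation, the linear interpolation rule, and the function names are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated q-th percentile, with 0 <= q <= 100."""
    if not values:
        raise ValueError("values must not be empty")
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def robust_min_max_mid(
    values: Sequence[float], low_q: float = 20, high_q: float = 80
) -> Tuple[float, float, float]:
    """Return (minimum, maximum, middle) that ignore extreme samples."""
    lo = percentile(values, low_q)
    hi = percentile(values, high_q)
    return lo, hi, (lo + hi) / 2  # middle value = average of minimum and maximum
```

With this reading, a single spike such as a value of 1000 among samples between 0 and 9 does not move the extracted maximum, which matches the stated goal of avoiding errors caused by unstable values.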

Once the minimum, maximum, and middle values for each piece of equipment have been determined, the computing device 100 may measure the similarity of the sensor data corresponding to each piece of equipment based on at least one of these values (S140).

Specifically, referring to FIG. 4, the computing device 100 may set the middle value determined from the third sensor data as the reference value (S141).

Once the reference value has been determined, the computing device 100 may obtain a first similarity value by measuring the similarity between the reference value and the first middle value determined from the first sensor data (S142). It may obtain a second similarity value by measuring the similarity between the reference value and the second middle value determined from the second sensor data (S143). It may obtain a third similarity value by measuring the similarity between the reference value and the minimum value extracted from the first sensor data (S144). It may obtain a fourth similarity value by measuring the similarity between the reference value and the maximum value extracted from the first sensor data (S145).

In other words, the first similarity value may be the similarity between the middle value (reference value) of the third sensor data, measured on each of one or more pieces of equipment that run the same recipe as the specific equipment, and the first middle value of the first sensor data, measured on the specific equipment after preventive maintenance.

The second similarity value may be the similarity between the middle value (reference value) of the third sensor data and the second middle value of the second sensor data, acquired on the specific equipment before preventive maintenance.

The third similarity value may be the similarity between the middle value (reference value) of the third sensor data and the minimum value of the first sensor data, acquired on the specific equipment after preventive maintenance.

The fourth similarity value may be the similarity between the middle value (reference value) of the third sensor data and the maximum value of the first sensor data, acquired on the specific equipment after preventive maintenance.
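The four comparisons of steps S142 to S145 can be summarized in a small sketch. It assumes that the reference, middle, minimum, and maximum values are each represented as a sequence of numbers, and it takes the comparison function as a parameter. The names `four_similarities` and `l1_distance` are hypothetical, and `l1_distance` is only a simple stand-in for whatever similarity measure is actually used.

```python
from typing import Callable, Dict, Sequence

Series = Sequence[float]


def l1_distance(a: Series, b: Series) -> float:
    """Stand-in measure for the sketch: sum of absolute pointwise differences."""
    return float(sum(abs(x - y) for x, y in zip(a, b)))


def four_similarities(
    reference: Series,    # middle value of the third sensor data (S141)
    first_mid: Series,    # middle value of the first (post-maintenance) data
    second_mid: Series,   # middle value of the second (pre-maintenance) data
    first_min: Series,    # minimum value of the first data
    first_max: Series,    # maximum value of the first data
    distance: Callable[[Series, Series], float] = l1_distance,
) -> Dict[str, float]:
    """Compare each profile with the reference, mirroring S142 to S145."""
    return {
        "first": distance(first_mid, reference),
        "second": distance(second_mid, reference),
        "third": distance(first_min, reference),
        "fourth": distance(first_max, reference),
    }
```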

In the present invention, the similarity measured between sensor data may refer to the similarity between the shapes of their amplitude changes (for example, their waveforms). This similarity may be determined with the dynamic time warping (DTW) algorithm. Dynamic time warping is an algorithm that measures the similarity between two time series whose behavior differs in speed or length. It matches points in the direction that minimizes distance and finds the warping path with the smallest cumulative distance.

In general, the Euclidean distance can be used to determine the similarity between two time series. The Euclidean distance determines similarity by computing distances between points at the same position on the time axis; for example, it matches 0 s with 0 s, 1 s with 1 s, and so on. However, as jitter and shifts grow, the Euclidean distance fails to find similarity in data whose overall patterns look alike, and it is difficult to use for evaluating similarity between short time series. The sensor process data used in the present invention may each have a different operating time. With the Euclidean distance, the accuracy of the similarity calculation can drop sharply when the time axis is distorted. That is, when the two series are not aligned, computing a meaningful distance becomes difficult, and so does evaluating similarity.

In various embodiments, when measuring the similarity between sensor data, the computing device 100 may instead use cosine similarity: it vectorizes both data sets, computes a distance from the angle between the vectors, and evaluates similarity on that basis. However, cosine similarity may not reflect the sequential order of time series data.
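The limitation mentioned above can be seen in a small sketch. The function below is a plain-Python cosine similarity (the name `cosine_similarity` is chosen for the example); applying the same reordering to both series leaves the score unchanged, so the measure carries no information about temporal order.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Reversing both series in time gives the same score as the original order.
a, b = [1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]
same_score = math.isclose(
    cosine_similarity(a, b), cosine_similarity(a[::-1], b[::-1])
)
```

The function also requires equal-length inputs, which is a second reason it fits poorly when runs have different durations.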

Accordingly, the present invention may use the dynamic time warping algorithm to measure the similarity between at least two data sets. Dynamic time warping evaluates the similarity of two waveforms whose time axes run at different speeds. It compares each point not only with the point at the same time position but also with neighboring time points, and matches it to the most similar element. An advantage of dynamic time warping is that it makes it possible to evaluate similarity between time series of different lengths.
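A minimal sketch of the classic dynamic-programming form of dynamic time warping is shown below, using the absolute difference as the pointwise cost. The function name `dtw_distance` and the cost choice are assumptions of the example. Note that the result is a distance, so a smaller value means the two series are more similar.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cumulative cost of the cheapest warping path between a and b."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances while b stays on the same point
                cost[i][j - 1],      # b advances while a stays on the same point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]
```

For instance, `dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])` is 0 because the second series is the first one played at half speed, whereas a pointwise Euclidean comparison could not even be applied to series of different lengths. The full table costs O(n·m) time and memory, which is acceptable for short per-wafer traces.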

In various embodiments, when the computing device 100 has obtained the first similarity value, the second similarity value, the third similarity value, and the fourth similarity value, it may calculate a plurality of normalization values by dividing each of them by the largest of the four values (S146).

Specifically, the computing device 100 may calculate the first normalization value by dividing the first similarity value by the largest value, the second normalization value by dividing the second similarity value by the largest value, the third normalization value by dividing the third similarity value by the largest value, and the fourth normalization value by dividing the fourth similarity value by the largest value.
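The normalization of step S146 amounts to dividing every similarity value by the largest one. A minimal sketch follows; the function name `normalize_by_max` and the sample values are hypothetical.

```python
def normalize_by_max(similarities):
    """Divide each similarity value by the largest value in the list."""
    largest = max(similarities)
    if largest == 0:
        raise ValueError("cannot normalize when the largest similarity is 0")
    return [value / largest for value in similarities]

# [first, second, third, fourth] similarity values (made-up numbers).
normalized = normalize_by_max([2.0, 4.0, 1.0, 3.0])
```

With these inputs the second similarity value is the largest, so the second normalization value is 1 and the others fall below 1.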

본 발명의 복수의 정규화 값은 이하에서 설명하는 단계(S150)에서 컴퓨팅 장치(100)가 특정 장비의 정상 여부를 결정하는데 이용될 수 있다.The plurality of normalization values of the present invention can be used by the computing device 100 to determine whether a specific device is normal in step S150, which will be described below.

다시 도3을 참조하면, 컴퓨팅 장치(100)는 유사도에 기초하여, 예방 정비가 수행된 특정 장비의 정상 여부를 결정할 수 있다(S150).Referring again to FIG. 3, the computing device 100 may determine whether the specific equipment on which preventive maintenance has been performed is normal based on the similarity (S150).

구체적으로, 컴퓨팅 장치(100)는 단계(S146)에서 유사도에 기반하여 산출된 복수의 정규화 값을 기초로 예방정비가 수행된 특정 장비의 정상 여부를 결정할 수 있다.Specifically, the computing device 100 may determine whether the specific equipment on which preventive maintenance has been performed is normal based on a plurality of normalization values calculated based on the similarity in step S146.

컴퓨팅 장치(100)는 제1 정규화 값이 1이 아닌 경우 예방정비가 수행된 특정 장비가 정상이라고 결정할 수 있다. 그리고, 컴퓨팅 장치(100)는 제1 정규화 값이 1인 경우, 예방정비가 수행된 특정 장비에서 불량이 발생될 것이라고 결정할 수 있다.If the first normalization value is not 1, the computing device 100 may determine that the specific equipment on which preventive maintenance has been performed is normal. And, if the first normalization value is 1, the computing device 100 may determine that a defect will occur in a specific piece of equipment on which preventive maintenance has been performed.

That is, when the first similarity value is the largest of the first, second, third, and fourth similarity values, the first similarity value is divided by itself and the first normalization value becomes 1. In this case, the computing device 100 may determine that a defect will occur in the specific equipment on which preventive maintenance has been performed.

For example, consider the second similarity value, which is computed between the 'second median value' of the second sensor data obtained from the specific equipment before preventive maintenance and the 'median value (reference value)' of the third sensor data measured at each of one or more pieces of equipment operating with the same recipe as the specific equipment. This value may be the highest, because it compares sensor data from a state in which the specific equipment and the one or more pieces of equipment were all operating normally. Preventive maintenance on the specific equipment, however, changes its sensor data. If the similarity between the median value of the changed data (that is, the first sensor data) and the median value of the third sensor data is the highest, it may be determined that a defect will occur in the specific equipment.

한편, 제1 유사도 값, 제2 유사도 값, 제3 유사도 값 및 제4 유사도 값 중 제1 유사도 값을 제외한 나머지 유사도 값이 가장 큰 경우, 제1 정규화 값은 1이 아닐 수 있다. 이경우, 컴퓨팅 장치(100)는 예방정비가 수행된 특정 장비가 정상이라고 결정할 수 있다.Meanwhile, when the remaining similarity values excluding the first similarity value among the first similarity value, the second similarity value, the third similarity value, and the fourth similarity value are the largest, the first normalization value may not be 1. In this case, computing device 100 may determine that the specific equipment on which preventive maintenance was performed is healthy.
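The decision of step S150 can be expressed as a check on the first normalization value. The sketch below is illustrative only: the function name `is_equipment_normal` is hypothetical, and the use of a floating-point tolerance instead of exact equality with 1 is an assumption.

```python
import math

def is_equipment_normal(normalized):
    """normalized: [first, second, third, fourth] normalization values.

    A first normalization value of 1 means the first similarity value was
    the largest, which the text treats as a sign of an upcoming defect.
    """
    first = normalized[0]
    return not math.isclose(first, 1.0)

defect_case = is_equipment_normal([1.0, 0.5, 0.3, 0.2])   # first is the largest
normal_case = is_equipment_normal([0.4, 1.0, 0.3, 0.2])   # another value is the largest
```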

Therefore, even when the sensor data measured at a piece of equipment is judged abnormal, the computing device 100 of the present invention can use the method described above with reference to FIGS. 5 and 6 to run a consistency check against sensor data measured at equipment operated with the same recipe, and thereby determine whether the abnormality judgment was caused by preventive maintenance or by an actual fault in the equipment.

In one embodiment, when the sensor data of each of one or more pieces of equipment is monitored using an anomaly detection model trained to detect equipment anomalies, an anomaly may be detected as a result of preventive maintenance. The computing device 100 of the present invention determines whether such a detection stems from preventive maintenance or from an actual fault in the equipment, which makes it possible to omit the manager's confirmation procedure and to increase the efficiency of equipment management.

다양한 실시예에서, 컴퓨팅 장치(100)는 예방정비가 수행된 장비에 대한 정상 유형 또는 불량 유형을 분류할 수 있다.In various embodiments, computing device 100 may classify equipment on which preventive maintenance has been performed as either a normal type or a defective type.

구체적으로, 컴퓨팅 장치(100)는 제1 정규화 값이 제3 정규화 값 및 제4 정규화 값보다 작은 경우 기 설정된 제1 정상 유형으로 분류할 수 있다. 여기서, 제1 정상 유형은 교체된 부품과 기존 부품 간 매칭의 특성(tool matching 특성)이 완전한 유형을 의미할 수 있다.Specifically, the computing device 100 may classify it as a preset first normal type when the first normalization value is smaller than the third normalization value and the fourth normalization value. Here, the first normal type may mean a type in which the matching characteristics (tool matching characteristics) between the replaced parts and the existing parts are complete.

In addition, the computing device 100 may classify the equipment as a preset second normal type when the first normalization value is greater than the third normalization value and the fourth normalization value and smaller than the second normalization value. Here, the second normal type may mean a type in which the tool matching characteristics have improved as a result of the part replacement but are not yet complete.

또한, 컴퓨팅 장치(100)는 제2 정규화 값이 제3 정규화 값 및 제4 정규화 값 보다 작은 경우 기 설정된 제1 불량 유형으로 분류할 수 있다. 여기서, 제1 불량 유형은 부품 교체로 기존에 잘 맞던 tool matching특성이 깨진 유형을 의미할 수 있다.Additionally, the computing device 100 may classify the defect as a preset first defect type when the second normalization value is smaller than the third normalization value and the fourth normalization value. Here, the first defect type may refer to a type in which the previously well-matched tool matching characteristics are broken due to replacement of parts.

또한, 컴퓨팅 장치(100)는 제2 정규화 값이 제3 정규화 값 및 제4 정규화 값 보다 큰 경우 기 설정된 제2 불량 유형으로 분류할 수 있다. 여기서, 제2 불량 유형은 기존에 잘 맞지 않았던 tool matching특성이 더욱 악화된 유형을 의미할 수 있다.Additionally, the computing device 100 may classify the defect as a preset second defect type when the second normalization value is greater than the third normalization value and the fourth normalization value. Here, the second defect type may mean a type in which the previously unsuitable tool matching characteristics have become worse.
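The four type conditions can be collected into one function. The text lists the conditions separately and does not say how they combine with the normal/defect decision, so the sketch assumes that the normal types are only considered when the first normalization value is not 1 and the defect types only when it is 1. The function name, the returned labels, and the `"unclassified"` fallback are hypothetical.

```python
def classify_type(n1, n2, n3, n4):
    """n1..n4: first to fourth normalization values."""
    is_defect = n1 == 1.0  # assumption: defect exactly when n1 equals 1
    if not is_defect:
        if n1 < n3 and n1 < n4:
            return "normal type 1"   # tool matching is complete
        if n1 > n3 and n1 > n4 and n1 < n2:
            return "normal type 2"   # improved, but not complete
    else:
        if n2 < n3 and n2 < n4:
            return "defect type 1"   # previously good matching was broken
        if n2 > n3 and n2 > n4:
            return "defect type 2"   # previously poor matching got worse
    return "unclassified"            # none of the listed conditions holds

label = classify_type(0.2, 1.0, 0.5, 0.6)
```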

본 발명의 컴퓨팅 장치(100)는 다양한 실시예에 따른 유형 분류 방법을 제공하여, 예방정비의 예후 관찰 및 장비 관리의 편의성을 높일 수 있다.The computing device 100 of the present invention can increase the convenience of observing the prognosis of preventive maintenance and managing equipment by providing a type classification method according to various embodiments.

도 5는 본 발명의 일 실시예에 따른 장비 이상 탐지 모델의 정확도를 유지하기 위한 방법의 일례를 설명하기 위한 흐름도이다. 도 6은 본 발명의 일 실시예에 따른 특정 장비에 대한 정합성 검사를 수행할지 여부를 결정하는 방법의 일례를 설명하기 위한 흐름도다.Figure 5 is a flowchart illustrating an example of a method for maintaining the accuracy of an equipment abnormality detection model according to an embodiment of the present invention. Figure 6 is a flowchart illustrating an example of a method for determining whether to perform a consistency check for specific equipment according to an embodiment of the present invention.

도 5를 참조하면, 컴퓨팅 장치(100)는 장비의 이상을 탐지하기 위해 학습된 이상 탐지 모델을 이용하여 하나 이상의 장비 각각의 센서 데이터를 모니터링할 수 있다(S210).Referring to FIG. 5, the computing device 100 may monitor sensor data of each of one or more devices using a learned anomaly detection model to detect anomalies in the equipment (S210).

구체적으로, 컴퓨팅 장치(100)는 공정 센서 데이터 및 공정 센서 데이터에 대응하는 사전 지식 데이터를 획득할 수 있다. 또한, 컴퓨팅 장치(100)는 딥러닝 모델(즉, 이상 탐지 모델)을 활용하여 공정 센서 데이터에 대응하는 재건 공정 센서 데이터를 생성할 수 있다. 또한, 컴퓨팅 장치(100)는 공정 센서 데이터 및 재건 공정 센서 데이터에 기초하여 재건율 오차를 산출할 수 있다. 그리고, 컴퓨팅 장치(100)는 재건율 오차와 기준 임계값의 비교에 기초하여 비정상 동작을 감지할 수 있다.Specifically, the computing device 100 may acquire process sensor data and prior knowledge data corresponding to the process sensor data. Additionally, the computing device 100 may utilize a deep learning model (i.e., an anomaly detection model) to generate reconstructed process sensor data corresponding to the process sensor data. Additionally, the computing device 100 may calculate a reconstruction rate error based on process sensor data and reconstruction process sensor data. Additionally, the computing device 100 may detect abnormal operation based on comparison of the reconstruction rate error and the reference threshold.

Here, the deep learning model may include a first sub-model that extracts feature information corresponding to the process sensor data, a second sub-model that extracts interaction information among the process sensor data based on the process sensor data and the prior knowledge data, an attention module that combines the outputs of the first sub-model and the second sub-model to generate feature information, and a dimension restoration model that restores the feature information to generate the reconstructed process sensor data.

일 실시예에서, 딥러닝 모델의 학습에 활용되는 학습용 공정 센서 데이터는, 정상에 관련한 센서 데이터만을 포함할 수 있다. 다시 말해, 학습용 공정 센서 데이터는 비정상 동작에 관련한 센서 데이터를 포함하지 않을 수 있다.In one embodiment, process sensor data for learning used to learn a deep learning model may include only sensor data related to normality. In other words, process sensor data for learning may not include sensor data related to abnormal operations.

즉, 딥러닝 모델은, 다년간 축적된 데이터를 기반으로 학습됨에 따라, 기존 축적된 공정 센서 데이터들과 유사한 공정 센서 데이터가 입력되는 경우, 입력에 관련한 공정 센서 데이터와 유사한 재건 공정 센서 데이터를 출력할 수 있으며, 기존 축적된 공정 센서 데이터들과 유사하지 않은 공정 센서 데이터가 입력되는 경우, 입력에 관련한 공정 센서 데이터와 유사하지 않은 재건 공정 센서 데이터를 출력할 수 있다.In other words, as the deep learning model is learned based on data accumulated over many years, when process sensor data similar to existing accumulated process sensor data is input, it can output reconstructed process sensor data similar to the process sensor data related to the input. In addition, when process sensor data that is not similar to existing accumulated process sensor data is input, reconstructed process sensor data that is not similar to the process sensor data related to the input can be output.

In addition, the computing device 100 may calculate a reconstruction rate error between the process sensor data and the reconstructed process sensor data output for it. Specifically, the computing device 100 calculates a larger reconstruction rate error the larger the difference between the input process sensor data and the output reconstructed process sensor data, and a smaller reconstruction rate error the smaller that difference. That is, the reconstruction rate error is calculated from the difference between the input of the deep learning model (the process sensor data) and its output (the reconstructed process sensor data).

In addition, the computing device 100 may detect abnormal operation by comparing the calculated reconstruction rate error with a reference threshold. In one embodiment, the reference threshold is obtained during training of the deep learning model and is determined based on the largest of the reconstruction rate errors associated with the individual training data. For example, given 100,000 pieces of process sensor data acquired over the past three years and the reconstructed process sensor data corresponding to each of them, the reference threshold may be determined based on the largest of the resulting reconstruction rate errors. In other words, the reference threshold may be determined based on the reconstruction rate error of the process sensor data that was reconstructed least well over those years (that is, the process sensor data with the largest reconstruction rate error). This reference threshold serves as the criterion for detecting abnormal operation. In a specific embodiment, the computing device 100 may judge the operation normal when the reconstruction rate error is less than or equal to the reference threshold, and abnormal when the reconstruction rate error exceeds the reference threshold.

다시 말해, 컴퓨팅 장치(100)는 공정 센서 데이터에 기반하여 재건 공정 센서 데이터를 생성하며, 공정 센서 데이터와 재건 공정 센서 데이터의 비교에 기초하여 재건율 오차를 산출할 수 있다. 컴퓨팅 장치(100)는 재건율 오차를 기준 임계값과 비교하여, 재건율 오차가 기준 임계값을 초과하는 경우, 설비 공정 과정에서 비정상 동작이 발생하였다고 판별할 수 있다. 예컨대, 공정 센서 데이터가 딥러닝 모델의 학습에 활용된 학습 데이터들(즉, 다년간 축적된 공정 센서 데이터들)과 유사한 경우, 재건율 오차는 적게 산출될 수 있다. 이와 반대로, 공정 센서 데이터가 딥러닝 모델의 학습에 활용된 학습 데이터들과 유사하지 않은 경우, 재건율 오차가 크게 산출될 수 있다. 즉, 컴퓨팅 장치(100)는 재건율 오차가 다년간 축적된 데이터들을 기반으로 산출된 기준 임계값 보다 큰 경우, 기존의 정상 상황에서 획득된 데이터 유형과 상이한 데이터 유형이 발생(즉, 과거에 한 번도 경험하지 못했던 유형의 센서 데이터가 감지)된 것으로 판별하여 비정상 상황으로 판별할 수 있다.In other words, the computing device 100 may generate reconstruction process sensor data based on process sensor data and calculate a reconstruction rate error based on comparison of the process sensor data and reconstruction process sensor data. The computing device 100 may compare the reconstruction rate error with a reference threshold and, if the reconstruction rate error exceeds the reference threshold, determine that an abnormal operation has occurred during the facility process. For example, if the process sensor data is similar to the learning data used to learn the deep learning model (i.e., process sensor data accumulated over many years), the reconstruction rate error may be calculated to be small. Conversely, if the process sensor data is not similar to the training data used to learn the deep learning model, the reconstruction rate error may be calculated to be large. That is, when the reconstruction rate error is greater than the reference threshold calculated based on data accumulated over many years, the computing device 100 generates a data type that is different from the data type obtained in an existing normal situation (i.e., has never been used in the past). It can be determined that sensor data of a type that has not been experienced has been detected and thus determined to be an abnormal situation.
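The thresholding logic described above can be sketched independently of the deep learning model. The text only says the error is based on the difference between input and output, so the use of mean squared error is an assumption, as are the function names; the reconstructed series would come from the model and are supplied directly here.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared difference between input and reconstruction (assumed metric)."""
    if len(original) != len(reconstructed):
        raise ValueError("input and reconstruction must have the same length")
    squared = [(x - y) ** 2 for x, y in zip(original, reconstructed)]
    return sum(squared) / len(squared)

def fit_reference_threshold(training_pairs):
    """Largest reconstruction error observed over the (normal) training data."""
    return max(reconstruction_error(x, x_hat) for x, x_hat in training_pairs)

def is_abnormal(original, reconstructed, threshold):
    """Abnormal only when the error exceeds the threshold; equal counts as normal."""
    return reconstruction_error(original, reconstructed) > threshold

# Hypothetical (input, reconstruction) pairs from normal operation.
training_pairs = [
    ([0.0, 0.0], [0.0, 0.0]),   # error 0.0
    ([1.0, 1.0], [0.0, 1.0]),   # error 0.5
]
threshold = fit_reference_threshold(training_pairs)
```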

본 발명에서 사전 지식 데이터는, 복수의 센서 간의 연관 관계에 대한 정보를 포함할 수 있다. 예를 들어, 사전 지식 데이터는 반도체 생산 과정에서, 특정 공정 단계에 대응하여 온도를 측정하는 제1센서와 동일한 공정 단계에 압력을 측정하는 제2센서가 상호 연관이 있다는 정보를 포함할 수 있다. 즉, 사전 지식 데이터는, 특정 공정 단계에서의 특정 센서 데이터가 다른 센서 데이터에 영향을 줄 수 있는 정보, 즉 상호 연관성에 관련한 정보를 포함할 수 있다. 실시예에서, 사전 지식 데이터는, 그래프 구조의 형태로 구성될 수 있다.In the present invention, prior knowledge data may include information about the correlation between a plurality of sensors. For example, the prior knowledge data may include information that, in the semiconductor production process, a first sensor that measures temperature in response to a specific process step and a second sensor that measures pressure in the same process step are correlated. In other words, prior knowledge data may include information that specific sensor data in a specific process step can affect other sensor data, that is, information related to interrelationship. In embodiments, prior knowledge data may be structured in the form of a graph structure.
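One simple way to hold prior knowledge data in graph form is an undirected adjacency map between sensors. This is only an illustration: the function name `build_sensor_graph` and the sensor identifiers are hypothetical, and the specification does not describe the actual graph representation.

```python
from collections import defaultdict

def build_sensor_graph(related_pairs):
    """Build an undirected adjacency map from pairs of related sensors."""
    graph = defaultdict(set)
    for sensor_a, sensor_b in related_pairs:
        graph[sensor_a].add(sensor_b)
        graph[sensor_b].add(sensor_a)
    return dict(graph)

# Hypothetical example: temperature and pressure sensors of the same
# process step are known to influence each other.
prior_knowledge = build_sensor_graph([
    ("step3/temperature", "step3/pressure"),
    ("step3/pressure", "step3/gas_flow"),
])
```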

이러한 공정 센서 데이터 및 사전 지식 데이터는 생산 설비 공정 과정에서 획득될 수 있으며, 비정상 동작 감지에 활용될 수 있다. 비정상 동작이란, 수율을 저해하는 다양한 공정 상황들이나, 설비 고장에 관련한 비정상 동작이나 불량 조건 등이 감지되는 상황을 포함할 수 있다.These process sensor data and prior knowledge data can be obtained during the production facility process and can be used to detect abnormal behavior. Abnormal operation may include various process situations that impede yield, or situations in which abnormal operation or defective conditions related to equipment failure are detected.

According to one embodiment, a semiconductor production process consists of hundreds of production steps, and the yield of the produced wafers can be measured by a final inspection at the last step. Wafer yield may mean the proportion of good chips among all chips on the wafer. For example, if every chip is good, the yield of that wafer is 100%. In general, when wafer yield is low in a semiconductor process, the process sensor data related to production can be analyzed. For example, the process sensor data can show whether problems occur frequently in wafers that passed through a particular facility, or under particular production conditions. Each of the hundreds of steps in semiconductor production has its own production conditions, and failing to meet them can affect wafer yield. For example, a particular facility may require the wafer to reach a set temperature within a certain time after it is loaded; that condition may not be met in the actual process, and this can be identified from the process sensor data acquired at that step. Analyzing the hundreds of process sensor data streams acquired in real time during the process can therefore be very helpful in solving yield problems.

In addition, in one embodiment, preventive maintenance can be very important in semiconductor processing. Preventive maintenance may mean an analysis process that aims to prevent the entire facility from being shut down by identifying and resolving abnormal signals or defective conditions before the facility breaks down completely. In semiconductor processing, for example, defects smaller than 20 nm caused by equipment problems must be found on a 300 mm wafer, so finding defects can be very difficult. As technology has advanced, circuit line widths have become ever narrower, which can make defects even harder to detect. As a specific example, a semiconductor process may typically include about 500 processes and about 1,000 measurement steps. If even a small problem occurs in the equipment during such a process, it can cause large losses, such as having to discard all of the wafers. To prevent such problems, it can be very important to collect and analyze the data generated by process sensors in semiconductor facilities. The computing device 100 of the present invention uses artificial intelligence to acquire and analyze the process sensor data generated during semiconductor processing, thereby automating the detection of abnormal operation. Detecting abnormal operation with artificial intelligence can minimize the time required to detect abnormal data and, even for process sensor data from complex equipment, can identify the cause of the abnormal state, which is convenient for operators.

When the anomaly detection model detects abnormal data in specific sensor data obtained from specific equipment, the computing device 100 may obtain preventive maintenance information related to the specific equipment and perform a consistency check on the specific equipment (S220).

구체적으로, 도 6을 참조하면, 컴퓨팅 장치(100)는 예방정비 정보를 기초로 특정 장비에서 예방정비가 수행되었는지 여부를 인식할 수 있다(S211).Specifically, referring to FIG. 6, the computing device 100 may recognize whether preventive maintenance has been performed on specific equipment based on preventive maintenance information (S211).

구체적으로, 컴퓨팅 장치(100)는 특정 장비와 관련된 로그 데이터로부터 예방정비 정보를 추출하고, 이를 기초로 특정 장비에서 예방정비가 수행되었는지 여부를 인식할 수 있다. 다만, 이에 한정되는 것은 아니고, 컴퓨팅 장치(100)는 관리자 단말로부터 특정 장비에 대한 예방정비 수행 여부를 수신할 수도 있다.Specifically, the computing device 100 may extract preventive maintenance information from log data related to specific equipment and, based on this, recognize whether preventive maintenance has been performed on the specific equipment. However, the present invention is not limited to this, and the computing device 100 may receive information about whether to perform preventive maintenance on specific equipment from an administrator terminal.

한편, 컴퓨팅 장치(100)는 특정 장비에서 예방정비가 수행되었다고 인식한 경우, 정합성 검사를 수행할 수 있다(S222).Meanwhile, when the computing device 100 recognizes that preventive maintenance has been performed on specific equipment, it may perform a consistency check (S222).

구체적으로, 컴퓨팅 장치(100)는 예방정비가 수행된 특정 장비 및 특정 장비와 동일한 레시피로 작동되는 하나 이상의 장비 각각의 센서 데이터를 획득할 수 있다. 또한, 컴퓨팅 장치(100)는 센서 데이터를 샘플링하여, 각 장비에서 획득된 센서 데이터에 대한 얼라인을 수행할 수 있다. 또한, 컴퓨팅 장치(100)는 샘플링한 센서 데이터에서 장비 별 최소 값, 최대 값 및 중간 값을 결정할 수 있다. 또한, 컴퓨팅 장치(100)는 장비 별 상기 최소 값, 최대 값 및 중간 값 중 적어도 하나를 기초로 각 장비에 대응하는 센서 데이터의 유사도를 측정할 수 있다. 그리고, 컴퓨팅 장치(100)는 유사도에 기초하여, 상기 예방정비가 수행된 상기 특정 장비의 정상 여부를 결정할 수 있다.Specifically, the computing device 100 may acquire sensor data for each of the specific equipment on which preventive maintenance has been performed and one or more equipment operated with the same recipe as the specific equipment. Additionally, the computing device 100 may sample sensor data and perform alignment on sensor data acquired from each device. Additionally, the computing device 100 may determine the minimum, maximum, and median values for each device from the sampled sensor data. Additionally, the computing device 100 may measure the similarity of sensor data corresponding to each device based on at least one of the minimum value, maximum value, and median value for each device. Additionally, the computing device 100 may determine whether the specific equipment on which the preventive maintenance has been performed is normal based on the similarity.
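The sampling, alignment, and min/max/median steps can be sketched as follows. Everything here is an assumption made for illustration: linear interpolation onto a fixed number of points stands in for the alignment step, and the minimum, maximum, and median are taken per time step across several runs of one piece of equipment, which is only one possible reading of the text. The function names are hypothetical.

```python
import statistics

def resample(series, length):
    """Linearly interpolate `series` onto `length` evenly spaced points."""
    if length < 2 or len(series) < 2:
        raise ValueError("need at least two points on both sides")
    scale = (len(series) - 1) / (length - 1)
    out = []
    for k in range(length):
        pos = k * scale
        lo = int(pos)
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out

def envelope(runs, length):
    """Align runs to a common length, then take per-step min, max, and median."""
    aligned = [resample(run, length) for run in runs]
    columns = list(zip(*aligned))  # values of all runs at each time step
    return {
        "min": [min(col) for col in columns],
        "max": [max(col) for col in columns],
        "median": [statistics.median(col) for col in columns],
    }

# Three hypothetical runs of the same recipe on one piece of equipment.
summary = envelope([[0.0, 2.0], [1.0, 3.0], [2.0, 4.0]], length=2)
```

The resulting median trace of each piece of equipment could then be passed to a similarity measure such as dynamic time warping.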

단계(S222)에서 컴퓨팅 장치(100)가 정합성 검사를 수행하는 방법은 도 5 및 도 6을 참조하여 상술한 방법들과 동일한 방법으로 수행될 수 있으므로 중복되는 설명은 생략한다.The method by which the computing device 100 performs the consistency check in step S222 may be performed in the same manner as the methods described above with reference to FIGS. 5 and 6, and thus redundant description will be omitted.

다른 한편, 컴퓨팅 장치(100)는 단계(S211)에서, 특정 장비에서 예방정비를 수행하지 않은 것으로 인식한 경우, 특정 장비에 이상이 발생된 것으로 인식할 수 있다(S223).On the other hand, if the computing device 100 recognizes that preventive maintenance has not been performed on the specific equipment in step S211, it may recognize that a problem has occurred in the specific equipment (S223).

이 경우, 컴퓨팅 장치(100)는 특정 장비와 관련된 로그 데이터에 특정 장비에서 발생된 이상을 기록하고, 해당 내용을 관리자 단말로 전송할 수 있다.In this case, the computing device 100 may record anomalies occurring in a specific device in log data related to the specific device and transmit the corresponding information to the administrator terminal.

Referring again to FIG. 5, based on the consistency check, the computing device 100 may recognize whether the abnormal data detected by the anomaly detection model was detected because of preventive maintenance or because of a defect in the specific equipment (S230).

구체적으로, 컴퓨팅 장치(100)는 정합성 검사를 수행함에 따라 특정 장비가 정상이라고 결정한 경우 이상 탐지 모델이 탐지한 이상 데이터가 예방정비에 의해 탐지된 것이라고 인식할 수 있다. 또한, 컴퓨팅 장치(100)는 정합성 검사를 수행함에 따라 특정 장비가 정상이 아니라고 결정한 경우 이상 탐지 모델이 탐지한 이상 데이터가 특정 장비의 불량으로 인해 탐지된 것이라고 인식할 수 있다.Specifically, when the computing device 100 determines that a specific piece of equipment is normal as it performs a consistency check, it may recognize that the abnormal data detected by the anomaly detection model was detected through preventive maintenance. Additionally, when the computing device 100 determines that a specific device is not normal while performing a consistency check, the computing device 100 may recognize that the abnormal data detected by the anomaly detection model was detected due to a defect in the specific device.
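The branching from the preventive maintenance lookup through the consistency check to the final attribution can be summarized in a short sketch. The names `Cause` and `diagnose` are hypothetical, and the consistency check is passed in as a callable that returns `True` when the equipment is judged normal.

```python
from enum import Enum

class Cause(Enum):
    PREVENTIVE_MAINTENANCE = "preventive_maintenance"
    EQUIPMENT_FAULT = "equipment_fault"

def diagnose(pm_performed, run_consistency_check):
    """Attribute a detected anomaly to preventive maintenance or to a fault."""
    if not pm_performed:
        # No preventive maintenance on record: treat it as an equipment fault (S223).
        return Cause.EQUIPMENT_FAULT
    # Preventive maintenance was performed: run the consistency check (S222).
    if run_consistency_check():
        # Equipment judged normal, so the anomaly came from maintenance (S230).
        return Cause.PREVENTIVE_MAINTENANCE
    return Cause.EQUIPMENT_FAULT

result = diagnose(pm_performed=True, run_consistency_check=lambda: True)
```

Only the `PREVENTIVE_MAINTENANCE` outcome would lead on to a model update; a fault would instead be logged and reported to the administrator terminal.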

When the computing device 100 recognizes that the abnormal data detected by the anomaly detection model was detected because of preventive maintenance, it may update the anomaly detection model based on the abnormal data (S240).

Specifically, the computing device 100 may obtain normalized data by scaling the abnormal data. Then, while keeping the feature extraction layer of the anomaly detection model unchanged, the computing device 100 may apply transfer learning to the fully connected layer using the normalized data.

일 실시예에서, 예방정비에 의해 센서 데이터가 변경되더라도, 이상 탐지 모델이 데이터에서 추출하는 특징은 동일할 수 있다. 즉, 컴퓨팅 장치(100)는 센서 데이터에서 특징점을 찾아내는 특징 추출 레이어를 유지하여, 불필요한 학습 과정을 생략하고, 이상 탐지의 임계점과 관련된 완전 연결 레이어에 대한 전이학습만 수행할 수 있다.In one embodiment, even if sensor data changes due to preventive maintenance, the features that the anomaly detection model extracts from the data may be the same. That is, the computing device 100 maintains a feature extraction layer that finds feature points in sensor data, omits unnecessary learning processes, and can only perform transfer learning on the fully connected layer related to the critical point of anomaly detection.

For example, when applying transfer learning to the fully connected layer, the computing device 100 may compare a first maximum value that the anomaly detection model uses to detect anomalies with a second maximum value of the normalized data. If the second maximum value is greater than the first maximum value, the computing device 100 may change the first maximum value used by the anomaly detection model to the second maximum value.
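The comparison of the two maximum values reduces to a one-directional update. This sketch covers only that comparison, not the transfer learning of the fully connected layer; the function name and sample values are hypothetical, and the normalized data is assumed to be on the same scale as the model's current maximum.

```python
def update_detection_maximum(first_max, normalized_data):
    """Raise the detection maximum if the new normalized data exceeds it."""
    second_max = max(normalized_data)
    if second_max > first_max:
        return second_max   # widen the range treated as normal
    return first_max        # otherwise keep the existing maximum

widened = update_detection_maximum(0.8, [0.2, 0.95, 0.5])
unchanged = update_detection_maximum(0.8, [0.1, 0.7])
```

Because the value can only grow, repeated updates widen the accepted range over time.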

상술한 바와 같이, 본 발명의 이상 탐지 모델은 이상을 탐지하기 위한 데이터의 범위가 증가하도록 업데이트될 수 있다. 이 경우, 업데이트가 지속됨에 따라 이상을 탐지하기 위한 범위가 지속적으로 증가하여 이상 탐지에 대한 정확도가 낮아질 수 있다. 이를 방지하기 위해 컴퓨팅 장치(100)는 기 설정된 주기 마다 이상 탐지 모델을 초기 학습된 모델로 초기화시킬 수 있다.As described above, the anomaly detection model of the present invention can be updated to increase the range of data for detecting anomalies. In this case, as updates continue, the range for detecting anomalies continues to increase, which may lower the accuracy of detecting anomalies. To prevent this, the computing device 100 may initialize the anomaly detection model to the initially learned model at each preset cycle.

따라서, 본 발명의 컴퓨팅 장치(100)는 전이학습을 통한 지속적인 업데이트와 주기적 초기화(초기 모델로 복원)를 수행하여 초기 모델을 학습시키기 위해 소모되는 시간과 비용을 절감하면서, 이상 탐지 모델의 정확도를 유지시킬 수 있다.Therefore, the computing device 100 of the present invention performs continuous updates and periodic initialization (restore to the initial model) through transfer learning, thereby reducing the time and cost required to learn the initial model and improving the accuracy of the anomaly detection model. It can be maintained.
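The combination of ongoing updates and periodic restoration of the initial model can be sketched with a small wrapper. The class name `MaintainedModel`, the dictionary used as model state, and counting the preset period in abstract ticks are all assumptions for illustration.

```python
import copy

class MaintainedModel:
    """Keeps a pristine copy of the initial model and restores it periodically."""

    def __init__(self, initial_state, reset_period):
        self._initial = copy.deepcopy(initial_state)
        self.state = copy.deepcopy(initial_state)
        self.reset_period = reset_period
        self._ticks = 0

    def update(self, new_state):
        """Replace the working state, e.g. after a transfer learning update."""
        self.state = new_state

    def tick(self):
        """Advance one time unit; restore the initial model once per period."""
        self._ticks += 1
        if self._ticks >= self.reset_period:
            self.state = copy.deepcopy(self._initial)
            self._ticks = 0
            return True
        return False

model = MaintainedModel({"detection_max": 0.8}, reset_period=3)
model.update({"detection_max": 0.95})   # range widened by an update
```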

FIGS. 7 to 9 are diagrams for explaining an example of a method of performing alignment on sensor data according to an embodiment of the present invention.

According to an embodiment of the present invention, the computing device 100 may perform alignment (or preprocessing) on the sensor data used for the sensor-data-based consistency check between a plurality of pieces of manufacturing process equipment, for training the anomaly detection model, and for maintaining the accuracy of the anomaly detection model.

The following description gives an example of how the computing device 100 aligns data in order to train the anomaly detection model (i.e., to generate training data for the anomaly detection model). However, the present invention is not limited thereto, and the same method may be applied to the alignment of sensor data used for sensor-data-based consistency checks between a plurality of manufacturing process equipment or for maintaining the accuracy of the anomaly detection model.

일 실시예에 따르면, 컴퓨팅 장치(100)는 비정상 공정을 감지하는 인공지능 모델(즉, 이상 탐지 모델)을 학습시키기 위한 학습 데이터 전처리를 수행할 수 있다.According to one embodiment, the computing device 100 may perform preprocessing of learning data to train an artificial intelligence model (i.e., an anomaly detection model) that detects an abnormal process.

구체적으로, 컴퓨팅 장치(100)가 수행하는 데이터 전처리 방법은 복수의 센서 공정 데이터를 포함하는 로우 데이터를 획득하는 단계를 포함할 수 있다.Specifically, the data preprocessing method performed by the computing device 100 may include acquiring raw data including a plurality of sensor process data.

In one embodiment, the process sensor data may include various types of data obtained at industrial sites. The process sensor data may include per-second sensor data generated during a process.

For example, a production facility may generate hundreds of sensor readings every second. A semiconductor production facility is equipped with sensors that detect temperature, pressure, the input amounts of various chemicals, and the like, and process sensor data may be obtained in real time through these sensors. In other words, process sensor data may refer to sensor data acquired in real time through a plurality of sensors based on the operation of semiconductor process equipment. Process sensor data may include the operating parameters of various devices used for wafer manufacturing in a semiconductor fab and various sensor data acquired through device operation. For example, process sensor data may include lot equipment history data from an MES (manufacturing execution system), data from equipment interface data sources, processing tool recipes, processing tool test data, probe test data, electrical test data, combined measurement data, diagnostic data, remote diagnostic data, post-processing data, and the like, but the present invention is not limited thereto.

일 실시예에서, 복수의 센서 공정 데이터는, 시간의 흐름에 따라 변화하는 센서 값에 대한 정보를 포함할 수 있다.In one embodiment, the plurality of sensor process data may include information about sensor values that change over time.

In one embodiment, acquiring the raw data may mean receiving or loading data stored in the memory 120. Acquiring the raw data may also mean receiving or loading the raw data, via wired or wireless communication means, from another storage medium, another computing device, or a separate processing module within the same computing device. In a specific embodiment, the raw data may be received from a production device or through an external server.

The data preprocessing method performed by the computing device 100 may include performing preprocessing on the raw data. In one embodiment, the preprocessing of the raw data may serve to express the plurality of sensor process data in a form that a neural network can use. For example, since machine learning systems generally use tensors, which are multidimensional arrays, as their basic data structure, a preprocessing step that converts the sensor process data into tensor form may be performed so that the data can be used in the artificial intelligence model of the present invention. In one embodiment, the present invention may generate three-dimensional tensor data corresponding to the sensor process data through preprocessing of the sensor process data.

일 실시예에서, 컴퓨팅 장치(100)는 학습 데이터 세트를 구축하는 경우, 로우 데이터에 대한 리샘플링을 수행할 수 있다.In one embodiment, the computing device 100 may perform resampling on raw data when constructing a learning data set.

컴퓨팅 장치(100)에 의해 수행되는 리샘플링은 로우 데이터에 포함된 센서 공정 데이터들 간의 샘플링 타임을 일치시키기 위한 것으로, 일정한 시간 간격을 기준으로 센서 공정 데이터를 추출하는 것을 특징으로 할 수 있다. 컴퓨팅 장치(100)는 센서 공정 데이터의 샘플링 에러를 보정하기 위하여 일정한 시간 간격으로 다시 센서 공정 데이터를 추출할 수 있다.Resampling performed by the computing device 100 is intended to match sampling times between sensor process data included in raw data, and may be characterized by extracting sensor process data based on regular time intervals. The computing device 100 may extract the sensor process data again at regular time intervals to correct sampling errors of the sensor process data.

구체적으로, 도 7의 (a)을 참조하면, 각 웨이퍼의 대응하는 동작 시간이 상이할 수 있다. wafer 1 및 wafer 2 각각은 서로 동작 시간이 상이할 수 있다. 구체적으로, wafer 1 및 wafer 2 각각의 동작 시간을 살펴보면, 동작 시간이 일정 간격을 가지지 않는 것을 확인할 수 있다. 이에 따라, 컴퓨팅 장치(100)는 동작 시간을 일정한 시간 간격을 갖도록 리샘플링을 수행할 수 있다.Specifically, referring to (a) of FIG. 7, the corresponding operation time of each wafer may be different. Wafer 1 and wafer 2 may each have different operating times. Specifically, looking at the operation times of wafer 1 and wafer 2, it can be seen that the operation times do not have a constant interval. Accordingly, the computing device 100 may perform resampling so that the operation time has constant time intervals.

As a specific example, referring to (b) of FIG. 7, the computing device 100 may re-extract the sensor process data for wafer 1 at regular time intervals (1 second). As shown in (b) of FIG. 7, the sensor process data corresponding to wafer 1 may be resampled to have a constant time interval (1 second), that is, to include the operation times 21:20:31, 21:20:32, and 21:20:35 to 21:20:37.
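Resampling onto a 1-second grid can be sketched as below. The figure itself is not reproduced here, so the sample values are hypothetical; only the time stamps follow the wafer 1 example. Timestamps are expressed as seconds past 21:20:00, and `resample_to_grid` is an illustrative helper name.

```python
def resample_to_grid(samples, start, end, step=1):
    """Place irregular samples on a regular grid; grid points without a sample get None."""
    return [(t, samples.get(t)) for t in range(start, end + 1, step)]


# Hypothetical readings for one sensor of wafer 1, keyed by seconds past 21:20:00.
raw = {30: 0, 33: 1, 34: 1, 38: 0, 39: 0, 40: 1}
grid = resample_to_grid(raw, start=30, end=40)
missing = [t for t, value in grid if value is None]
```

Here `missing` lists the newly created time points (31, 32, 35, 36, 37) whose values still have to be filled in by interpolation.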

In one embodiment, when constructing a training data set, the computing device 100 may perform interpolation on the resampled raw data.

The interpolation performed by the computing device 100 may fill in the missing values that are generated for each sensor process data as a result of resampling. That is, as shown in (b) of FIG. 7, resampling may produce missing values (N/A) for each sensor at certain time points. In this case, the computing device 100 may perform interpolation by entering sensor values in place of the missing values. In one embodiment, the computing device 100 may estimate the sensor value for a missing value from the previous value. That is, as shown in (c) of FIG. 7, the missing values at 21:20:31 and 21:20:32 may be filled with 0, 0 based on the previous values 0, 0, and the missing values at 21:20:35 to 21:20:37 may be filled with 1, 1 based on the previous values 1, 1. As described above, the computing device 100 may perform resampling so that all sensor process data have the same time interval, and may interpolate the missing values that result from the resampling.
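Filling a missing value from the previous value is commonly called forward filling. A minimal sketch, assuming missing values are represented as `None` (an assumption of this example, the specification writes N/A):

```python
def forward_fill(values):
    """Replace each None with the most recent non-missing value before it."""
    filled = []
    last = None
    for value in values:
        if value is None:
            value = last  # stays None if no earlier value exists
        filled.append(value)
        last = value
    return filled
```

A leading missing value has no predecessor and is left as `None` in this sketch; how such a case should be handled is not stated in the specification.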

일 실시예에서, 컴퓨팅 장치(100)는 학습 데이터 세트를 구축하는 경우, 보간이 수행된 로우 데이터에 대한 패딩을 수행할 수 있다.In one embodiment, when constructing a learning data set, the computing device 100 may perform padding on interpolated raw data.

In the padding performed by the computing device 100, average process time information may be calculated based on the process time information of the sensor process data included in the raw data.

Specifically, the sensor process data may cover different operating periods. Referring to (a) of FIG. 7, wafer 1 operates for 10 seconds, from 21:20:30 to 21:20:40, whereas wafer 2 operates for 26 seconds, from 21:20:42 to 21:21:08. In other words, the operating time (process time) differs between the sensor process data. In this case, the computing device 100 may calculate the average process time information as the mean of the process times of wafer 1 and wafer 2. In this specific example, the average process time for wafer 1 and wafer 2 is calculated as 18 seconds ((10 + 26) / 2).
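The arithmetic of this example can be checked with a short sketch. The helper name `process_seconds` and the `HH:MM:SS` string format are assumptions made for illustration.

```python
from datetime import datetime


def process_seconds(start, end, fmt="%H:%M:%S"):
    """Return the process time in whole seconds between two same-day clock times."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


wafer1 = process_seconds("21:20:30", "21:20:40")  # 10 seconds
wafer2 = process_seconds("21:20:42", "21:21:08")  # 26 seconds
average_process_time = (wafer1 + wafer2) / 2      # 18.0 seconds
```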

또한, 컴퓨팅 장치(100)가 패딩을 수행하는 단계에서, 평균 공정 시간 정보에 대응하여 각 센서 공정 데이터들의 공정 시간을 보정하는 단계를 수행할 수 있다.Additionally, in the step of performing padding, the computing device 100 may perform a step of correcting the process time of each sensor process data in response to the average process time information.

In one embodiment, the computing device 100 may adjust the operating time of each sensor process data to match the average process time information. For example, for wafer 1, whose operating time of 10 seconds is shorter than the average process time, the operating time may be extended to match the average process time, and for wafer 2, whose operating time of 26 seconds is longer than the average process time, the operating time may be shortened to match the average process time.

일 실시예에서, 컴퓨팅 장치(100)는 작동 시간이 평균 공정 시간에 대응하여 늘어난 센서 공정 데이터의 경우, 시간 증가에 따라 발생하는 결측값을 이전 값에 기초하여 보충할 수 있다.In one embodiment, in the case of sensor process data whose operating time has increased in response to the average process time, the computing device 100 may supplement missing values that occur as time increases based on previous values.

자세한 예를 들어, 도 8의 (a)와 같은 wafer 1에 관련한 공정 센서 데이터는, 작동 시간이 10초로, 다른 공정 센서 데이터들(예컨대, wafer 2에 관련한 공정 센서 데이터)과의 평균을 통해 산출된 평균 공정 시간(18초) 보다 짧을 수 있으며, 이에 따라 작동 시간이 8초 늘어날 수 있다. 즉, 도 8의 (b)에 도시된 바와 같이, 21시 20분 41초 내지 21시 20분 48초에 관련한 작동 시간이 추가될 수 있다.For example, the process sensor data related to wafer 1, as shown in (a) of Figure 8, has an operating time of 10 seconds, and is calculated by averaging with other process sensor data (e.g., process sensor data related to wafer 2). It may be shorter than the average process time (18 seconds), which may increase the operating time by 8 seconds. That is, as shown in (b) of FIG. 8, the operating time related to 21:20:41 to 21:20:48 can be added.

또한, 평균 공정 시간에 따라 작동 시간이 늘어남에 따라 발생하는 결측값은 이전 값에 기초하여 보충될 수 있다. 즉, 이전 값에 대응하는 1, 0(즉, 21시 20분 40초에 대응하는 각 센서 값)에 기초하여 21시 20분 41초 내지 21시 20분 48초에 대응하여 1, 0에 관련한 센서값이 보충될 수 있다. 전술한 작동 시간 및 평균 공정 시간에 대한 구체적인 수치적 기재는 예시일 뿐 본 발명은 이에 제한되지 않는다.Additionally, missing values that occur as operating time increases according to the average process time can be supplemented based on previous values. That is, based on 1, 0 corresponding to the previous value (i.e., each sensor value corresponding to 21:20:40), 1, 0 corresponding to 21:20:41 to 21:20:48 Sensor values can be supplemented. The detailed numerical description of the above-mentioned operating time and average process time is only an example and the present invention is not limited thereto.

즉, 패딩을 통해 모든 공정에서 진행되는 process time을 통일시킬 수 있다. 이러한 전처리 과정을 통해 모든 센서 공정 데이터들의 sampling rate가 공정에 상관없이 통일될 수 있으며, Sampling error가 발생되는 것이 방지될 수 있다.In other words, the process time in all processes can be unified through padding. Through this preprocessing process, the sampling rate of all sensor process data can be unified regardless of the process, and sampling errors can be prevented from occurring.
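A minimal sketch of the padding step follows. Extending a short series by repeating its last value mirrors the description above. How a long series is shortened is not specified, so simple truncation is used here as an assumption. The function name and the use of a sample count as the target are also illustrative.

```python
def fit_to_length(values, target_len):
    """Pad a series with its last value, or cut it, so it has exactly target_len samples."""
    if not values:
        raise ValueError("cannot pad an empty series")
    if len(values) >= target_len:
        # Assumption: longer series are truncated at the end.
        return list(values[:target_len])
    # Shorter series: repeat the last observed value for the added time steps.
    return list(values) + [values[-1]] * (target_len - len(values))
```

For instance, `fit_to_length([0, 1, 1], 5)` returns `[0, 1, 1, 1, 1]`, so every wafer ends up with the same number of time steps.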

일 실시예에서, 학습 데이터 세트를 구축하는 단계는, 패딩이 수행된 로우 데이터에 기초하여 텐서 데이터를 획득하는 단계를 포함할 수 있다. 실시예에 따르면, 전술한 전처리 과정을 통해 생성된 텐서 데이터는, wafer id, sensor number 및 시간에 관련한 3차원 축을 기반으로 하는 3차원 텐서 데이터일 수 있다.In one embodiment, building a learning data set may include acquiring tensor data based on raw data on which padding has been performed. According to an embodiment, the tensor data generated through the above-described preprocessing process may be 3D tensor data based on a 3D axis related to wafer id, sensor number, and time.

상술한 바와 같이, 인공지능 모델에서 센서 공정 데이터들을 활용할 수 있도록 텐서 형태의 텐서 데이터로 변환하는 전처리를 수행할 수 있다.As described above, preprocessing can be performed to convert sensor process data into tensor data in tensor form so that it can be used in an artificial intelligence model.
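The three-dimensional layout (wafer id, sensor number, time) can be sketched with nested lists. The input structure, a dictionary of wafers each holding a dictionary of equally long sensor series, and the function name `build_tensor` are assumptions for illustration; a real pipeline would more likely use an array library.

```python
def build_tensor(wafers):
    """Stack per-wafer, per-sensor series into a [wafer][sensor][time] nested list.

    wafers: dict mapping wafer id -> dict mapping sensor id -> list of values.
    Returns the tensor and its shape (n_wafers, n_sensors, n_timesteps).
    """
    wafer_ids = sorted(wafers)
    sensor_ids = sorted(wafers[wafer_ids[0]])
    tensor = [[list(wafers[w][s]) for s in sensor_ids] for w in wafer_ids]
    lengths = {len(series) for wafer in tensor for series in wafer}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length after padding")
    return tensor, (len(wafer_ids), len(sensor_ids), lengths.pop())
```

The length check makes explicit why resampling and padding must come first: a rectangular tensor requires every series to have the same number of time steps.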

구체적으로, 컴퓨팅 장치(100)가 수행하는 데이터 전처리 방법은, 전처리된 로우 데이터에서 유효 데이터를 선별하여 인공지능 모델을 학습시키기 위한 학습 데이터 세트를 구축하는 단계를 포함할 수 있다.Specifically, the data preprocessing method performed by the computing device 100 may include selecting valid data from preprocessed raw data and constructing a learning data set for training an artificial intelligence model.

유효 데이터는, 비정상 공정을 감지하는 인공지능 모델의 학습에 관련한 학습 데이터들로, 정상에 관련한 공정 센서 데이터인 것을 특징으로 할 수 있다.Valid data is learning data related to the learning of an artificial intelligence model that detects abnormal processes, and may be characterized as process sensor data related to normal processes.

일 실시예에서, 인공지능 모델은, 공정 과정에서 실시간으로 획득되는 공정 센서 데이터를 기반으로 비정상 상황을 감지하는 신경망 모델일 수 있다. 예를 들어, 비정상 동작이란, 수율을 저해하는 다양한 공정 상황들이나, 설비 고장에 관련한 비정상 동작이나 불량 조건 등이 감지되는 상황을 포함할 수 있다.In one embodiment, the artificial intelligence model may be a neural network model that detects abnormal situations based on process sensor data acquired in real time during the process. For example, abnormal operations may include various process situations that impede yield, or situations in which abnormal operations or defective conditions related to equipment failure are detected.

인공지능 모델은, 입력 데이터와 유사한 출력 데이터를 생성하도록 학습된 신경망 모델일 수 있다. 인공지능 모델은, 예를 들어, 입력 데이터와 유사한 출력 데이터를 출력하는 오토인코더를 포함할 수 있다. 오토인코더는 차원 감소 네트워크 함수(예컨대, 인코더) 및 차원 복원 네트워크 함수(예컨대, 디코더)를 포함할 수 있다. An artificial intelligence model may be a neural network model trained to generate output data similar to input data. The artificial intelligence model may include, for example, an autoencoder that outputs output data similar to input data. An autoencoder may include a dimensionality reduction network function (eg, encoder) and a dimensionality restoration network function (eg, decoder).

According to one embodiment, the autoencoder may include at least one hidden layer, and an odd number of hidden layers may be disposed between the input and output layers. The number of nodes in each layer may shrink from the input layer down to an intermediate layer called the bottleneck layer (the encoding), and then expand from the bottleneck layer to the output layer (which is symmetric with the input layer), mirroring the reduction. The nodes of the dimensionality reduction layers and the dimensionality restoration layers may or may not be symmetric. The autoencoder may perform nonlinear dimensionality reduction. The number of nodes in the input layer and the output layer may correspond to the number of sensors remaining after preprocessing of the input data. In the autoencoder structure, the number of nodes in the hidden layers of the encoder may decrease with distance from the input layer. If the bottleneck layer (the layer with the fewest nodes, located between the encoder and the decoder) has too few nodes, a sufficient amount of information may not be conveyed, so its node count may be kept at or above a certain number (e.g., at least half the number of nodes in the input layer).
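One way to picture the symmetric shrinking and expanding layer widths is the sketch below. It only computes a list of layer sizes, not a trained network, and the linear spacing between the input width and the bottleneck width is an assumption chosen for simplicity; `symmetric_layer_sizes` is a hypothetical helper.

```python
def symmetric_layer_sizes(n_inputs, bottleneck, steps):
    """Return layer widths that shrink to a bottleneck and mirror back out.

    n_inputs: nodes in the input (and output) layer.
    bottleneck: nodes in the narrowest layer.
    steps: number of shrinking steps from the input layer to the bottleneck.
    """
    if not 0 < bottleneck <= n_inputs:
        raise ValueError("bottleneck must be between 1 and n_inputs")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    encoder = [
        round(n_inputs - (n_inputs - bottleneck) * i / steps)
        for i in range(steps + 1)
    ]
    # Mirror the encoder widths (excluding the bottleneck itself) for the decoder.
    return encoder + encoder[-2::-1]
```

For 16 input nodes, a bottleneck of 8 (half the input width) and two shrinking steps, the widths are 16, 12, 8, 12, 16, which gives three hidden layers between the input and output layers.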

인공지능 모델은, 다년간 축적된 정상에 관련한 공정 센서 데이터들을 통해 학습될 수 있으며, 이에 따라, 실시간으로 획득되는 공정 센서 데이터들을 입력으로 하여 해당 입력에 대응하는 출력 데이터를 출력할 수 있다.The artificial intelligence model can be learned through process sensor data related to normality accumulated over many years. Accordingly, it can output output data corresponding to the input by using process sensor data acquired in real time as input.

구체적으로, 인공지능 모델은, 다년간 축적된 정상 공정 센서 데이터들을 통해 학습됨에 따라, 기존 축적된 정상 공정 센서 데이터들과 유사한 공정 센서 데이터가 입력되는 경우, 입력에 관련한 공정 센서 데이터와 유사한 재건 공정 센서 데이터를 출력할 수 있다. 이 경우, 입력된 데이터와 출력된 데이터 간의 차이인 재건율 오차가 적을 수 있다.Specifically, as the artificial intelligence model is learned through normal process sensor data accumulated over many years, when process sensor data similar to existing accumulated normal process sensor data is input, a reconstructed process sensor similar to the process sensor data related to the input is used. Data can be output. In this case, the reconstruction rate error, which is the difference between input data and output data, may be small.

이와 반대로, 인공지능 모델은, 기존 축적된 정상 공정 센서 데이터들과 유사하지 않은 공정 센서 데이터가 입력되는 경우, 입력에 관련한 공정 센서 데이터와 유사하지 않은 재건 공정 센서 데이터를 출력할 수 있다. 이 경우, 입력된 데이터와 출력된 데이터 간의 차이인 재건율 오차가 클 수 있다.Conversely, when process sensor data that is not similar to existing accumulated normal process sensor data is input, the artificial intelligence model may output reconstructed process sensor data that is not similar to the process sensor data related to the input. In this case, the reconstruction rate error, which is the difference between input data and output data, may be large.

일 실시예에서, 인공지능 모델은, 입력 데이터와 출력 데이터 간의 재건율 오차가 일정 기준치 이하인 경우, 정상 동작이라고 판별할 수 있으며, 입력 데이터와 출력 데이터 간의 재건율 오차가 일정 기준치를 초과하는 경우, 비정상 동작이 발생한 것으로 판별할 수 있다. 즉, 인공지능 모델은 재건율 오차가 기준 일정 기준치를 초과하는 경우, 기존의 정상 상황에서 획득된 데이터 유형과 상이한 데이터 유형이 획득(즉, 과거에 한 번도 경험하지 못했던 유형의 센서 데이터가 감지)된 것으로 판별하여 비정상 상황으로 판별할 수 있다.In one embodiment, the artificial intelligence model may determine normal operation if the reconstruction rate error between input data and output data is below a certain standard value, and if the reconstruction rate error between input data and output data exceeds a certain standard value, It can be determined that abnormal operation has occurred. In other words, when the reconstruction rate error exceeds a certain standard value, the artificial intelligence model acquires a data type that is different from the data type obtained in existing normal situations (i.e., a type of sensor data that has never been experienced in the past is detected). This can be determined as an abnormal situation.
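The decision rule can be sketched as follows. The specification does not state which error metric is used, so mean squared error is assumed here, and the reconstructed output is passed in directly rather than produced by a real autoencoder. Both function names are hypothetical.

```python
def reconstruction_error(x, x_hat):
    """Mean squared difference between the input and its reconstruction (assumed metric)."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def is_abnormal(x, x_hat, threshold):
    """Flag the input as abnormal when the reconstruction error exceeds the threshold."""
    return reconstruction_error(x, x_hat) > threshold
```

An error equal to the threshold counts as normal in this sketch, matching the "at or below the threshold" wording above.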

상술한 바와 같이, 인공지능 모델은 공정 상황에서 획득되는 공정 센서 데이터들을 입력으로 하여, 비정상 동작이 발생하였는지 여부를 판별할 수 있다. 이러한 인공지능 모델을 학습시키기 위해서는, 상기의 설명과 같이 다년간 축적된 다량의 공정 센서 데이터가 필요하다. 특히, 다량의 공정 센서 데이터 중 정상 동작에 관련한 공정 센서 데이터가 구비되어야 한다.As described above, the artificial intelligence model can determine whether an abnormal operation has occurred by using process sensor data obtained in a process situation as input. In order to learn such an artificial intelligence model, a large amount of process sensor data accumulated over many years is required, as described above. In particular, among the large amount of process sensor data, process sensor data related to normal operation must be provided.

According to one embodiment, the computing device 100 may construct the valid data by selecting only the process sensor data related to normal operation from the large amount of process sensor data accumulated over many years (e.g., process sensor data related to both normal and abnormal operation). In other words, the computing device 100 may select and obtain valid data suitable for training the artificial intelligence model from the entire raw data obtained during the process, for which pass/fail has not been determined.

In a specific embodiment, the computing device 100 may obtain valid data, that is, sensor process data related to normal operation, by performing a first selection and a second selection on the raw data including the plurality of process sensor data.

보다 구체적으로, 컴퓨팅 장치(100)는 학습 데이터 세트를 구축하는 경우, 복수의 센서 공정 데이터에 대응하여 제1차 선별을 수행할 수 있다. 여기서, 제1차 선별은 참조값을 기준으로 복수의 공정 센서 데이터 각각의 진폭 유사도 비교를 통해 수행될 수 있다.More specifically, when constructing a learning data set, the computing device 100 may perform first selection in response to a plurality of sensor process data. Here, the first selection may be performed by comparing the amplitude similarity of each of the plurality of process sensor data based on the reference value.

In one embodiment, when performing the first selection, the computing device 100 may obtain a reference value based on the plurality of sensor process data. Here, the reference value may serve as the basis for calculating the amplitude similarity with each of the plurality of sensor process data. For example, the reference value may be obtained from the median of the plurality of sensor process data.

자세히 설명하면, 복수의 센서 공정 데이터는, 시간의 흐름에 따라 변화하는 센서값에 대한 정보를 포함할 수 있다. 복수의 센서 공정 데이터는, 도 9의 (a)에 도시된 바와 같은 그래프의 형태로 표현될 수 있다. 도 9에서 x축은 시간에 해당하며, y축은 공정 과정에서 획득되는 센서값에 해당할 수 있다. 컴퓨팅 장치(100)는 각 시점별 복수의 센서 공정 데이터의 센서값들의 중앙값을 식별할 수 있으며, 해당 중앙값들의 조합이 참조값일 수 있다. 구체적인 예를 들어, 다량의 센서 공정 데이터가 제1센서 공정 데이터 내지 제5센서 공정 데이터를 포함하며, 제1시점에 대응하여 제1센서 공정 데이터의 센서값이 3이며, 제2센서 공정 데이터의 센서값이 8이며, 제3센서 공정 데이터의 센서값이 7이며, 제4센서 공정 데이터의 센서값이 10이고, 그리고 제5센서 공정 데이터의 센서값이 90인 경우, 컴퓨팅 장치(100)는 3, 7, 8, 10, 90의 값들 중 중앙값인 8을 제1시점에 대응하는 중앙값으로 식별할 수 있다. 이러한 방식으로 컴퓨팅 장치(100)는 각 시점 별 중앙값들에 기초하여 참조값을 획득할 수 있다.In detail, the plurality of sensor process data may include information about sensor values that change over time. A plurality of sensor process data may be expressed in the form of a graph as shown in (a) of FIG. 9. In Figure 9, the x-axis may correspond to time, and the y-axis may correspond to sensor values obtained during the process. The computing device 100 may identify the median value of sensor values of a plurality of sensor process data at each time point, and a combination of the median values may be a reference value. For a specific example, a large amount of sensor process data includes first sensor process data to fifth sensor process data, and corresponding to the first point in time, the sensor value of the first sensor process data is 3, and the sensor value of the second sensor process data is 3. When the sensor value is 8, the sensor value of the third sensor process data is 7, the sensor value of the fourth sensor process data is 10, and the sensor value of the fifth sensor process data is 90, the computing device 100 Among the values 3, 7, 8, 10, and 90, the median value of 8 can be identified as the median value corresponding to the first time point. In this way, the computing device 100 can obtain a reference value based on the median values for each time point.
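The per-time-point median can be sketched with the standard library. The first time point uses the values from the example above (3, 8, 7, 10, 90); the second time point is made up for illustration, and `reference_series` is a hypothetical name.

```python
from statistics import median


def reference_series(series_list):
    """Return the median across all series at each time point."""
    return [median(values) for values in zip(*series_list)]


# Five sensor process data series, two time points each.
series = [[3, 4], [8, 9], [7, 6], [10, 11], [90, 5]]
reference = reference_series(series)  # [8, 6]
```

This sketch assumes all series already have the same length, which the resampling and padding steps are meant to guarantee.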

일 실시예에서, 참조값을 중앙값을 통해 산출하는 경우, 이상값에 대한 영향이 최소화되는 것을 특징으로 할 수 있다.In one embodiment, when the reference value is calculated using the median, the influence of outlier values may be minimized.

예컨대, 로우 데이터에 포함된 복수의 센서 공정 데이터들은, 양불 판정이 되지 않은, 즉, 정상과 비정상에 관련한 센서 공정 데이터들을 모두 포함하고 있을 수 있다. 비정상 동작에 관련한 센서 공정 데이터를 포함하고 있기 때문에, 일반적인 센서값과 상이한 이상값들이 존재할 수 있다. 전술한 예시에서, 제1시점에 대응한 제1 내지 제5의 센서값이 3, 7, 8, 10, 90을 볼 때, 90에 관련한 센서값은 이상값에 관련한 것일 수 있다. 즉, 특정 센서값이 다른 센서값들과 큰 차이가 나는 이상값이 존재할 수 있다.For example, a plurality of sensor process data included in the raw data may include both sensor process data related to normal and abnormal conditions that have not been judged good or bad. Because it contains sensor process data related to abnormal operation, outlier values that are different from general sensor values may exist. In the above example, when the first to fifth sensor values corresponding to the first time point are 3, 7, 8, 10, and 90, the sensor value related to 90 may be related to an outlier. In other words, there may be an outlier where a specific sensor value is significantly different from other sensor values.

정상과 비정상에 관련한 센서 공정 데이터들이 혼재된 상황에서 평균값(average)을 이용하여 참조값을 획득하는 경우, 이상값에 대한 영향이 커질 수 있어, 참조값이 적정한 기준이 아니게 된다. 자세한 예를 들어, 전술한 예시의 센서값들의 평균값은 23.6((3+7+8+10+90)/5)인 반면, 중앙값은, 8일 수 있다.If a reference value is obtained using an average in a situation where sensor process data related to normal and abnormal are mixed, the influence of the outlier may increase, so the reference value is not an appropriate standard. For a detailed example, the average value of the sensor values in the above-described example may be 23.6 ((3+7+8+10+90)/5), while the median value may be 8.

즉, 평균값을 활용하는 경우, 하나의 이상값의 영향을 크게 받을 수 있어, 참조값으로 활용이 어려울 수 있다. 본 발명은, 로우 데이터에 정상 및 비정상에 관련한 센서 공정 데이터가 포함된 것을 고려하여, 이상치에 관련한 영향이 최소화되도록 중앙값을 활용하여 참조값을 획득할 수 있다. 이러한 방식으로 획득된 참조값은 이상값에 대한 영향이 최소화된 것이므로, 정상 데이터를 선별하는데 보다 적정한 기준으로 작용할 수 있다.In other words, when using the average value, it may be greatly influenced by one outlier, making it difficult to use it as a reference value. In the present invention, considering that raw data includes sensor process data related to normal and abnormal conditions, a reference value can be obtained using the median value to minimize the influence of outliers. Since the reference value obtained in this way has minimal influence on outliers, it can serve as a more appropriate standard for selecting normal data.
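The effect of a single outlier on the mean and the median can be verified directly with the example values. The helper `center_values` is illustrative only.

```python
from statistics import mean, median


def center_values(values):
    """Return (mean, median) so the two can be compared side by side."""
    return mean(values), median(values)


with_outlier = center_values([3, 8, 7, 10, 90])  # (23.6, 8)
without_outlier = center_values([3, 8, 7, 10])   # (7, 7.5)
```

Removing the outlier 90 moves the mean from 23.6 to 7 but the median only from 8 to 7.5, which is the robustness the text relies on.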

In one embodiment, in performing the first selection, the computing device 100 may perform the first selection based on the similarity between the amplitude change form of each of the plurality of sensor process data and the amplitude change form of the reference value. The computing device 100 may select, from among the plurality of sensor process data, the sensor process data whose amplitude change form is similar to that of the reference value. For example, sensor process data that do not reach a certain level of similarity to the amplitude of the reference value may be excluded from the first selection.

In a specific embodiment, the similarity between amplitude change forms (e.g., waveforms) may be determined using a dynamic time warping (DTW) algorithm. Dynamic time warping is an algorithm that measures the similarity between two time series whose movements differ in speed or length. Dynamic time warping matches points in the direction that minimizes distance in order to find the warping path with the smallest cumulative distance.

일반적으로, 두 시계열 데이터 간의 유사성을 판별할 때, 유클리디안 거리 알고리즘을 활용할 수 있다. 유클리디안 거리 알고리즘은, 같은 시간선상에 대한 거리를 계산하여 양 데이터 간 유사성을 판별할 수 있다. 예컨대, 유클리드 거리를 활용하는 경우, 0초-0초, 1초-1초...등 각 시점 별로 매칭하여 유사성을 판별하게 된다. 다만, 이러한 유클리드 거리는, 전반적으로 패턴이 비슷해 보이는 데이터에서 떨림과 움직임이 심해질수록 유사성을 찾을 수 없으며, 길이가 짧은 시계열 데이터 간의 유사도 평가가 어렵다는 단점이 있다. 본 발명에서 활용되는 센서 공정 데이터들은 각각 동작 시간이 서로 상이할 수 있다. 다시 말해, 유클리디안 거리 알고리즘의 경우, 시간의 축이 뒤틀어진 상황에서 유사도의 계산 정확도가 매우 낮아질 수 있다. 즉 양 데이터 간 alignment가 맞지 않은 경우 거리 산출이 어려워져 유사도 평가가 어려울 수 있다.In general, when determining similarity between two time series data, the Euclidean distance algorithm can be used. The Euclidean distance algorithm can determine the similarity between both data by calculating the distance on the same time line. For example, when using Euclidean distance, similarity is determined by matching for each time point, such as 0 second - 0 second, 1 second - 1 second, etc. However, this Euclidean distance has the disadvantage that it is difficult to find similarities in data that appears to have similar overall patterns as tremor and movement become more severe, and that it is difficult to evaluate the similarity between short-length time series data. The sensor process data used in the present invention may each have different operating times. In other words, in the case of the Euclidean distance algorithm, the accuracy of calculating similarity can be very low in situations where the time axis is distorted. In other words, if the alignment between the two data is not correct, it may be difficult to calculate the distance and thus difficult to evaluate the similarity.

In another embodiment, cosine similarity may be used: both data are vectorized, a distance is calculated from the angle between the vectors, and the similarity is evaluated on that basis. However, cosine similarity has the disadvantage that it does not reflect the sequence (ordering) of the time series data.
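One way to see why cosine similarity ignores ordering is that reordering both series in the same way leaves the score unchanged. The sketch below demonstrates this with made-up values; it is an illustration of the general property, not a procedure from the specification.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1, 2, 3, 4]
b = [2, 1, 4, 3]
original = cosine_similarity(a, b)
# Reversing time in both series scrambles the temporal order but not the score.
reversed_order = cosine_similarity(a[::-1], b[::-1])
```

Because the score depends only on which values are paired, not on when they occur, the temporal shape of the series plays no role.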

Accordingly, the present invention may measure the similarity between the two data (i.e., the reference value and each sensor process data) using the dynamic time warping algorithm. The dynamic time warping algorithm evaluates the similarity of two waveforms whose time axes run at different speeds. It compares not only data at the same point in time but also neighboring time points, so that each element can be matched to a more similar element. Using the dynamic time warping algorithm has the advantage of enabling similarity evaluation between time series of different lengths.
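A textbook dynamic-programming formulation of DTW is sketched below. It uses the absolute difference as the local cost and no window constraint; these choices are assumptions of the example, since the specification does not give implementation details.

```python
def dtw_distance(a, b):
    """Cumulative cost of the cheapest warping path between two series."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance both series
                cost[i - 1][j],      # advance only a
                cost[i][j - 1],      # advance only b
            )
    return cost[n][m]
```

For example, `dtw_distance([0, 1, 1, 2], [0, 1, 2])` is 0 even though the series differ in length, because the repeated 1 is matched to the single 1 in the other series.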

In one embodiment, the computing device 100 may remove sensor process data whose similarity to the reference value is below a certain threshold and select (i.e., perform the first screening of) sensor process data whose similarity to the reference value is at or above the threshold. FIG. 9(b) may be a graph of the result of the first screening. Referring to FIGS. 9(a) and 9(b), it can be seen that sensor process data not similar to the reference value has been removed. That is, the computing device 100 may perform the first screening based on a similarity determination of the amplitude change pattern (e.g., the waveform) relative to the reference value.

In the step of constructing the training data, the computing device 100 may perform a second screening on the sensor process data selected in the first screening. Here, the second screening may be performed based on the timing similarity between sensor process data.

In a specific embodiment, performing the second screening may include identifying one or more abnormal data through a timing similarity comparison between the reference value and each sensor process data selected in the first screening.

In a specific embodiment, the timing similarity comparison between data may be performed using an affine dynamic time warping (ADTW) algorithm. The ADTW algorithm can evaluate similarity with respect to the time difference between data. It extracts time shift information from the DTW path and calculates an ADTW distance based on that information. That is, using the ADTW algorithm has the advantage that similarity with respect to the time difference (i.e., timing) between data can be evaluated based on the time shift information. As an example, each sensor process data may take different sensor values over time, so each sensor process data can be represented on a graph as a particular waveform. Sensor process data having the same waveform shape as the reference value may be selected in the first screening and become subject to the second screening. In this case, the second screening may be performed based on whether the timing of changes, such as rises or falls of the waveform, is similar to that of the reference value.

In other words, the second screening may proceed according to whether changes in the waveform occur at time points similar to those of the reference value. In an embodiment, the computing device 100 may identify sensor process data whose timing is not similar to that of the reference value as one or more abnormal data. The one or more abnormal data may be sensor process data whose timing differs from that of the reference value.
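One way to picture a timing comparison of this kind is to recover the DTW warping path and measure how far it departs from the diagonal. The sketch below is not the ADTW algorithm itself, whose exact formulation is not given here; `dtw_path`, `timing_deviation`, and the example series are hypothetical names and values, and the mean index offset is a stand-in timing measure chosen only for illustration.

```python
import numpy as np

def dtw_path(a, b):
    """Return the optimal DTW warping path as a list of (i, j) index pairs."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )
    # Backtrack from the end, moving to the cheapest predecessor (diagonal wins ties)
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    return path[::-1]

def timing_deviation(reference, run):
    """Mean absolute index offset |i - j| along the warping path (0 means no shift)."""
    return float(np.mean([abs(i - j) for i, j in dtw_path(reference, run)]))

# Hypothetical example series: identical pulse shape, different onset time
reference = [0, 0, 1, 2, 1, 0, 0, 0]
on_time = [0, 0, 1, 2, 1, 0, 0, 0]
late = [0, 0, 0, 0, 1, 2, 1, 0]

dev_on_time = timing_deviation(reference, on_time)
dev_late = timing_deviation(reference, late)
```

A run would then be treated as a timing outlier when its deviation exceeds a chosen limit. Both `on_time` and `late` have the same waveform shape as the reference, so only a timing-aware measure of this sort separates them.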

In addition, performing the second screening may include obtaining valid data by removing the one or more abnormal data from the sensor process data selected in the first screening.

That is, the computing device 100 may remove sensor process data whose waveform change timing is not similar to that of the reference value and select (i.e., perform the second screening of) sensor process data whose waveform change timing is similar to that of the reference value. FIG. 9(c) may be a graph of the result of the second screening. Referring to FIGS. 9(a), 9(b), and 9(c), it can be seen that, as a result of the second screening, sensor process data whose timing is not similar to that of the reference value has been removed. That is, the computing device 100 may perform the second screening based on a similarity determination of the waveform change timing relative to the reference value.

As described above, the computing device 100 may obtain the reference value by combining the median values, at each time point, of the plurality of sensor process data corresponding to the raw data. Based on this reference value, it may perform the first screening by determining the similarity of the amplitude change pattern and the second screening by determining the similarity of the waveform change timing, thereby obtaining valid data suitable for training an artificial intelligence model. Because the reference value corresponding to the plurality of sensor process data is obtained from the median, which is largely unaffected by outliers, it can provide an appropriate criterion for excluding abnormal data.
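The per-time-point median can be sketched as follows, assuming the runs have already been sampled to a common length. The array `runs` holds hypothetical values, one row per run and one column per time point, with one deliberately abnormal row; the mean is computed only for comparison.

```python
import numpy as np

# Hypothetical aligned runs: rows are runs, columns are time points
runs = np.array([
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.1, 1.1, 2.1, 1.1, 0.1],
    [-0.1, 0.9, 1.9, 0.9, -0.1],
    [0.0, 9.0, 9.0, 9.0, 0.0],     # abnormal run
    [-0.2, 0.8, 1.8, 0.8, -0.2],
])

# Reference value: the median across runs, taken separately at each time point
median_reference = np.median(runs, axis=0)

# For comparison only: the mean is pulled toward the abnormal run
mean_reference = np.mean(runs, axis=0)
```

With these values the median profile stays on the normal waveform, while the mean at the middle time points is dragged upward by the single abnormal run.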

In addition, when the first screening and the second screening are performed on the raw data, sensor process data having an amplitude change magnitude and change timing similar to those of the reference value may be selected as valid data. Sensor process data whose amplitude change magnitude and timing are not similar to those of the reference value may be sensor process data related to abnormal operation.

That is, valid data can be obtained by classifying as normal only the sensor process data whose amplitude magnitude and timing are similar to those of the reference value.

In a further embodiment, the computing device 100 may obtain a corrected reference value based on the sensor process data selected in the second screening, and perform the first screening and the second screening again based on the corrected reference value. When the corrected reference value is obtained and the first and second screenings are repeated on that basis, the accuracy of the screening may improve.

In more detail, obtaining the valid data may include: obtaining a corrected reference value based on the sensor process data selected in the second screening; performing the first screening using the corrected reference value; and performing the second screening on the sensor process data selected in that first screening.

As described above, the sensor process data selected in the second screening may be sensor process data having an amplitude change magnitude and change timing similar to those of the reference value. That is, the sensor process data selected in the second screening may have been determined to be normal. When a reference value is generated again (i.e., the corrected reference value) from sensor process data from which the abnormal data has been filtered out, it can serve as a more accurate criterion for selecting normal sensor process data than the initial reference value. Because the corrected reference value is generated from sensor process data from which data with abnormal amplitude or timing has already been filtered once, it can select normal data with high accuracy. Accordingly, when the first screening and the second screening, which relate to amplitude magnitude and timing, are performed again using the corrected reference value, abnormal sensor process data can be removed more appropriately.

In an embodiment, the training data for training the artificial intelligence model should include only data related to normal operation. Even a small amount of abnormal sensor process data in the training data can greatly reduce the output accuracy of the trained artificial intelligence model. Moreover, if the artificial intelligence model has been trained on abnormal sensor process data and therefore fails to detect abnormal signals or defect conditions in real time, large losses may result, such as having to discard all of the wafers. It can therefore be very important to keep potentially abnormal sensor process data out of the training data. In other words, it may be important to construct the training data by selecting only normal sensor process data according to a clear criterion. By using the corrected reference value, the present invention can select only normal sensor process data according to a more accurate criterion.

The above description explains that the first screening and the second screening are performed a second time to obtain a corrected reference value that serves as a more accurate classification criterion. However, it will be apparent to those skilled in the art that the first screening and the second screening may be performed a third time, a fourth time, or more, so that a second corrected reference value and a third corrected reference value may also be obtained. That is, as the first and second screening processes are repeated, the n-th corrected reference value provides a clearer data classification criterion, and the accuracy with which abnormal sensor process data is filtered out can improve. This keeps abnormal sensor process data out of the training data and can ultimately improve the output accuracy of the artificial intelligence model.
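The repeated screening can be sketched as a loop that alternates between filtering runs against the current reference value and rebuilding the reference value from the runs that survive. In this sketch a single stand-in distance (mean absolute deviation from the reference) replaces both the first and the second screening, and `refine_reference`, `threshold`, and `rounds` are hypothetical names and parameters introduced for illustration.

```python
import numpy as np

def refine_reference(runs, threshold, rounds=3):
    """Alternate between screening runs and recomputing the median reference."""
    runs = np.asarray(runs, dtype=float)
    kept = np.ones(len(runs), dtype=bool)
    reference = np.median(runs, axis=0)          # initial reference value
    for _ in range(rounds):
        # Stand-in for the two screenings: distance of each run to the reference
        distances = np.mean(np.abs(runs - reference), axis=1)
        new_kept = distances <= threshold
        # Stop when nothing would survive or the selection no longer changes
        if not new_kept.any() or np.array_equal(new_kept, kept):
            break
        kept = new_kept
        reference = np.median(runs[kept], axis=0)  # corrected reference value
    return reference, kept

# Hypothetical aligned runs, with one abnormal run in the fourth row
runs = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.1, 1.1, 2.1, 1.1, 0.1],
    [-0.1, 0.9, 1.9, 0.9, -0.1],
    [0.0, 9.0, 9.0, 9.0, 0.0],
    [-0.2, 0.8, 1.8, 0.8, -0.2],
]
corrected_reference, kept = refine_reference(runs, threshold=0.5)
```

The loop ends early once a further round would select the same runs, which is one simple way to decide how many repetitions are enough; a fixed number of rounds would work equally well.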

FIG. 10 is a schematic diagram illustrating one or more network functions according to an embodiment of the present invention.

Throughout this specification, the terms artificial intelligence model, neural network, and network function may be used interchangeably. A neural network may generally consist of a set of interconnected computational units, which may be referred to as "nodes". These nodes may also be referred to as "neurons".

A deep neural network (DNN) may refer to a neural network that includes a plurality of hidden layers in addition to an input layer and an output layer. A deep neural network can be used to identify latent structures in data, that is, the latent structure of photographs, text, video, speech, or music (for example, which objects appear in a photograph, what the content and sentiment of a text are, or what the content and sentiment of a speech recording are). Deep neural networks may include convolutional neural networks (CNN), recurrent neural networks (RNN), autoencoders, generative adversarial networks (GAN), restricted Boltzmann machines (RBM), deep belief networks (DBN), Q networks, U networks, Siamese networks, and the like. The foregoing list of deep neural networks is only an example, and the present invention is not limited thereto.

A neural network may be trained by at least one of supervised learning, unsupervised learning, and semi-supervised learning. The purpose of training a neural network is to minimize the error of its output. Training is a process in which training data is repeatedly input to the neural network, the error between the output of the neural network and the target is calculated for the training data, and the error is backpropagated from the output layer toward the input layer in the direction that reduces the error, updating the weight of each node of the neural network. Supervised learning uses training data in which each item is labeled with the correct answer (i.e., labeled training data), whereas in unsupervised learning the items may not be labeled with the correct answer. For example, the training data for supervised learning of data classification may be data in which each item is labeled with a category. The labeled training data is input to the neural network, and the error can be calculated by comparing the output (category) of the neural network with the label of the training data. As another example, in unsupervised learning of data classification, the error can be calculated by comparing the input training data with the output of the neural network. The calculated error is backpropagated through the neural network in the reverse direction (i.e., from the output layer toward the input layer), and the connection weights of the nodes in each layer may be updated accordingly. The amount by which each connection weight changes may be determined by the learning rate. One computation of the neural network on the input data together with the backpropagation of the error may constitute a training cycle (epoch). The learning rate may be applied differently depending on the number of training cycles completed. For example, a high learning rate may be used early in training so that the neural network quickly reaches a certain level of performance, which improves efficiency, and a low learning rate may be used late in training to improve accuracy.
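The effect of a learning rate that starts high and decreases over the training cycles can be sketched with plain gradient descent. A single linear unit stands in for a neural network here, and the data, `initial_lr`, `decay`, and the exponential schedule are all assumptions chosen for illustration rather than values from the specification.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = 3.0 * x + 0.5                 # hypothetical target relationship to learn

w, b = 0.0, 0.0                   # weight and bias of the single linear unit
initial_lr, decay = 0.3, 0.95     # hypothetical schedule parameters
losses = []

for epoch in range(50):
    # High learning rate in early cycles, lower learning rate in later cycles
    lr = initial_lr * decay ** epoch
    error = (w * x + b) - y
    losses.append(float(np.mean(error ** 2)))
    # Gradients of the mean squared error with respect to w and b
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # The learning rate scales how far each weight moves
    w -= lr * grad_w
    b -= lr * grad_b
```

In a multi-layer network the gradients would be obtained by backpropagation rather than written out by hand, but the weight update and the role of the learning rate are the same.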

In training a neural network, the training data is generally a subset of the real data (i.e., the data to be processed using the trained neural network). There may therefore be training cycles in which the error on the training data decreases while the error on the real data increases. Overfitting is the phenomenon in which the network learns the training data too closely and the error on real data increases as a result. For example, a neural network that learned what a cat is only from yellow cats and then fails to recognize a cat of another color as a cat exhibits a form of overfitting. Overfitting can increase the error of a machine learning algorithm, and various methods may be used to prevent it, such as increasing the amount of training data, regularization, or dropout, in which some of the nodes of the network are omitted during training.
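Dropout can be sketched in a few lines. The version below is the commonly used "inverted" variant, in which the surviving activations are rescaled during training so that nothing needs to change at inference time; the function name `dropout`, the drop probability, and the example array are assumptions made for illustration.

```python
import numpy as np

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero a random subset of units and rescale the survivors."""
    if not training or drop_prob == 0.0:
        return activations               # nodes are only omitted during training
    keep_prob = 1.0 - drop_prob
    # Each unit is kept independently with probability keep_prob
    mask = rng.random(activations.shape) < keep_prob
    # Rescaling keeps the expected activation equal to the original value
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((4, 1000))              # hypothetical hidden-layer activations
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
unchanged = dropout(hidden, drop_prob=0.5, rng=rng, training=False)
```

Because a different random subset of nodes is omitted on every training step, the network cannot rely on any single node, which tends to reduce overfitting.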

The components of the present invention may be implemented as a program (or application) and stored on a medium so as to be executed in combination with a computer, which is hardware. The components of the present invention may be implemented through software programming or software elements. Similarly, embodiments may be implemented in a programming or scripting language such as C, C++, Java, or assembler, including various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms running on one or more processors.

Those of ordinary skill in the art will understand that the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented by electronic hardware, by various forms of program or design code (referred to herein, for convenience, as "software"), or by a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Those of ordinary skill in the art may implement the described functionality in various ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.

The various embodiments presented herein may be implemented as a method, an apparatus, or an article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" includes a computer program, carrier, or medium accessible from any computer-readable device. For example, computer-readable media include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical disks (e.g., CDs, DVDs), smart cards, and flash memory devices (e.g., EEPROM, cards, sticks, key drives). In addition, the various storage media presented herein include one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" includes, but is not limited to, wireless channels and various other media capable of storing, holding, and/or carrying instruction(s) and/or data.

It is to be understood that the specific order or hierarchy of steps in the processes presented is an example of illustrative approaches. Based on design priorities, the specific order or hierarchy of steps in the processes may be rearranged within the scope of the present invention. The accompanying method claims present the elements of the various steps in a sample order and are not meant to be limited to the specific order or hierarchy presented.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood in all respects as illustrative and not restrictive.

Claims

A method performed by a computing device comprising at least one processor, comprising:
Monitoring sensor data of each of one or more devices using a learned anomaly detection model to detect abnormalities in the equipment;
When the abnormality detection model detects abnormal data in specific sensor data acquired from specific equipment, obtaining preventive maintenance information related to the specific equipment;
Recognizing whether preventive maintenance has been performed on the specific equipment based on the preventive maintenance information;
When it is recognized that preventive maintenance has been performed on the specific equipment, performing a consistency check on the specific equipment;
Based on the consistency check, recognizing whether the abnormal data detected by the abnormality detection model was detected due to preventive maintenance or due to a defect in the specific equipment; and
When recognizing that the abnormal data detected by the anomaly detection model is detected by preventive maintenance, updating the anomaly detection model based on the abnormal data;
Including,
The step of performing the consistency check is,
Obtaining first sensor data measured in the specific equipment on which preventive maintenance was performed, obtaining second sensor data measured in the specific equipment before the preventive maintenance was performed, and obtaining third sensor data measured from each of one or more devices operating with the same recipe as the specific equipment;
sampling the sensor data and performing alignment on the sensor data acquired from each device;
determining minimum, maximum, and median values for each device from the sampled sensor data;
calculating a plurality of similarity values based on the first sensor data, the second sensor data, and the third sensor data, which are data measured by each device while producing a preset number of wafers with a recipe for producing wafers;
calculating a plurality of normalization values by dividing each of the plurality of similarity values by the largest value among the plurality of similarity values; and
determining whether the specific equipment on which the preventive maintenance was performed is normal based on the plurality of normalization values calculated based on the similarity;
Including,
A method for maintaining the accuracy of equipment anomaly detection models.

delete

◈Claim 4 was abandoned upon payment of the setup registration fee.◈

According to claim 1,
Based on the consistency check, the step of recognizing whether the abnormal data detected by the abnormality detection model was detected due to preventive maintenance or due to a defect in the specific equipment is,
If it is determined that the specific equipment is normal by performing the consistency check, it is recognized that the abnormal data detected by the anomaly detection model was detected by preventive maintenance, and it is determined that the specific equipment is not normal by performing the consistency check. In this case, recognizing that the abnormal data detected by the abnormality detection model is detected due to a defect in a specific equipment;
Including,
A method for maintaining the accuracy of equipment anomaly detection models.

According to claim 1,
The step of updating the anomaly detection model is,
Obtaining normalized data by scaling the abnormal data;
Transfer learning a fully connected layer using the normalized data while maintaining the feature extraction layer of the anomaly detection model;
Including,
A method for maintaining the accuracy of equipment anomaly detection models.

According to claim 5,
The step of transfer learning a fully connected layer using the normalized data while maintaining the feature extraction layer of the anomaly detection model includes:
Comparing a first maximum value for detecting an anomaly in the anomaly detection model and a second maximum value of the normalized data; and
If the second maximum value is greater than the first maximum value, changing the first maximum value for detecting an anomaly in the anomaly detection model to the second maximum value;
Including,
A method for maintaining the accuracy of equipment anomaly detection models.

◈Claim 7 was abandoned upon payment of the setup registration fee.◈

According to claim 1,
The updated anomaly detection model is,
Updated to increase the range of data for detecting abnormalities,
In order to maintain the accuracy of the anomaly detection model, initializing the anomaly detection model to an initially learned model at a preset period,
A method for maintaining the accuracy of equipment anomaly detection models.

According to claim 1,
The step of monitoring sensor data of each of one or more devices using a learned anomaly detection model to detect anomalies in the device is:
Obtaining process sensor data and prior knowledge data corresponding to the process sensor data;
Generating reconstruction process sensor data corresponding to the process sensor data using a deep learning model;
calculating a reconstruction rate error based on the process sensor data and the reconstruction process sensor data; and
detecting abnormal operation based on comparison of the reconstruction rate error with a reference threshold;
Including,
The deep learning model is,
a first sub-model for extracting feature information corresponding to the process sensor data;
a second sub-model that extracts interaction relationship information between each process sensor data based on the process sensor data and the prior knowledge data;
an attention module that generates feature information by combining outputs of the first sub-model and the second sub-model; and
a dimensional reconstruction model that restores the feature information to generate the reconstruction process sensor data;
Including,
A method for maintaining the accuracy of equipment anomaly detection models.

A memory that stores one or more instructions; and
A processor that executes the one or more instructions stored in the memory
Including,
The processor executes the one or more instructions,
An apparatus for performing the method of claim 1.

A computer program combined with a computer as hardware and stored on a computer-readable recording medium so as to perform the method of claim 1.