KR20230165026A

KR20230165026A - A factory simulator-based scheduling neural network learning system with lot history reproduction function

Info

Publication number: KR20230165026A
Application number: KR1020220064905A
Authority: KR
Inventors: 이호열; 김태환
Original assignee: 주식회사 뉴로코어
Priority date: 2022-05-26
Filing date: 2022-05-26
Publication date: 2023-12-05

Abstract

The invention relates to a factory simulator-based scheduling neural network learning system with a lot history reproduction function, which records the lot history immediately preceding a specific process from the simulated data generated by a simulator, generates a variety of cases from the recorded lot history, and trains the neural network of that process. The system comprises: a simulation execution unit that simulates the factory workflow using a factory simulator and collects simulated data; a lot history recording unit that has the simulation execution unit simulate the processes up to the specific process (hereinafter, the first stage) and records the lot history information of the specific process; an episode reproduction unit that generates a plurality of second-stage production episodes from the lot history information and has the simulation execution unit simulate the processes after the specific process (hereinafter, the second stage) according to the generated production episodes; and a learning execution unit that generates training data from the second-stage simulation results and trains the scheduling neural network of the corresponding process.
With this system, the lot history immediately preceding a specific process is recorded from the simulated data and used to train the neural network of that process. Training data can therefore be generated by producing a variety of cases directly from the recorded lot history, and because the simulation up to the preceding process does not have to be repeated, training proceeds quickly.

Description

A factory simulator-based scheduling neural network learning system with lot history reproduction function

The present invention relates to a factory simulator-based scheduling neural network learning system with a lot history reproduction function, which trains a per-process neural network for scheduling each process in a factory workflow consisting of a series of processes, by simulating the factory workflow with a simulator and training the neural networks on the simulated data.

In particular, the present invention relates to a factory simulator-based scheduling neural network learning system with a lot history reproduction function, which records the lot history immediately preceding a specific process from the simulated data generated by the simulator, generates a variety of cases from the recorded lot history, and trains the neural network of that process.

In general, manufacturing process management refers to the activity of managing the series of processes performed during manufacturing, from raw materials to the finished product. In particular, it determines the processes and work sequence required to manufacture each product, as well as the materials and time required for each process.

In a factory that produces products, the equipment that handles each process is installed in the work area for that process. The equipment can be configured so that lots are supplied to it for processing specific jobs. In addition, transfer devices such as conveyors are installed between pieces of equipment or between work areas, so that when a piece of equipment completes a specific process, the processed lot is moved to the next process. In other words, a lot passes through a series of processes to become a finished product.

In addition, multiple pieces of equipment with the same or similar functions may be installed to perform a specific process, sharing the same or similar jobs among them. Scheduling the processes or individual jobs on such a manufacturing line is a very important problem for factory efficiency. Conventionally, most scheduling was rule-based, with rules defined for each condition, but because the evaluation criteria were unclear, it was difficult to assess the performance of the resulting schedules.

Recently, technologies that schedule work by introducing artificial intelligence techniques into the manufacturing process have been proposed [Patent Document 1]. That prior art uses a machine learning algorithm called a genetic algorithm, but it is limited to scheduling the work of machine tools. A technology that applies a neural network learning method to processes with multiple facilities has also been proposed [Patent Document 2]. However, that prior art finds the optimal control method for a given situation based on past data, and therefore has the clear limitation that it does not work without previously accumulated data.

To solve these problems, the present applicant has proposed a technology that simulates the processes using a factory simulator and trains the neural network of each process using the simulated data [Patent Document 3]. However, that prior art must run a large number of simulations on the factory simulator to cover all the cases needed for training. In particular, for a given course of the processes that precede the specific process being trained (the process corresponding to the neural network being trained), that process has many possible choices. The prior art therefore has the problem that the preceding processes must be simulated again for each of these cases.

[Patent Document 1] Korean Registered Patent Publication No. 10-1984460 (published May 30, 2019)
[Patent Document 2] Korean Registered Patent Publication No. 10-2035389 (published October 23, 2019)
[Patent Document 3] Korean Registered Patent Publication No. 10-2338304 (published December 13, 2021)

[Non-patent Document 1] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, p. 529, 2015.
[Non-patent Document 2] Eliyahu M. Goldratt, The Goal: A Process of Ongoing Improvement, 1984.

The purpose of the present invention is to solve the problems described above by providing a factory simulator-based scheduling neural network learning system with a lot history reproduction function, which records the lot history immediately preceding a specific process from the simulated data generated by the simulator, generates a variety of cases from the recorded lot history, and trains the neural network of that process.

To achieve this purpose, the present invention relates to a factory simulator-based scheduling neural network learning system with a lot history reproduction function that trains the scheduling neural network of each process of a factory workflow using the simulation results of a factory simulator that simulates the factory workflow, the system comprising: a simulation execution unit that simulates the factory workflow using the factory simulator and collects simulated data; a lot history recording unit that has the simulation execution unit simulate the processes up to a specific process (hereinafter, the first stage) and records the lot history information of the specific process; an episode reproduction unit that generates a plurality of second-stage production episodes from the lot history information and has the simulation execution unit simulate the processes after the specific process (hereinafter, the second stage) according to the generated production episodes; and a learning execution unit that generates training data from the second-stage simulation results and trains the scheduling neural network of the corresponding process.
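The division of labor among the four units can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToySimulator`, its two methods, the random arrival times, and the use of shuffled lot orders as stand-ins for production episodes (a real episode would also have to respect arrival times). The only point it demonstrates is that the first stage is simulated once while the second stage is simulated once per episode.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LotRecord:
    lot_id: int
    lot_type: str
    arrival_time: float  # time the lot becomes available at the specific process


class ToySimulator:
    """Hypothetical stand-in for the factory simulator (simulation execution unit)."""

    def __init__(self):
        self.first_stage_runs = 0
        self.second_stage_runs = 0

    def run_first_stage(self, n_lots, seed):
        """Simulate the processes before the specific process and return lot records."""
        self.first_stage_runs += 1
        rng = random.Random(seed)
        clock, records = 0.0, []
        for lot_id in range(n_lots):
            clock += rng.uniform(0.5, 2.0)
            records.append(LotRecord(lot_id, rng.choice(["A", "B"]), clock))
        return records

    def run_second_stage(self, episode):
        """Simulate one episode; here the result is just the number of type switches."""
        self.second_stage_runs += 1
        changes = sum(1 for a, b in zip(episode, episode[1:])
                      if a.lot_type != b.lot_type)
        return {"job_changes": changes}


def collect_training_data(sim, n_lots=6, n_episodes=10, seed=0):
    # Lot history recording unit: the first stage is simulated a single time.
    history = sim.run_first_stage(n_lots, seed)
    rng = random.Random(seed + 1)
    training_data = []
    # Episode reproduction unit: many second-stage episodes from one history.
    for _ in range(n_episodes):
        episode = list(history)
        rng.shuffle(episode)  # placeholder for a real episode generator
        result = sim.run_second_stage(episode)
        # A learning execution unit would consume these pairs.
        training_data.append((episode, result))
    return history, training_data


sim = ToySimulator()
history, data = collect_training_data(sim)
print(sim.first_stage_runs, sim.second_stage_runs)
```

Without the recorded history, each of the ten episodes would need its own first-stage run, which is the cost the invention aims to avoid.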

In addition, in the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, the scheduling neural network of each process is configured to receive the state of the factory workflow as input and to output the job type to be processed next in that state.

In addition, in the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, the lot history information includes the time at which each lot arrives at the corresponding process and becomes available for input (hereinafter, the arrival time).

In addition, in the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, each process can perform jobs of multiple job types, and switching from one job type to another requires a predetermined job change time.

In addition, in the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, the episode reproduction unit generates production episodes according to which lot, among the lots in the lot history information that have arrived or are about to arrive, is selected as the next job.
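One way to read "lots that have arrived or are about to arrive" is sketched below in Python. The lookahead window, the fixed processing time, the rule of idling until the earliest lot when nothing is eligible, and all function names are assumptions for illustration; the source does not specify how "about to arrive" is bounded.

```python
from typing import List, Tuple

Lot = Tuple[str, float]  # (lot_id, arrival_time) taken from the lot history


def candidates(pending: List[Lot], now: float, lookahead: float) -> List[Lot]:
    """Lots that have already arrived, or will arrive within `lookahead`."""
    return [lot for lot in pending if lot[1] <= now + lookahead]


def enumerate_episodes(pending: List[Lot], now: float = 0.0,
                       lookahead: float = 1.0, proc_time: float = 1.0):
    """Enumerate every lot order reachable by repeatedly choosing a candidate."""
    if not pending:
        return [[]]
    options = candidates(pending, now, lookahead)
    if not options:
        # Nothing is eligible yet: wait for the earliest pending lot.
        options = [min(pending, key=lambda lot: lot[1])]
    episodes = []
    for lot in options:
        start = max(now, lot[1])  # cannot start before the lot arrives
        rest = [p for p in pending if p != lot]
        for tail in enumerate_episodes(rest, start + proc_time,
                                       lookahead, proc_time):
            episodes.append([lot[0]] + tail)
    return episodes


history = [("L1", 0.0), ("L2", 0.5), ("L3", 5.0)]
episodes = enumerate_episodes(history)
print(episodes)
```

With this history, L1 and L2 compete for the first slot while L3 arrives too late to be chosen early, so two episodes result from a single recorded history.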

In addition, in the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, the scheduling neural network of each process is a neural network trained by reinforcement learning.

As described above, with the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to the present invention, the lot history immediately preceding a specific process is recorded from the simulated data and used to train the neural network of that process. Training data can therefore be generated by producing a variety of cases directly from the recorded lot history, and because the simulation up to the preceding process does not have to be repeated, training proceeds quickly.

Figure 1 is an exemplary diagram showing a model of a factory workflow according to an embodiment of the present invention.
Figure 2 is a block diagram of the configuration of a process according to an embodiment of the present invention.
Figure 3 is a diagram illustrating the equipment configuration of a process according to an embodiment of the present invention.
Figure 4 is an example table showing job change times according to an embodiment of the present invention.
Figure 5 is a configuration diagram of the overall system for implementing the present invention.
Figure 6 is a diagram of the basic operating structure of the reinforcement learning used in the present invention.
Figure 7 is a table illustrating the state of a factory workflow according to an embodiment of the present invention.
Figure 8 is a block diagram of the configuration of a factory simulator-based scheduling system according to an embodiment of the present invention.
Figure 9 is an exemplary diagram showing the division into stages around a specific process according to an embodiment of the present invention.
Figure 10 is a table illustrating lot history information according to an embodiment of the present invention.

Hereinafter, specific details for implementing the present invention are described with reference to the drawings.

In describing the present invention, identical parts are given identical reference numerals, and their description is not repeated.

First, the configuration of the factory workflow model used in the present invention is described with reference to Figures 1 to 5.

As shown in Figure 1, a factory workflow consists of a series of processes, each connected to another process. Connected processes have a precedence relationship. If each process is regarded as a node, the entire factory workflow forms a directed graph. For convenience, the terms process and process node are used interchangeably below.

In the example of Figure 1, the factory workflow consists of processes P0, P1, P2, ..., P4, starting with process P0 and ending with process P4. When process P0 is completed, the next process P1 starts, and when process P1 is completed, the next process P2 starts.
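The directed-graph view of the workflow can be sketched in a few lines of Python using the standard library's `graphlib`. The strictly linear chain P0 to P4 is an assumption read off the example sentences; Figure 1 itself may contain branches.

```python
from graphlib import TopologicalSorter

# Each process maps to the processes that must finish before it can start.
# A linear chain is assumed here for illustration.
PREDECESSORS = {
    "P0": [],
    "P1": ["P0"],
    "P2": ["P1"],
    "P3": ["P2"],
    "P4": ["P3"],
}


def processing_order(predecessors):
    """Return the processes in an order that respects every precedence edge."""
    return list(TopologicalSorter(predecessors).static_order())


order = processing_order(PREDECESSORS)
print(order)
```

A lot visits the processes in this order, while different lots may occupy different nodes of the graph at the same time.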

A lot is a specific workpiece that passes through each process P0, P1, P2, ..., P4 of the factory workflow to become a finished product. For a given workpiece (lot), once process P0 is completed, the next process P1 can receive the lot and start. That is, when a lot that has been processed in process P0 is handed to process P1, process P1 takes over the lot and continues the work. When the entire series of processes in the factory workflow has been carried out in this way, the product of that lot is complete.

A factory workflow does not necessarily produce only one kind of product (product type or product family); multiple types of products can be processed and produced. Each lot therefore has a type determined by its product type. For example, if lot 1 is of type 'product A', the final product of lot 1 is product A. Likewise, if lot 2 is of type 'product B', the final product of lot 2 is product B.

Preferably, the factory workflow may differ for each product family.

The processes can also run concurrently. For example, while process P4 is processing lot 8 (product B), process P0 may be processing lot 1 (product A) and process P1 may be processing lot 2 (product B) at the same time.

Next, the job types of a process are described.

A process can selectively perform multiple job types. The process performs one of the job types that can be applied to the input lot. A lot (hereinafter, the input lot) is fed into the process, and as the job of the process is performed, the processed lot (hereinafter, the completed lot) is output. That is, a lot whose job in process Pn is complete becomes an input lot available to the next process Pn+1.

In the example of Figure 2, process Pn has multiple job types: job type n-1, job type n-2, ..., job type n-M. Process Pn selects and performs one of the M jobs, chosen according to the circumstances or requests at that time. In particular, when the process has a scheduling neural network, the neural network of the process preferably selects one job and schedules it.

As an example, process Pn (n = 0, 1, 2, 3, ...) denotes a job that collectively covers several job types (e.g., "assembly"), and the job type depends on the type of the input lot (e.g., "leg assembly" if the input lot type is a chair, "top assembly" if it is a table, "drawer assembly" if it is a chest of drawers).

Next, the equipment of a process is described.

Process Pn may have multiple pieces of equipment, and for each job type one specific piece of equipment is designated for an input lot. The relationships that can be designated are shown in Figure 3. That is, the job types and lot types that can be handled differ from one piece of equipment to another, and each piece of equipment may handle one or several job types.

In the example of Figure 3, process P0 has three job types, and job types P0-1, P0-2, and P0-3 are jobs for processing product A, product B, and product C, respectively. Two pieces of equipment, equipment 1 and equipment 2, can perform job type P0-1 (product A). Job types P0-2 and P0-3 are performed by equipment 2 and equipment 1, respectively.
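The relationships described for Figure 3 fit naturally into two lookup tables. The Python sketch below uses only the values stated in the text; the identifiers (`PRODUCT_OF`, `EQUIPMENT_FOR`, and the two helper functions) are hypothetical names chosen for illustration.

```python
# Job type -> product it processes, as stated for Figure 3.
PRODUCT_OF = {"P0-1": "A", "P0-2": "B", "P0-3": "C"}

# Job type -> equipment able to run it, as stated for Figure 3.
EQUIPMENT_FOR = {
    "P0-1": ["equipment1", "equipment2"],
    "P0-2": ["equipment2"],
    "P0-3": ["equipment1"],
}


def job_types_on(equipment):
    """Invert the mapping: which job types can a given piece of equipment run?"""
    return sorted(job for job, eqs in EQUIPMENT_FOR.items() if equipment in eqs)


def equipment_for_product(product):
    """Equipment that can process lots of the given product type."""
    return sorted({eq
                   for job, eqs in EQUIPMENT_FOR.items()
                   if PRODUCT_OF[job] == product
                   for eq in eqs})


print(job_types_on("equipment1"))
print(equipment_for_product("A"))
```

Inverting the table shows that each piece of equipment covers two job types here, which is why its current setup matters when the next lot is chosen.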

Several lots may be present in process Pn, and each lot is either currently being processed or waiting for its job to start (a waiting lot). A lot is in the waiting state when the equipment in the process is already processing another lot, or when the equipment that can process the lot is not yet ready.

Next, job changes in a process are described.

In process Pn, when the job type of the next lot on a piece of equipment differs from that of the previous lot, preparatory work is performed and the equipment settings are changed so that the equipment can handle the job type of the next lot. This is referred to as a job change. A job change takes a certain amount of time.

For example, a job change may involve changing oil, replacing the equipment's operating program, replacing tools in the equipment, or preheating or cooling the equipment.

The time required for job changes is shown in Figure 4. In the example of Figure 4, for equipment that has been producing lots of job type P0-1 to produce a lot of job type P0-3, a job change taking 12 hours must be performed. That is, the lot of the next job type P0-3 can be processed only after the 12-hour job change.

During a job change, lots cannot be produced (processed) even if lots are available for input to the equipment. Frequent job changes therefore consume a great deal of time and lower the equipment utilization rate.
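The effect of job changes on utilization can be made concrete with a small Python sketch. Only the 12-hour entry for P0-1 to P0-3 comes from the text; the reverse entry, the 4-hour processing time per lot, and the function names are hypothetical, and idle time spent waiting for lots is ignored.

```python
# (previous job type, next job type) -> job change time in hours.
JOB_CHANGE_HOURS = {
    ("P0-1", "P0-3"): 12.0,  # the one value quoted from Figure 4
    ("P0-3", "P0-1"): 8.0,   # hypothetical, for illustration only
}


def job_change_hours(prev_type, next_type):
    """No job change is needed for the first lot or for a repeated job type."""
    if prev_type is None or prev_type == next_type:
        return 0.0
    return JOB_CHANGE_HOURS[(prev_type, next_type)]


def utilization(sequence, hours_per_lot=4.0):
    """Share of elapsed time spent processing lots rather than changing jobs."""
    busy = hours_per_lot * len(sequence)
    change = sum(job_change_hours(prev, nxt)
                 for prev, nxt in zip([None] + sequence[:-1], sequence))
    return busy / (busy + change)


grouped = ["P0-1", "P0-1", "P0-3", "P0-3"]      # one job change
alternating = ["P0-1", "P0-3", "P0-1", "P0-3"]  # three job changes
print(utilization(grouped), utilization(alternating))
```

Both sequences process the same four lots, but grouping lots of the same job type keeps the equipment busy for a larger share of the elapsed time.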

Process Pn can also continue with a single job type, producing only one lot type, without any job change. In that case, waiting lots of other types never become completed lots. Therefore, when the customer requires several lot types, an appropriate job change point must be determined and the job change performed.

Next, the configuration of the overall system for implementing the present invention is described with reference to Figure 5.

As shown in Figure 5, the overall system for implementing the present invention consists of a neural network agent (10) comprising neural networks (11), a factory simulator (20) that simulates the factory workflow, and a learning system (30) that trains the neural network agent (10). It may further include a database (40) that stores training data and the like.

First, the neural network agent (10) consists of at least one neural network (11) that receives the workflow state (or lot state) as input and outputs the next job (or action) of a specific process.

In particular, one neural network (11) is configured to determine the next job (or job type) for one process. That is, it preferably selects one of the jobs (or job types) that the process can perform next. As an example, the output layer of the neural network (11) consists of one node per job type, each node outputs a probability value, and the job corresponding to the node with the largest probability value is selected as the next job.
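The selection rule at the output layer can be sketched as follows. The softmax normalization is an assumption (the text only says that each node outputs a probability value), and the raw scores and function names are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw output-node scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_next_job(scores, job_types):
    """Pick the job type whose output node has the largest probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return job_types[best], probs


job, probs = select_next_job([0.1, 2.0, -1.0], ["P0-1", "P0-2", "P0-3"])
print(job)
```

In practice the scores would come from the trained network's final layer, and job types that are not currently feasible would likely need to be masked out before taking the maximum.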

That is, the scheduling neural network (11) of each process makes the decision of choosing one "next lot to work on"; it can either accept the job change time and select a different job type, or select another lot of the same job type.

To determine the next job for multiple processes, a separate neural network (11) can be configured for each process. In the example of Figure 1, with five processes, a neural network (11) can be configured for each process, five in total. However, no neural network is configured for simple processes or processes that require no selection, such as a process that has only one job to choose from.

The neural network and its optimization use a conventional reinforcement learning-based neural network method such as DQN (Deep Q-Network) [Non-patent Document 1].

The neural network agent (10) receives the workflow state (S_t), the job in that state (a_t), the workflow state after that job is performed (S_{t+1}), and the reward for the job in that state (r_t), and optimizes the parameters of the neural network (11) of the corresponding process.
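The four quantities the agent receives correspond to the transition tuple used in DQN-style training. The Python sketch below shows a replay buffer of such tuples and the one-step Q-learning target from the cited DQN paper; the buffer capacity, the discount factor `gamma`, and all class and function names are assumptions, and terminal states are not handled.

```python
import random
from collections import deque, namedtuple

# One training sample: (S_t, a_t, S_{t+1}, r_t).
Transition = namedtuple("Transition", ["state", "action", "next_state", "reward"])


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity=10_000):
        self._items = deque(maxlen=capacity)

    def push(self, state, action, next_state, reward):
        self._items.append(Transition(state, action, next_state, reward))

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self._items), batch_size)

    def __len__(self):
        return len(self._items)


def td_target(reward, next_q_values, gamma=0.99):
    """One-step Q-learning target: r_t + gamma * max_a Q(S_{t+1}, a)."""
    return reward + gamma * max(next_q_values)


buffer = ReplayBuffer(capacity=2)
buffer.push([1, 0], "P0-1", [0, 1], 0.5)
buffer.push([0, 1], "P0-3", [0, 0], 1.0)
buffer.push([0, 0], "P0-1", [1, 0], 0.2)  # evicts the oldest transition
print(len(buffer), td_target(1.0, [0.5, 2.0], gamma=0.9))
```

The network parameters would then be adjusted so that the predicted value of (S_t, a_t) moves toward this target.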

Once the neural network (11) has been optimized (trained), the neural network agent (10) applies the workflow state (S_t) to the optimized neural network (11) to output the next job (a_t).

The workflow state (S_t) represents the state of the workflow (or factory) at time t. Preferably, the workflow state consists of the state of each process in the workflow and the factory state, which applies to the factory as a whole. For example, the workflow state (S_t) consists of information representing the state of each process or of the factory, such as the distribution within the factory of lots of the product type under decision and the product type currently being worked on by the production equipment for which the decision is to be made.
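As a hedged illustration of the two example state components named above, the following Python sketch builds a flat feature vector from the per-process distribution of lots of the target product type and a one-hot encoding of the equipment's current product type. The vector layout, the function name, and the sample data are all assumptions; the actual state layout is the one shown in Figure 7.

```python
def encode_state(process_order, lots, target_type, current_type, product_types):
    """Build a flat feature vector for one scheduling decision.

    lots: (process, product_type) pairs for the lots currently in the factory.
    """
    # Number of lots of the target product type located at each process.
    distribution = [
        sum(1 for proc, ptype in lots if proc == p and ptype == target_type)
        for p in process_order
    ]
    # One-hot encoding of the product type the equipment is currently set up for.
    setup = [1 if ptype == current_type else 0 for ptype in product_types]
    return distribution + setup


state = encode_state(
    process_order=["P0", "P1", "P2"],
    lots=[("P0", "A"), ("P0", "B"), ("P1", "A"), ("P0", "A")],
    target_type="A",
    current_type="B",
    product_types=["A", "B", "C"],
)
print(state)
```

Only quantities that change as the workflow runs appear in the vector; static facts such as the equipment layout are left out.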

또한, 바람직하게는, 워크플로우 상태는 워크플로우 내의 일부 공정의 상태들만 포함할 수 있다. 이때, 워크플로우 내에서 병목 현상을 유발하는 공정 등 핵심적인 공정들만을 대상으로, 해당 공정들의 상태들만 포함할 수 있다. 또한, 워크플로우 상태는 워크플로우의 과정에서 변화되는 요소를 대상으로 설정된다. 즉, 워크플로우가 진행되어도 변하지 않는 구성요소는 상태로 설정되지 않는다.Additionally, preferably, the workflow state may include only the states of some processes within the workflow. At this time, only core processes, such as those that cause bottlenecks within the workflow, can be targeted, and only the states of those processes can be included. Additionally, the workflow state is set to elements that change during the workflow. In other words, components that do not change as the workflow progresses are not set to status.

각 공정의 상태(또는 공정 상태)는 투입 로트, 각 공정 장비의 상태 등으로 구성된다. 또한, 공정 상태는 제품의 생산 목표량, 달성된 현황 등 전체 공정에서의 상태를 나타낸다.The status of each process (or process status) consists of the input lot, the status of each process equipment, etc. In addition, the process status indicates the status of the entire process, such as the product production target and achieved status.

한편, 위와 같이, 상태는 전체 워크플로우 상태로 설정하고, 행위는 해당 공정에서의 작업으로 설정하고 있다. 즉, 공정 상태는 전체 워크플로우 내에 있는 로트(Lot)들의 배치상태, 장비상태들을 포함하나, 행위(또는 작업)는 특정 공정 노드(Node)에 국한된다. 작업 또는 행위 a_p,t는 공정 p의 시간 t에서의 행위 또는 작업으로서, 의사결정을 수행하는 작업, 즉, 선택한 제품 유형 등을 나타낸다.Meanwhile, as above, the status is set to the overall workflow state, and the action is set to the task in the corresponding process. In other words, the process state includes the placement status and equipment status of lots within the entire workflow, but the action (or task) is limited to a specific process node. A task or action a _p,t is an action or task at time t in process p, and represents the task of making a decision, i.e. the type of product selected, etc.

즉, 공정 상태는 전체 워크플로우 내에 있는 로트(Lot)들의 배치상태, 장비상태들을 포함하나, 행위(또는 작업)는 특정 공정 노드(Node)에 국한된다. 공장에서는 가장 생산능력의 병목이 되거나, 의사결정이 필요한 특정 공정 노드(Node)를 최적 스케줄링 할 경우, 연계된 전후 공정 노드(Node)의 문제는 개의치 않겠다는 제약이론(TOC, Theory of Constraint)[비특허문헌 2]이 전제된다. 이는 마치 신호등이나 교차로, 인터체인지와 같은 주요 관리 포인트에서 주요 의사결정을 진행하되, 이를 위해서 연결된 모든 전후 도로들의 트래픽 상황을 상태 (State)로 반영해야 하는 것과 같다.In other words, the process state includes the placement status and equipment status of lots within the entire workflow, but the action (or task) is limited to a specific process node. In factories, the Theory of Constraint (TOC) states that when optimally scheduling a specific process node that is the bottleneck of production capacity or requires decision-making, the problems of related process nodes before and after are not concerned. Non-patent Document 2] is assumed. This is like making major decisions at major management points such as traffic lights, intersections, and interchanges, but for this purpose, the traffic situation on all connected roads before and after must be reflected as the state.

The reward (r_t) is the quantity that the factory seeks to optimize through decision making. For example, the reward may consist of the equipment utilization rate (= time the equipment spent working / total time), the delivery satisfaction rate (= output completed within each product's target completion time / total output to be completed), and so on. These factors can be combined in a weighted sum to form the reward value.
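
The weighted sum described above can be sketched as follows. This is a minimal illustration only: the function name, the argument names, and the equal default weights are assumptions, since the source does not specify how the factors are weighted.

```python
def weighted_reward(busy_time, total_time, on_time_output, total_output,
                    w_utilization=0.5, w_delivery=0.5):
    """Combine two KPIs into one reward value (weights are assumed, not from the source)."""
    # Equipment utilization rate = time the equipment worked / total time
    utilization = busy_time / total_time
    # Delivery satisfaction rate = output finished on time / total output due
    delivery = on_time_output / total_output
    return w_utilization * utilization + w_delivery * delivery


# Hypothetical figures: 45 of 60 time units busy, 120 of 150 units on time.
r = weighted_reward(busy_time=45.0, total_time=60.0,
                    on_time_output=120, total_output=150)
```

Changing the weights shifts the emphasis between keeping equipment busy and meeting delivery targets.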

Next, the factory simulator 20 is a conventional simulator that simulates the factory workflow.

The factory workflow uses the workflow model shown in FIG. 1. That is, the factory workflow model in the simulation is a directed graph made up of multiple nodes, each representing a process. Each process model in the simulation, however, is modeled on the actual equipment configuration on site.

That is, as shown in FIG. 2, the process model takes the equipment configuration and processing capability as its modeling variables: the lots (LOT) input to the process, the lots completed by the process, the multiple work types, the equipment for each work type, the processing speed for each work type, and the job change time needed to switch from the previous work type to a new one.
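
As a rough illustration, the modeling variables listed above could be held in a small data structure like the one below. The class name, the field names, and the use of a per-lot processing time (instead of a processing speed) are assumptions made for this sketch and do not come from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessModel:
    """Hypothetical container for the modeling variables of one process."""
    name: str
    processing_time: dict          # work type -> time to process one lot (assumed unit)
    job_change_time: float         # time to switch from one work type to another
    current_type: Optional[str] = None  # work type the equipment is set up for

    def time_for(self, work_type: str) -> float:
        """Time to process one lot, including a job change if the type differs."""
        if work_type not in self.processing_time:
            raise ValueError(f"unknown work type: {work_type}")
        needs_change = (self.current_type is not None
                        and self.current_type != work_type)
        change = self.job_change_time if needs_change else 0.0
        return change + self.processing_time[work_type]


# Hypothetical process P4, currently set up for product B.
p4 = ProcessModel("P4", {"A": 3.0, "B": 3.0, "C": 4.0},
                  job_change_time=5.0, current_type="B")
```

Here `p4.time_for("B")` involves no job change, while `p4.time_for("A")` adds the job change time to the processing time.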

The factory simulator described above employs conventional simulation technology, so a more detailed description is omitted.

Next, the learning system 30 runs simulations using the factory simulator 20, extracts learning data from the simulation results, and uses the extracted learning data to train the neural network agent 10, that is, the neural network 11 of each process.

That is, the learning system 30 simulates multiple production episodes with the factory simulator 20. A production episode is the entire process of producing the final products (or lots). Each production episode follows a different processing sequence.

For example, running one simulation that produces 100 red ballpoint pens and 50 blue ballpoint pens is one production episode. The detailed processing carried out within the factory workflow may differ from run to run, and each different processing sequence is another production episode. For example, processing a lot of red ballpoint pens in process 2 in a given state and processing a lot of blue ballpoint pens in that same state are different production episodes.

Preferably, the learning system 30 learns using reinforcement learning as shown in FIG. 6 [Non-patent Document 1]. That is, the reward (r_t) in each process state (or lot state) (S_t) is calculated by the reinforcement learning method. The reward (r_t) in each state (S_t) is calculated from the final result (or final performance) of the production episode. The final result is computed from the main KPIs (Key Performance Index) used in factory management, such as the operating efficiency of the production equipment of the process or of the entire workflow, the turn-around time (TAT) of the work, and the production target achievement rate.

In addition, transitions can be extracted by extracting the state (S_t), task (a_{p,t}), and reward (r_t) in time order from a production episode. A transition consists of the current state (S_t), the task (a_{p,t}), the next state (S_{t+1}), and the reward (r_t). It means that when task a_{p,t} of a specific process is performed in the current state S_t, the workflow moves to the next state S_{t+1} and obtains the reward r_t. The reward r_t here is the value of the current state S_t when task a_{p,t} is performed.
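
The extraction of transitions from a time-ordered episode can be sketched as follows. The `Transition` type and the `extract_transitions` function are hypothetical names used only for illustration; the states are placeholders, since the source does not fix a state encoding.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    """One (S_t, a_{p,t}, S_{t+1}, r_t) tuple."""
    state: object
    action: object
    next_state: object
    reward: float


def extract_transitions(states, actions, rewards):
    """Turn a time-ordered episode into a list of transitions.

    Expects one more state than actions, because the last state has no
    following action in the recorded episode.
    """
    if len(states) != len(actions) + 1 or len(actions) != len(rewards):
        raise ValueError("need len(states) == len(actions) + 1 == len(rewards) + 1")
    return [Transition(states[t], actions[t], states[t + 1], rewards[t])
            for t in range(len(actions))]


# Placeholder episode with three states and two decisions.
transitions = extract_transitions(["s0", "s1", "s2"], ["A", "B"], [0.1, 0.9])
```

A single episode with n decisions therefore yields n transitions.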

As described above, the learning system 30 runs the simulator 20 according to a production episode and extracts transitions from the simulation result to build the learning data. Multiple transitions are extracted even from a single episode. Preferably, simulations are run for many episodes and a large number of transitions are extracted from them.

In addition, the learning system 30 records, from the data simulated by the simulator 20, the history of the lot states immediately before a specific process, generates multiple production episodes from the recorded lot state history, simulates each production episode, and extracts learning data (or transitions) from the simulation results.

The learning system 30 then trains the neural network agent 10 with the extracted learning data (or transitions).

As one example, the learning data (or transitions) may be used for training sequentially in time order. Preferably, transitions are sampled at random from the full set of transitions, and the neural network agent 10 is trained on the sampled transitions.

In addition, when the neural network agent 10 comprises multiple neural networks, each neural network is trained with the learning data (or transition data) of its corresponding process.
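
Random sampling of transitions, kept separately for each process, might look like the sketch below. The `PerProcessBuffer` class and its methods are hypothetical; the source states only that transitions are sampled at random and that each neural network is trained on the data of its own process.

```python
import random
from collections import defaultdict


class PerProcessBuffer:
    """Hypothetical store that keeps transitions separately for each process."""

    def __init__(self, seed=None):
        self._data = defaultdict(list)
        self._rng = random.Random(seed)  # seeded only to make runs repeatable

    def add(self, process, transition):
        self._data[process].append(transition)

    def sample(self, process, batch_size):
        """Draw a random batch (without replacement) for one process."""
        pool = self._data[process]
        return self._rng.sample(pool, min(batch_size, len(pool)))


buffer = PerProcessBuffer(seed=0)
for i in range(10):
    buffer.add("P4", ("P4", i))   # placeholder transitions
for i in range(2):
    buffer.add("P2", ("P2", i))

batch_p4 = buffer.sample("P4", 4)  # used to train the network of process P4
batch_p2 = buffer.sample("P2", 4)  # only two transitions exist, so two are returned
```

Sampling at random instead of in time order reduces the correlation between consecutive training samples, which is a common reason for this choice in reinforcement learning.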

Next, the database 40 consists of a learning data DB 41, which stores the learning data for training the neural network agent 10, a lot history DB 42, which stores lot histories, and so on. This configuration of the database 40 is only one preferred embodiment; in developing a specific device, it may be structured differently according to database design theory, taking the ease and efficiency of access and search into account.

Preferably, the learning data consists of multiple transitions, and the transition data may be organized by process. As described above, the learning system 30 can collect a large and varied set of transition data by simulating multiple episodes with the simulator 20, or by generating multiple episodes from a lot history and simulating them.

Next, an example in which the factory workflow is simulated by the simulator 20 according to an embodiment of the present invention is described with reference to FIG. 7.

FIG. 7 shows the state information of the factory workflow obtained by simulation with the simulator 20. The state information of the factory workflow changes over time. The state of the factory workflow at a given point in time consists of the location and state of each lot.

In particular, FIG. 7 shows the factory state (or the state of each process) at one point in time for the factory workflow of FIG. 1. As shown in FIG. 7, the overall factory state consists of, for each process, the lot type, the lot number (lot identification information), the lot status, the elapsed work time, and so on. The lot type indicates which product (or product type) the lot is for. The lot number identifies the lot. The lot status indicates whether the lot is waiting or being worked on. The elapsed work time is the time since the work started; for a lot being worked on, it shows how long the work has taken so far.

For the state above, the process of generating production episodes for training the neural network of process P4 is now described. The neural network of process P4 makes the decision of selecting the next work item in process P4. The preceding processes P0 to P3 of the workflow either require no separate decision or already have an artificial neural network capable of making it.

Process P4 is currently working on lot 9 (product B), and different production episodes can be generated depending on which work item P4 selects after lot 9. In particular, the following three production episodes can preferably be generated.

[Production Episode 1] Lot 7

[Production Episode 2] Lot 8

[Production Episode 3] Lot 6

That is, production episodes 1 and 2 select the immediately available lot 7 and lot 8, respectively. Production episode 3 keeps the equipment of process P4 idle and waits for lot 6 to arrive.

In production episodes 1 and 2, that is, when process P4 selects LOT 7 or LOT 8, the work type changes, so a job change (JOB CHANGE) is triggered. Because process P4 switches from product B to product A or product C, a changeover of the production equipment occurs.

Production episode 3 is the case of waiting for some time without selecting LOT 7 or LOT 8. Because lot 6 is of product type B, no job change (JOB CHANGE) occurs. The episode can therefore process LOT 6, LOT 4, and LOT 2, which are the same product, in succession.

Production episode 3 is therefore expected to be more efficient and more productive than production episodes 1 and 2. If process P4 selects LOT 7 as the next work item, a job change occurs as soon as it is selected. Accordingly, once the simulation of process P3 yields the time at which LOT 6 arrives at process P4, learning can determine whether selecting LOT 7 was the right decision or whether keeping the equipment idle and then selecting LOT 6 would have been right.
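
The trade-off between selecting an available lot and waiting for lot 6 can be illustrated with a small calculation. Everything numeric here is an assumption made for the sketch: the arrival times, the processing time of 3, the job change time of 5, and the assignment of product A to lot 7 and product C to lot 8 (the source says only that the change is to product A or product C). The `makespan` function is a hypothetical single-machine model in which a job change starts only after the lot has arrived.

```python
def makespan(order, arrival, lot_type, proc_time, change_time, start_type):
    """Completion time of the last lot when lots are processed in the given order."""
    t, current = 0.0, start_type
    for lot in order:
        t = max(t, arrival[lot])        # wait idle until the lot has arrived
        if lot_type[lot] != current:    # a different product forces a job change
            t += change_time
            current = lot_type[lot]
        t += proc_time
    return t


lot_type = {7: "A", 8: "C", 6: "B", 4: "B", 2: "B"}   # lots 7 and 8: assumed types

# Scenario 1: the product B lots arrive soon (assumed times).
arrival = {7: 0.0, 8: 0.0, 6: 2.0, 4: 4.0, 2: 6.0}
select_first = makespan([7, 8, 6, 4, 2], arrival, lot_type, 3.0, 5.0, "B")
wait_first = makespan([6, 4, 2, 7, 8], arrival, lot_type, 3.0, 5.0, "B")

# Scenario 2: the product B lots arrive late (assumed times).
late = {7: 0.0, 8: 0.0, 6: 10.0, 4: 10.0, 2: 10.0}
select_first_late = makespan([7, 8, 6, 4, 2], late, lot_type, 3.0, 5.0, "B")
wait_first_late = makespan([6, 4, 2, 7, 8], late, lot_type, 3.0, 5.0, "B")
```

Under these assumed numbers, waiting finishes earlier in the first scenario because it saves one job change, but finishes later in the second scenario because the idle time outweighs the saving. This dependence on the arrival times is why the choice is left to learning.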

However, process P4 cannot know in advance when LOT 6, LOT 4, and LOT 2 will enter process P4 (that is, when they become available to process P4). The completion time of each lot in processes P0 to P3, and the time at which it becomes workable in process P4, can be known only after the simulation of processes P0 to P3 has been completed.

Moreover, in this learning process there are many possible production episodes from the standpoint of process P4, and simulating all of them requires many runs of the "simulation of processes P0 to P4 of the factory workflow", which takes excessive time. In the example above, simulating all three production episodes requires simulating the identical course of processes P0 to P3 three times. When the workflow is complex, the same partial course must be simulated again for each of a very large number of episodes.

Next, the configuration of the factory simulator-based scheduling neural network learning system with a lot history reproduction function according to an embodiment of the present invention is described with reference to FIG. 8.

As shown in FIG. 8, the factory simulator-based scheduling neural network learning system 30 with a lot history reproduction function according to an embodiment of the present invention comprises a simulation execution unit 31 that runs the simulator 20 and collects simulated data, a lot history recording unit 32 that records the history of the lot states of a specific process, an episode reproduction unit 33 that generates multiple episodes for the specific process from the lot history and has them simulated, and a learning execution unit 34 that generates learning data from the simulation results and trains the neural network of the specific process.

First, the simulation execution unit 31 runs the simulator 20 and collects simulated data. The state information of the factory workflow can be collected as the result of the simulation. The state information of the factory workflow changes over time. The state of the factory workflow at a given point in time consists of the location and state of each lot.

In particular, the simulation execution unit 31 can run the simulator 20 to collect the factory state (or the state of each process) for the factory workflow, as shown in FIG. 7.

Next, the lot history recording unit 32 has the processes preceding a specific process simulated through the simulation execution unit 31 and records the history of the lot states of the specific process.

As shown in FIG. 9, the processes preceding a specific process are called the first stage, and the processes after the specific process are called the second stage. In the example of FIG. 9, the first stage consists of processes P0, P1, P2, and P3, and the second stage consists of process P4. If further processes such as P5 exist after process P4, the second stage also includes them.

That is, the lot history recording unit 32 has the processes of the first stage (the processes preceding the specific process) simulated through the simulation execution unit 31 and records the history of the input lots of the specific process.

An example of the history information for the input lots of a specific process is shown in FIG. 10.

As shown in FIG. 10, the history information of the input lots of a specific process consists of the times at which the lots in each process of the first stage arrive at the specific process. That is, the lot history information consists of the lot identification information (lot number), the lot type of the lot, the arrival time at the process (the time from which the lot can be input), and so on.
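
A lot history record with the three fields named above could be represented as in the sketch below. The `LotArrival` class, the `record_lot_history` function, and the example arrival times are hypothetical; the actual layout of FIG. 10 is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LotArrival:
    """One row of the lot history: which lot arrives, of what type, and when."""
    lot_number: int
    lot_type: str
    arrival_time: float   # time from which the lot can be input to the process


def record_lot_history(first_stage_events):
    """Build the lot history from first-stage results, ordered by arrival time.

    first_stage_events: iterable of (lot_number, lot_type, arrival_time) tuples,
    assumed here to be what the first-stage simulation reports.
    """
    rows = (LotArrival(*event) for event in first_stage_events)
    return sorted(rows, key=lambda row: row.arrival_time)


# Assumed arrival times for the product B lots of the example.
history = record_lot_history([(4, "B", 4.0), (6, "B", 2.0), (2, "B", 6.0)])
```

Ordering the records by arrival time makes it straightforward to replay them later in the order in which the lots become available.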

Preferably, the lot history recording unit 32 also stores the simulation results of the first stage.

Next, the episode reproduction unit 33 generates production episodes for the second stage (the processes after the specific process) from the lot history information and has a simulation performed for each production episode.

Specifically, the episode reproduction unit 33 uses the lot history information to simulate the arrival of input lots at the second stage (or the specific process), generates various production episodes according to the arrived input lots, and has the processes of the second stage (the processes after the specific process) simulated through the simulation execution unit 31 according to each production episode.

That is, the episode reproduction unit 33 reproduces the arrival time of each lot in the lot history information to simulate the arrival of input lots at the specific process. The episode reproduction unit 33 then generates multiple production episodes according to the arrival of the input lots and has the processes after the specific process simulated through the simulation execution unit 31 according to each production episode. The episode reproduction unit 33 also collects the simulation results of the second stage.
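
Generating second-stage production episodes from the recorded arrival times might be sketched as follows. The `candidate_episodes` function, the `horizon` parameter that decides how far ahead an arrival still counts as a candidate, and the arrival times are assumptions for this illustration; lots that have already been processed are not tracked in this simplified version.

```python
def candidate_episodes(history, now, horizon):
    """List one candidate episode per lot that has arrived or will arrive soon.

    history: list of (lot_number, lot_type, arrival_time) from the lot history.
    now:     current decision time at the specific process.
    horizon: how long the process is allowed to wait for a lot (assumed parameter).
    """
    episodes = []
    for lot_number, lot_type, arrival_time in history:
        if arrival_time <= now + horizon:
            wait = max(0.0, arrival_time - now)   # idle time before the lot can start
            episodes.append({"next_lot": lot_number,
                             "lot_type": lot_type,
                             "wait": wait})
    return episodes


# Assumed lot history for process P4 at the moment lot 9 finishes (time 0).
history = [(7, "A", 0.0), (8, "C", 0.0), (6, "B", 2.0), (4, "B", 4.0)]
episodes = candidate_episodes(history, now=0.0, horizon=3.0)
```

With these assumed values the result contains lots 7 and 8 (available immediately) and lot 6 (available after a short wait), matching the three production episodes of the example.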

When a simulation is run for neural network training, the episode reproduction unit 33 creates lots for the second stage at their recorded arrival times, reproducing the lot arrivals as if they were the result of simulating the first stage.

When training the artificial neural network for the decisions of process P4, no additional simulation of the first stage is needed. Lots are created at the required times according to a lot history record such as that of FIG. 10, and only the simulation of the second stage is repeated.
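
The structure of this reuse can be sketched as follows: the first stage is run once, and its lot history is passed to every second-stage run. The function `collect_second_stage_results` and the two stand-in stage functions are hypothetical placeholders for the real simulator calls, which the source does not specify.

```python
def collect_second_stage_results(run_first_stage, run_second_stage, choices):
    """Run the first stage once, then replay its lot history for every choice."""
    history = run_first_stage()                       # executed a single time
    return [run_second_stage(history, choice) for choice in choices]


# Stand-ins for the simulator, used only to count how often each stage runs.
calls = {"first": 0, "second": 0}


def fake_first_stage():
    calls["first"] += 1
    return [(7, "A", 0.0), (8, "C", 0.0), (6, "B", 2.0)]   # assumed lot history


def fake_second_stage(history, choice):
    calls["second"] += 1
    return {"first_lot": choice, "lots_replayed": len(history)}


results = collect_second_stage_results(fake_first_stage, fake_second_stage, [7, 8, 6])
```

For the three episodes of the example, the first stage runs once instead of three times, while the second stage still runs once per episode.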

The artificial neural network learning stage requires hundreds to tens of thousands of simulations. Because the entire first stage need not be simulated each time, the simulation run time is reduced and neural network learning completes sooner. In other words, the time required to train the artificial neural network to be installed in process P4, a downstream process of the workflow, can be greatly reduced.

Next, the learning execution unit 34 generates learning data from the collected simulation results and applies it to the neural network of the corresponding process to train it.

In particular, the learning execution unit 34 combines the production episode of the first stage with each production episode of the second stage to form the final production episodes, and applies the simulation results of each final production episode to the neural network of the corresponding process. In the example above, there is one first-stage production episode and three second-stage episodes. Combining the one first-stage production episode with the three second-stage episodes yields three final production episodes in total. The simulation results of the first and second stages are also the simulation results of the final production episodes.

The invention made by the present inventors has been described above in detail with reference to embodiments. The present invention is not limited to these embodiments, however, and can of course be modified in various ways without departing from its gist.

10: Neural network agent 11: Neural network
20: Factory simulator 30: Learning system
31: Simulation execution unit 32: Lot history recording unit
33: Episode reproduction unit 34: Learning execution unit
40: Database 41: Learning data DB
42: Lot history DB

Claims

A factory simulator-based scheduling neural network learning system with a lot history reproduction function, which trains the scheduling neural network of each process of a factory workflow using the simulation results of a factory simulator that simulates the factory workflow, the system comprising:
a simulation execution unit that simulates the factory workflow using the factory simulator and collects simulated data;
a lot history recording unit that has the processes preceding a specific process (hereinafter referred to as the first stage) simulated through the simulation execution unit and records lot history information of the specific process;
an episode reproduction unit that generates a plurality of second-stage production episodes using the lot history information and has the processes after the specific process (hereinafter referred to as the second stage) simulated through the simulation execution unit according to the generated production episodes; and
a learning execution unit that generates learning data from the simulation results of the second stage and trains the scheduling neural network of the corresponding process.

According to claim 1,
A factory simulator-based scheduling neural network learning system with a lot history reproduction function, wherein the scheduling neural network of each process is configured to receive the state of the factory workflow as input and to output the work type to be processed next in that state.

According to claim 2,
A factory simulator-based scheduling neural network learning system with a lot history reproduction function, wherein the lot history information includes the time when each lot arrives at the relevant process and becomes ready for input (hereinafter referred to as arrival time).

According to claim 3,
A factory simulator-based scheduling neural network learning system with a lot history reproduction function, wherein each of the processes can perform multiple work types, and a predetermined job change time is required to switch from one work type to another work type.

According to claim 4,
A factory simulator-based scheduling neural network learning system with a lot history reproduction function, wherein the episode reproduction unit generates a production episode by selecting a lot that has arrived or is about to arrive as a next task from the lot history information.

According to any one of claims 1 to 5,
A factory simulator-based scheduling neural network learning system with a lot history reproduction function, wherein the scheduling neural network for each process is a neural network based on reinforcement learning.