KR102026301B1

KR102026301B1 - Distributed parallel processing system for preventing data loss and method thereof

Info

Publication number: KR102026301B1
Application number: KR1020170184773A
Authority: KR
Inventors: 박성철; 박재환; 서승현
Original assignee: 주식회사 포스코아이씨티
Priority date: 2017-12-29
Filing date: 2017-12-29
Publication date: 2019-09-27
Also published as: KR20190081912A

Abstract

The present invention relates to a distributed parallel processing system and method having a data loss prevention function.
A distributed parallel processing system having a data loss prevention function according to one aspect of the present invention comprises: a first distributed parallel processing apparatus that receives first data corresponding to a first event from a data collection system, stores the first data in a big data storage system, and then transmits identification information of the first event; a second distributed parallel processing apparatus that receives second data corresponding to a second event from the data collection system, stores the second data in the big data storage system, and then transmits identification information of the second event; and a clustering management apparatus that, when an error occurs in either of the first and second distributed parallel processing apparatuses, instructs the remaining distributed parallel processing apparatus to process data with reference to the identification information of the first and second events.

Description

Distributed parallel processing system for preventing data loss and method thereof

The present invention relates to a distributed parallel processing system and method, and more particularly to a distributed parallel processing system and method having a data loss prevention function that, in a smart factory implementation, processes the data collected in real time from each process without loss and stores it in a big data store.

A production method in which a plurality of processes for producing a finished product from raw materials are performed in succession and the processes are interrelated, for example because the outputs of the processes are mixed with one another or the output of a particular process changes state before being supplied to a subsequent process, is called a continuous process production method. Steel, energy, paper, and oil refining are representative industries to which the continuous process production method is applied.

Unlike industries that use an assembly process production method (e.g., automobiles), in which many parts are made into semi-finished products and then completed as a single product, industries that use the continuous process production method move raw materials or intermediate materials at high speed, so the data collection cycle is short and the volume of data is large. In addition, because products are made in a factory environment with a great deal of noise, dust, and moisture, measurement anomalies occur frequently, and depending on the working method, intermediate materials are mixed with one another or materials change position and state.

Therefore, in an assembly process, a defect in one part or intermediate material means that only one finished product has to be discarded, whereas in a continuous process, a defect in one intermediate material means that the large quantity of finished products made from that intermediate material must be discarded. The accuracy and timing of quality judgment are therefore more important than in an assembly process. Accordingly, industries that use the continuous process production method urgently need a system that can process large amounts of data in real time and can process the data generated by each process in association with one another.

Meanwhile, the large amount of data collected in real time for each process is processed by a distributed parallel processing system and stored in a big data storage system. In the prior art, the distributed parallel processing system was implemented as a single data processing apparatus (distributed parallel processing apparatus).

Therefore, in the prior art, when a failure occurred in the distributed parallel processing system, data that had been transmitted from the real-time processing system and was being processed by the distributed parallel processing system was lost without being stored in the big data storage system. Unlike industries that use the assembly process production method, where a partial loss of data affects only the process concerned, in industries that use the continuous process production method such a loss affects every process and can paralyze the entire operation, causing critical damage.

Korean Unexamined Patent Publication No. 10-2016-0124475
Korean Unexamined Patent Publication No. 10-2017-0090114

The present invention was devised to solve the problems described above. An object of the present invention is to provide a distributed parallel processing system and method having a data loss prevention function that can store a large amount of data collected in real time in a big data storage system without loss.

Another object of the present invention is to provide a distributed parallel processing system and method having a data loss prevention function that can process data requiring continuous processing stably and quickly without loss.

Still another object of the present invention is to provide a distributed parallel processing system and method having a data loss prevention function in which the distributed parallel processing system is implemented in a clustering structure so that, even if an error occurs in one distributed parallel processing apparatus, the remaining distributed parallel processing apparatuses can process the data without loss.

Still another object of the present invention is to provide a distributed parallel processing system and method having a data loss prevention function that can improve the data storage performance of the distributed parallel processing system and ensure high availability.

Still another object of the present invention is to provide a distributed parallel processing system and method having a data loss prevention function that manages the history of the data processed in the distributed parallel processing system so that the data can be traced and, if a failure occurs, checks the data history so that every piece of data is processed without loss.

To achieve the above objects, a distributed parallel processing system having a data loss prevention function according to one aspect of the present invention comprises: a first distributed parallel processing apparatus that receives first data corresponding to a first event from a data collection system, stores the first data in a big data storage system, and then transmits identification information of the first event; a second distributed parallel processing apparatus that receives second data corresponding to a second event from the data collection system, stores the second data in the big data storage system, and then transmits identification information of the second event; and a clustering management apparatus that, when an error occurs in either of the first and second distributed parallel processing apparatuses, instructs the remaining distributed parallel processing apparatus to process data with reference to the identification information of the first and second events.

A distributed parallel processing apparatus having a data loss prevention function according to one aspect of the present invention comprises: an event offset management unit that receives, from a clustering management apparatus, an event offset for an event to be processed; an event receiving unit that receives the event to be processed from a data collection system based on the event offset; a data fetch unit that receives data corresponding to the event to be processed from the data collection system; a file generation unit that generates a file for big data storage corresponding to the data; and an event offset storage unit that receives and stores the event offset after the file generated by the file generation unit has been stored in a big data storage system.
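The ordering described for this apparatus, in which the event offset is stored only after the generated file has reached the big data storage system, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `ProcessingApparatus`, the dictionaries standing in for the data collection system and the big data storage system, and the `.dat` file naming are all hypothetical.

```python
class ProcessingApparatus:
    """Hypothetical stand-in for one distributed parallel processing apparatus."""

    def __init__(self, events, collected_data):
        self.events = events                  # offset -> data key (assumed event store layout)
        self.collected_data = collected_data  # data key -> payload (assumed data store layout)
        self.big_data_store = {}              # file name -> payload
        self.stored_offsets = []              # offsets recorded after a successful store

    def process(self, offset):
        key = self.events[offset]             # event receiving unit: look up the event
        payload = self.collected_data[key]    # data fetch unit: fetch the data for the event
        file_name = f"{key}.dat"              # file generation unit (naming is an assumption)
        self.big_data_store[file_name] = payload
        # The offset is recorded last, so an offset in stored_offsets always
        # implies that the corresponding file is already in the store.
        self.stored_offsets.append(offset)
        return file_name
```

If the fetch or store step raises an exception, the offset is never recorded, so in this sketch the event remains eligible for reprocessing by another apparatus.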

A clustering management apparatus for managing a plurality of distributed parallel processing apparatuses according to one aspect of the present invention comprises: an error detection unit that establishes sessions with the plurality of distributed parallel processing apparatuses and detects whether any of them has failed; and a balancing unit that receives information on events to be processed from a data collection system, distributes the events to be processed among the plurality of distributed parallel processing apparatuses, and, when an error occurs in any one of the distributed parallel processing apparatuses, redistributes the events handled by the failed apparatus to the remaining distributed parallel processing apparatuses.
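The balancing behavior described here (distribute events, then hand a failed apparatus's events to the survivors) might look like the sketch below. The round-robin policy and the function names `distribute` and `redistribute` are assumptions made for illustration; the source does not specify how events are apportioned.

```python
def distribute(event_offsets, apparatuses):
    """Assign event offsets to apparatuses round-robin (an assumed policy)."""
    assignment = {name: [] for name in apparatuses}
    for i, offset in enumerate(event_offsets):
        assignment[apparatuses[i % len(apparatuses)]].append(offset)
    return assignment


def redistribute(assignment, failed):
    """Return a new assignment in which the failed apparatus's events
    are spread across the remaining apparatuses."""
    survivors = [name for name in assignment if name != failed]
    if not survivors:
        raise RuntimeError("no remaining apparatus to take over")
    new_assignment = {name: list(assignment[name]) for name in survivors}
    for i, offset in enumerate(assignment[failed]):
        new_assignment[survivors[i % len(survivors)]].append(offset)
    return new_assignment
```

For example, `redistribute(distribute(range(6), ["a", "b"]), "a")` leaves apparatus `"b"` responsible for all six offsets.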

Meanwhile, a distributed parallel processing method having a data loss prevention function according to an embodiment of the present invention comprises: a step in which a plurality of distributed parallel processing apparatuses each receive, from a data collection system, the data corresponding to the event each is to process, store the data in a big data storage system, and then transmit identification information of the event each has processed; and a step in which, when an error occurs in any one of the plurality of distributed parallel processing apparatuses, a clustering management apparatus instructs the remaining distributed parallel processing apparatuses to process data with reference to the identification information of the processed events.

Preferably, the method further comprises, before the instructing step, a step in which an event offset storage apparatus receives the identification information of the processed events from the plurality of distributed parallel processing apparatuses.
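One way to picture how the stored identification information of processed events supports recovery is the sketch below. The `EventOffsetStore` class and its methods are hypothetical names, and treating event offsets as hashable values kept in a set is an assumption; the source states only that the event offset storage apparatus receives the identification information of processed events.

```python
class EventOffsetStore:
    """Hypothetical stand-in for the event offset storage apparatus."""

    def __init__(self):
        self._processed = set()

    def record(self, offset):
        """Called after an apparatus has stored the data for this event."""
        self._processed.add(offset)

    def pending(self, assigned_offsets):
        """Return the assigned offsets that have not been recorded as processed,
        in their original order. These are the events a surviving apparatus
        would be instructed to process after a failure."""
        return [o for o in assigned_offsets if o not in self._processed]
```

With offsets 0 and 2 recorded, `pending([0, 2, 4])` returns `[4]`, so only the event that was never confirmed as stored is processed again.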

According to the present invention, the distributed parallel processing system is implemented in a clustering structure, so that even if an error occurs in one distributed parallel processing apparatus, the remaining distributed parallel processing apparatuses can process the data without loss.

Accordingly, a large amount of data collected in real time can be stored in the big data storage system without loss, and data requiring continuous processing can be processed stably and quickly without loss, which improves the data storage performance of the distributed parallel processing system and ensures high availability.

In addition, according to the present invention, the history of the data processed in the distributed parallel processing system is managed, so that the data can be traced and, if a failure occurs, the data history can be checked so that every piece of data is processed without loss.

FIG. 1 is a diagram illustrating a smart factory architecture to which the present invention can be applied.
FIG. 2 is a diagram illustrating an example in which a distributed parallel processing system having a data loss prevention function according to an embodiment of the present invention is implemented in the platform layer of the smart factory architecture of FIG. 1.
FIG. 3 is a diagram illustrating the detailed configuration of a distributed parallel processing system having a data loss prevention function according to an embodiment of the present invention.
FIG. 4 is a diagram in which the distributed parallel processing system having a data loss prevention function according to the embodiment of FIG. 3 is redrawn from the perspective of event offset processing.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings and preferred embodiments. In the following description, detailed explanations of well-known functions and configurations that could unnecessarily obscure the subject matter of the present invention are omitted.

First, FIG. 1 is a diagram illustrating a smart factory architecture to which the present invention can be applied.

Referring to FIG. 1, a smart factory architecture to which the present invention can be applied may comprise a device layer (10), a network layer (20), a platform layer (30), an application layer (40), and the like.

The device layer (10) includes the various instruments, sensors, actuators, and the like used to collect the micro data generated in each process, together with the P/C (Process/Computer), PLC (Programmable Logic Controller), PDS (Process Data Server), DCS (Distributed Control System), and the like that integrate or control the data of these devices.

The network layer (20) includes the network cables, gateways, routers, wireless APs (access points), and the like used to deliver the data generated in the device layer (10) to the platform layer (30).

The platform layer (30) receives and processes the large volume of structured and unstructured micro data collected in the device layer (10), processes it in real time, determines on that basis whether equipment, materials, and the like are abnormal, stores the data in a big data store for later analysis, and provides various query and analysis services over the stored data. It can be regarded as an IT platform for the whole sequence of data processing and storage.

More specifically, the platform layer (30) may consist of functional parts or systems responsible for, for example, interface (31), analytics (32), service (33), security (34), and management (35).

The interface system (31) provides connection means for the various protocols of heterogeneous devices at Levels 0 to 2, interprets the layout of the data, and performs preprocessing of micro data, such as item standardization.

The analysis system (32) includes a real-time processing system for supporting real-time decision-making at the operation site, a non-real-time processing system for jointly analyzing various micro data and macro data such as operation, equipment, and quality data, a big data store for storing large volumes of micro data and macro data, and the like.

The service system (33) has a structure that reuses standardized processes and business rules as services. It turns business know-how into a repository and facilitates the linkage of planning, execution, and control by connecting services defined as functional units.

The security system (34) performs authentication, authorization, and access control for users, and manages the security of the data itself and of the transmission path.

The management system (35) provides management of the individual systems belonging to the platform layer (30), management of UI/UX, management of configuration files for the data collection devices, individual monitoring of each system and management of linkage information between setting values, and processing performance and integrated monitoring of the entire system.

Meanwhile, the application layer (40) is the layer that, based on the platform layer (30), processes and provides the screens and data that users need for their work.

For reference, the distributed parallel processing system having a data loss prevention function according to the present invention can be applied to the platform layer of the smart factory architecture described above, but is not necessarily limited thereto.

FIG. 2 is a diagram illustrating an example in which a distributed parallel processing system according to an embodiment of the present invention is implemented in the platform layer of the smart factory architecture of FIG. 1, more specifically in the analysis system of the platform layer.

Referring to FIG. 2, the analysis system (32) of the platform layer (30) may be implemented as a real-time processing system and a non-real-time processing system.

The real-time processing system receives, through the network layer (20), the micro data generated in real time in the device layer (10) and stores it; it may be implemented, for example, in the form of a data collection system (100). The non-real-time processing system processes the data collected by the real-time processing system, stores it in distributed form in the big data store, and provides the required data when the application layer (40) requests it; it may be implemented, for example, as a distributed parallel processing system (200), a big data storage system (300), a Hadoop query system (400), and the like.

The data collection system (100) receives, through the network layer (20), the micro data generated in real time in the device layer (10), stores it, and on that basis detects abnormalities in equipment, quality, and the like. According to an embodiment of the present invention, it may comprise a load data storage device (110), a no-load data storage device (112), an event storage device (120), an abnormality detection result storage device (130), and the like.

The load data storage device (110) receives and stores the load data generated in real time in the device layer (10), and the no-load data storage device (112) receives and stores the no-load data generated in real time in the device layer (10). In the present invention, data received in real time is handled separately as load data and no-load data. Load data is data received mainly while work is in progress, and is characterized by frequent changes in value and a wide range of variation. No-load data is data measured mainly while no work is being performed, and is characterized by the same value occurring consecutively. Of course, the load data storage device (110) and the no-load data storage device (112) may also be implemented as a single data storage device that does not distinguish between load data and no-load data.

When the load/no-load data has been stored in the load/no-load data storage devices (110, 112) and collection of the load/no-load data is complete, an event for storing the load/no-load data in the big data storage system is generated, and the event storage device (120) receives and stores it. According to an embodiment of the present invention, the event generated upon completion of load/no-load data collection includes identification information that identifies the load/no-load data (hereinafter, "data identification information") and identification information that identifies the event itself (hereinafter, "event identification information"). The data identification information is included in the event in the form of a key, and the event identification information is included in the form of an event offset. The abnormality detection result storage device (130) stores and manages, in the form of abnormality detection data, the result obtained when an abnormality in equipment, quality, or the like is detected on the basis of the data collected in real time.
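The event described above carries two identifiers: a key for the collected data and an event offset for the event itself. A minimal sketch of such a record and of an event store that issues offsets is given below. The names `CollectionEvent` and `EventStore`, the `kind` field, and the use of sequential integers as offsets are assumptions made for illustration; the source states only that the event contains a key and an event offset.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CollectionEvent:
    offset: int  # event identification information (event offset)
    key: str     # data identification information
    kind: str    # "load" or "no_load" (assumed field, not from the source)


class EventStore:
    """Hypothetical stand-in for the event storage device (120)."""

    def __init__(self) -> None:
        self._events: List[CollectionEvent] = []

    def append(self, key: str, kind: str) -> CollectionEvent:
        # Offsets are assumed to be the position of the event in the log.
        event = CollectionEvent(offset=len(self._events), key=key, kind=kind)
        self._events.append(event)
        return event

    def get(self, offset: int) -> CollectionEvent:
        return self._events[offset]
```

Because each event in this sketch is addressed by its offset, a processing apparatus that is given an offset can look up the event and then use the key to locate the data to fetch.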

The distributed parallel processing system (200) processes the data collected in real time by the data collection system (100) in a distributed and parallel manner, as multiple files, in order to store it in the big data storage system (300). According to an embodiment of the present invention, it may comprise a plurality of distributed parallel processing apparatuses (200-1, 200-2, ..., 200-n), a clustering management apparatus (280), an event offset storage apparatus (290), and the like. These are described in detail below with reference to FIGS. 3 and 4.

The big data storage system (300) stores the file-form data produced by distributed parallel processing in the distributed parallel processing system (200), and may be implemented, for example, as a Hadoop Distributed File System (HDFS). According to an embodiment of the present invention, the big data storage system (300) may comprise a master node that manages jobs and metadata and data nodes that store and retrieve data, and the data nodes include a historical data storage device (310), a model storage device (320), and the like.

The historical data storage device (310) stores the large volume of micro data collected in real time and the macro data generated as derived items at regular intervals; if necessary, the macro data can be stored in a relational database (RDB). The model storage device (320) stores analysis models for equipment, products, materials, and the like, together with the results of running those models.

The Hadoop query system (400) builds search conditions in the form of a query, looks up the data stored in the big data storage system (300), and returns the data. According to an embodiment of the present invention, it may comprise a query receiving device (410), a query scheduling device (420), a query execution device (430), a query result transmission device (440), and the like.

The query receiving device (410) receives a query requested by a client and parses the received query syntax, and the query scheduling device (420) schedules a query execution job for the requested query on the basis of metadata. The query execution device (430) executes the query to extract the desired data from the big data storage system (300), and the query result transmission device (440) transmits the query result to the client.

FIG. 3 is a diagram illustrating the detailed configuration of a distributed parallel processing system having a data loss prevention function according to an embodiment of the present invention.

Referring to FIG. 3, the distributed parallel processing system (200) according to an embodiment of the present invention includes a plurality of distributed parallel processing apparatuses (200-1, 200-2, ..., 200-n), a clustering management apparatus (280), an event offset storage apparatus (290), and the like.

The distributed parallel processing apparatuses (200-1, 200-2, ..., 200-n) process the data collected in real time by the data collection system (100) in a distributed and parallel manner, as multiple files, and store it in the big data storage system (300). In the distributed parallel processing system (200) according to the present invention, the plurality of distributed parallel processing apparatuses (200-1, 200-2, ..., 200-n) are formed in a clustering structure, and each individual distributed parallel processing apparatus (200-1) includes a load data fetch unit (210-1), a no-load data fetch unit (212-1), an event receiving unit (220-1), an abnormality detection data receiving unit (230-1), a data splitting unit (240-1), a memory (250-1), a plurality of file generation units (260a-1, 260b-1, ..., 260n-1), an event offset management unit (270-1), an event offset storage unit (275-1), and the like.

When load/no-load data is collected in real time by the data collection system 100 and stored in the load/no-load data storage devices 110 and 112, an event corresponding to the completion of that load/no-load data collection (that is, an event for storing the load/no-load data in the big data storage system) is stored in the event storage device 120.

The event receiving unit 220-1 monitors the event storage device 120 and, when an event matching the event identification information (e.g., an event offset) sent by the event offset management unit 270-1 is present, receives that event and passes it to the load/no-load data fetch units 210-1 and 212-1.

When the load data fetch unit 210-1 receives from the event receiving unit 220-1 an event corresponding to the completion of load data collection, it uses the key in that event (which contains data identification information) to identify the load data to fetch, receives that load data from the load data storage device 110 of the data collection system 100, and sends it to the data splitting unit 240-1.

Likewise, when the no-load data fetch unit 212-1 receives from the event receiving unit 220-1 an event corresponding to the completion of no-load data collection, it uses the key in that event (which contains data identification information) to identify the no-load data to fetch, receives that no-load data from the no-load data storage device 112 of the data collection system 100, and sends it to the data splitting unit 240-1.

The data splitting unit 240-1 splits the load/no-load data sent by the load/no-load data fetch units 210-1 and 212-1 into a plurality of data sets and stores them in the memory 250-1. This prevents an out-of-memory condition when a large volume of data enters the memory at once.
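The splitting step described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `split_into_datasets`, the record representation, and the `max_records` limit are assumptions introduced here, since the description does not state how the data set size is chosen.

```python
from typing import Iterator, List, Sequence


def split_into_datasets(records: Sequence[dict], max_records: int) -> Iterator[List[dict]]:
    """Yield consecutive data sets of at most `max_records` records.

    Hypothetical helper: the size limit is an assumed tuning parameter.
    Yielding lazily means only one data set is materialized at a time,
    which is the point of splitting before the data reaches memory.
    """
    if max_records <= 0:
        raise ValueError("max_records must be positive")
    for start in range(0, len(records), max_records):
        yield list(records[start:start + max_records])


if __name__ == "__main__":
    fetched = [{"seq": i} for i in range(10)]
    datasets = list(split_into_datasets(fetched, max_records=4))
    # The record count before and after splitting must agree; the description
    # treats a mismatch here as a high-importance function error.
    assert sum(len(d) for d in datasets) == len(fetched)
    print([len(d) for d in datasets])
```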

The anomaly detection data receiving unit 230-1 monitors the anomaly detection result storage device 130 of the data collection system 100 and, when an anomaly in equipment, quality, or the like is detected and anomaly detection data is generated, receives that data and sends it to the memory 250-1.

The memory 250-1 temporarily stores the load/no-load data received from the data splitting unit 240-1 and the anomaly detection data received from the anomaly detection data receiving unit 230-1, and sends them to the plurality of file generation units 260a-1, 260b-1, ..., 260n-1. According to an embodiment of the present invention, the memory 250-1 may be implemented as a queue.
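A queue-shaped buffer feeding several file generation workers could look like the sketch below. It is a minimal illustration under stated assumptions: the bounded `queue.Queue`, the thread-based workers, the `None` sentinel, and the name `run_pipeline` are all choices made here, and the workers merely collect data sets in a list instead of writing real files.

```python
import queue
import threading
from typing import Iterable, List


def run_pipeline(datasets: Iterable[bytes], writer_count: int = 3, capacity: int = 2) -> List[bytes]:
    """Push data sets through a bounded queue to `writer_count` workers."""
    buffer: queue.Queue = queue.Queue(maxsize=capacity)  # stands in for memory 250-1
    written: List[bytes] = []
    lock = threading.Lock()

    def file_writer() -> None:
        # Stands in for one file generation unit; a real one would write a file.
        while True:
            dataset = buffer.get()
            if dataset is None:  # sentinel: no more data for this worker
                return
            with lock:
                written.append(dataset)

    workers = [threading.Thread(target=file_writer) for _ in range(writer_count)]
    for worker in workers:
        worker.start()
    for dataset in datasets:
        buffer.put(dataset)  # blocks while the queue is full, bounding memory use
    for _ in workers:
        buffer.put(None)
    for worker in workers:
        worker.join()
    return written
```

Because `put` blocks when the queue is full, the producer side cannot outrun the writers, which complements the splitting step in keeping memory use bounded.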

A plurality of file generation units 260a-1, 260b-1, ..., 260n-1 are provided in a clustered structure; they turn the data stored in the memory 250-1 into physical files and store those files in the historical data storage device 310 of the big data storage system 300. Once every file generated from the load/no-load data has been stored in the historical data storage device 310 and processing of the event is therefore complete, the file generation units 260a-1, 260b-1, ..., 260n-1 cause the offset of the completed event (which contains event identification information) to be stored in the event offset storage unit 275-1.
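The ordering in this paragraph (store every file first, record the offset only afterwards) is what keeps an interrupted event from being marked as done. A minimal sketch of that ordering follows; `process_event`, the `store_file` callback, the file naming scheme, and the use of `OSError` to signal a storage failure are hypothetical choices for illustration.

```python
from typing import Callable, Iterable, List


def process_event(
    offset: int,
    datasets: Iterable[bytes],
    store_file: Callable[[str, bytes], None],
    committed_offsets: List[int],
) -> bool:
    """Store every data set of one event, then record the event offset.

    Returns True only if all files were stored. If storing fails partway,
    the offset is NOT recorded, so the event still counts as unprocessed
    and can be handled again later.
    """
    try:
        for index, payload in enumerate(datasets):
            # Hypothetical naming scheme; unique per event and part.
            store_file(f"event-{offset}-part-{index}", payload)
    except OSError:
        return False
    committed_offsets.append(offset)  # stands in for offset storage unit 275-1
    return True
```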

The event offset storage unit 275-1 receives and stores the event offsets sent by the file generation units 260a-1, 260b-1, ..., 260n-1, and also forwards them to the event offset storage device 290.

The event offset management unit 270-1 receives, from the balancing unit 284 of the clustering management device 280, identification information (e.g., event offsets) for the events it is to process, queries the event offsets stored in the event offset storage device 290 to determine which events have already been processed, and then sends the identification information (e.g., event offsets) of the unprocessed events to the event receiving unit 220-1.
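The filtering performed by the event offset management unit amounts to a set difference between the assigned offsets and the completed offsets. The sketch below assumes offsets are integers and that the result should keep the assigned order; the function name is hypothetical.

```python
from typing import Iterable, List


def unprocessed_offsets(assigned: Iterable[int], completed: Iterable[int]) -> List[int]:
    """Return the assigned offsets that are not yet recorded as completed.

    `assigned` stands in for what the balancing unit sent; `completed`
    stands in for the result of querying the event offset storage device.
    """
    done = set(completed)
    return [offset for offset in assigned if offset not in done]


if __name__ == "__main__":
    print(unprocessed_offsets([10, 11, 12, 13], [11, 13, 99]))
```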

Meanwhile, the clustering management device 280 checks each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n for errors and, when an error occurs in any of them, manages the remaining distributed parallel processing devices so that they share the processing of the unprocessed events. According to an embodiment of the present invention, it includes an error detection unit 282, a balancing unit 284, and the like.

The error detection unit 282 checks each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n for errors and sends the result to the balancing unit 284.

Specifically, the error detection unit 282 creates a session with each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n and sends status check requests at regular intervals. If a session is dropped, if a status check request goes unanswered, or if a response reports an error in some function, the error detection unit 282 determines that the corresponding distributed parallel processing device has failed and notifies the balancing unit 284.

For example, when the session is dropped or a status check request goes unanswered, the error detection unit 282 notifies the balancing unit 284 that a total error has occurred in the corresponding distributed parallel processing device.

If, on the other hand, a distributed parallel processing device 200-1 responds that an error has occurred in some function, the error detection unit 282 determines the importance of the function in which the error occurred and, depending on that importance, notifies the balancing unit 284 that a partial or total error has occurred in the distributed parallel processing device 200-1.
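The first decision made by the error detection unit (total error versus a function error that still needs to be graded) can be sketched as below. This is an assumption-laden illustration: the `Verdict` enum, the shape of the status response (a dict with a `function_errors` list), and the function name `judge` are invented here and are not specified by the description.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    HEALTHY = "healthy"
    TOTAL_ERROR = "total_error"        # session dropped or no response
    FUNCTION_ERROR = "function_error"  # device answered but reported a fault


def judge(session_alive: bool, response: Optional[dict]) -> Verdict:
    """Classify one status check result for one device.

    `response` is None when the status check request went unanswered.
    A FUNCTION_ERROR verdict still has to be graded by importance before
    the balancing unit is told whether the error is partial or total.
    """
    if not session_alive or response is None:
        return Verdict.TOTAL_ERROR
    if response.get("function_errors"):
        return Verdict.FUNCTION_ERROR
    return Verdict.HEALTHY
```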

For example, the error detection unit 282 assigns a relatively low importance to function errors such as the following: the interval between received events at the event receiving unit 220-1 exceeds a preset time; the time between the load/no-load data fetch units 210-1 and 212-1 requesting data from the load/no-load data storage devices 110 and 112 and receiving it exceeds a preset time; the data stored in the memory 250-1 exceeds the designated memory size; the file generation units 260a-1, 260b-1, ..., 260n-1 fail to generate a file because of a fault in the historical data storage device 310; or an offset smaller than the offset stored in the event offset storage device 290 arrives at the event offset storage unit 275-1. In these cases it notifies the balancing unit 284 that a partial error of low importance has occurred in the distributed parallel processing device 200-1.

The error detection unit 282 assigns a relatively high importance to function errors such as the following: the load/no-load data fetch units 210-1 and 212-1 receive data of an unexpected type from the load/no-load data storage devices 110 and 112 (e.g., the load data fetch unit receives no-load data); the number of data items in the event differs from the number of data items after splitting in the data splitting unit 240-1; data cannot be stored in the memory 250-1 even though the memory is not full; or the file generation units 260a-1, 260b-1, ..., 260n-1 generate files with the same name. In these cases it notifies the balancing unit 284 that a partial error of high importance has occurred in the distributed parallel processing device 200-1.

When low-importance function errors or high-importance function errors exceed their respective preset counts and/or durations, the error detection unit 282 notifies the balancing unit 284 that a total error has occurred in the distributed parallel processing device 200-1.
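The grading and escalation rules above can be sketched as a small tracker. Everything named here is hypothetical: the error code strings are labels invented for the conditions listed in the description, the limits are arbitrary defaults, and only the count-based threshold is modeled (the description also allows a time-based one).

```python
from collections import Counter

# Hypothetical labels for the low-importance conditions in the description.
LOW_IMPORTANCE = {
    "event_interval_timeout", "fetch_timeout", "memory_limit_exceeded",
    "file_store_failure", "stale_offset",
}
# Hypothetical labels for the high-importance conditions in the description.
HIGH_IMPORTANCE = {
    "unexpected_data_type", "split_count_mismatch",
    "memory_write_rejected", "duplicate_file_name",
}


class SeverityTracker:
    """Grade function errors for one device and escalate repeated ones."""

    def __init__(self, low_limit: int = 5, high_limit: int = 2) -> None:
        self.limits = {"low": low_limit, "high": high_limit}
        self.counts: Counter = Counter()

    def report(self, code: str) -> str:
        """Return 'partial-low', 'partial-high', or 'total' for one error."""
        if code in HIGH_IMPORTANCE:
            level = "high"
        elif code in LOW_IMPORTANCE:
            level = "low"
        else:
            raise ValueError(f"unknown error code: {code}")
        self.counts[level] += 1
        # Escalate once the count for this level exceeds its preset limit.
        if self.counts[level] > self.limits[level]:
            return "total"
        return f"partial-{level}"
```

Giving the high-importance level a smaller limit than the low-importance level is a design assumption of this sketch; the description only says each level has its own preset threshold.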

The balancing unit 284 checks the connection between the event storage device 120 and each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n, and decides how the events stored in the event storage device 120 are to be shared among the distributed parallel processing devices 200-1, 200-2, ..., 200-n on the basis of the number of devices, their throughput, and the load resulting from errors. For example, the balancing unit 284 receives identification information (e.g., event offsets) for the events stored in the event storage device 120 to determine which events the distributed parallel processing system 200 must process, decides which events each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n is to handle, and then distributes event processing by sending the identification information (e.g., event offsets) of the events a particular distributed parallel processing device 200-1 must process to the event offset management unit 270-1 of that device. When a total or partial error occurs in any distributed parallel processing device, the balancing unit 284 redistributes the work so that the remaining distributed parallel processing devices can process the data. For example, it redistributes the events assigned to the device in which a total error occurred among the remaining distributed parallel processing devices and sends the identification information (e.g., event offsets) of the redistributed events to the event offset management unit of each of those devices. A distributed parallel processing device in which a partial error occurred is assigned fewer events than the remaining distributed parallel processing devices.
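One way to realize the distribution rule described above is a weighted round-robin, sketched below. The weights (2 for a healthy device, 1 for a device with a partial error, 0 for a device with a total error), the state labels, and the device names are assumptions for illustration; the description says only that a partially failed device receives fewer events and does not specify a ratio or how throughput is factored in.

```python
from typing import Dict, List, Sequence

# Assumed weights: the description gives no concrete ratio.
WEIGHTS = {"healthy": 2, "partial": 1, "total": 0}


def distribute(offsets: Sequence[int], device_states: Dict[str, str]) -> Dict[str, List[int]]:
    """Assign event offsets to devices by weighted round-robin."""
    slots: List[str] = []
    for device, state in device_states.items():
        slots.extend([device] * WEIGHTS[state])
    if not slots:
        raise RuntimeError("no device is able to process events")
    plan: Dict[str, List[int]] = {device: [] for device in device_states}
    for index, offset in enumerate(offsets):
        plan[slots[index % len(slots)]].append(offset)
    return plan


if __name__ == "__main__":
    states = {"dpp-1": "healthy", "dpp-2": "partial", "dpp-3": "total"}
    for device, assigned in distribute(range(10), states).items():
        print(device, assigned)
```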

The event offset storage device 290 receives and stores the event offsets sent by the event offset storage unit 275-1 of each individual distributed parallel processing device 200-1 and, when it receives an event offset query from the event offset management unit 270-1 of an individual distributed parallel processing device 200-1, returns the result. Note that the event offset storage unit 275-1 of an individual distributed parallel processing device 200-1 stores only the offsets of the events processed by that device, whereas the event offset storage device 290 stores the offsets of all events processed by the distributed parallel processing system 200, that is, by the plurality of distributed parallel processing devices 200-1, 200-2, ..., 200-n.

The following describes, with reference to FIGS. 3 and 4, how the distributed parallel processing system having a data loss prevention function according to an embodiment of the present invention operates in the normal state and in the failure state.

FIG. 4 is a diagram that redraws the distributed parallel processing system of FIG. 3, having a data loss prevention function according to an embodiment of the present invention, from the viewpoint of event offset processing.

First, processing in the case where all of the distributed parallel processing devices 200-1, 200-2, ..., 200-n are operating normally is described with reference to FIGS. 3 and 4.

The balancing unit 284 of the clustering management device 280 works with the event storage device 120 of the data collection system 100 to determine which events the distributed parallel processing system 200 must process. For example, the balancing unit 284 receives event offsets (which contain event identification information) from the event storage device 120 of the data collection system 100 and thereby identifies the events to be processed.

The balancing unit 284 of the clustering management device 280 decides which events each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n is to handle, and then distributes event processing by sending the identification information (e.g., event offsets) of the events each device must process to the event offset management unit 270-1, 270-2, ..., 270-n of that device.

The event offset management units 270-1, 270-2, ..., 270-n of the respective distributed parallel processing devices 200-1, 200-2, ..., 200-n then receive the event identification information sent by the balancing unit 284 of the clustering management device 280, query the identification information of already processed events stored in the event offset storage device 290, and thereby determine which events their own device must process.

The identification information of the events each device must process is passed to the event receiving units 220-1, 220-2, ..., 220-n, and each distributed parallel processing device 200-1, 200-2, ..., 200-n processes those events and then stores the offsets of the processed events (which contain event identification information) in its event offset storage unit 275-1, 275-2, ..., 275-n. The event offset storage units 275-1, 275-2, ..., 275-n then send the event offsets to the event offset storage device 290.

As a result, the event offset storage device 290 holds the offsets of the events processed by each of the distributed parallel processing devices 200-1, 200-2, ..., 200-n, which makes it possible to determine which events the distributed parallel processing system 200 has finished processing.

If, on the other hand, a total or partial error occurs in a distributed parallel processing device 200-1 and it enters a failure state, the error detection unit 282 of the clustering management device 280 detects this and notifies the balancing unit 284.

In the case of a total error, the balancing unit 284 of the clustering management device 280 then identifies the events for which the failed distributed parallel processing device 200-1 is responsible and redistributes them among the remaining distributed parallel processing devices 200-2, ..., 200-n. That is, it sends the identification information (e.g., event offsets) of the redistributed events that each remaining device has now taken on to the event offset management units 270-2, ..., 270-n of the remaining distributed parallel processing devices 200-2, ..., 200-n. In the case of a partial error, the balancing unit 284 assigns fewer events to the distributed parallel processing device 200-1 in which the partial error occurred than to the remaining distributed parallel processing devices 200-2, ..., 200-n.

The event offset management units 270-2, ..., 270-n of the remaining distributed parallel processing devices 200-2, ..., 200-n then each receive the identification information of the redistributed events sent by the balancing unit 284 of the clustering management device 280, query the identification information of already processed events stored in the event offset storage device 290, determine which events their own device must process, and process them.
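The failover sequence just described has two steps: the balancing unit hands the failed device's events to the survivors, and each survivor then drops the offsets that the shared event offset storage device already lists as completed. The sketch below models both steps with plain dictionaries and sets. The function names, device names, and the simple round-robin used for redistribution are hypothetical.

```python
from typing import Dict, List, Set


def redistribute(assignments: Dict[str, List[int]], failed: str) -> Dict[str, List[int]]:
    """Balancing step: spread the failed device's offsets over the survivors."""
    survivors = [device for device in assignments if device != failed]
    if not survivors:
        raise RuntimeError("no surviving device to take over")
    plan = {device: list(assignments[device]) for device in survivors}
    for index, offset in enumerate(assignments[failed]):
        plan[survivors[index % len(survivors)]].append(offset)
    return plan


def pending_after_failover(plan: Dict[str, List[int]], completed: Set[int]) -> Dict[str, List[int]]:
    """Offset management step: skip offsets already recorded as completed."""
    return {
        device: [offset for offset in offsets if offset not in completed]
        for device, offsets in plan.items()
    }


if __name__ == "__main__":
    assignments = {"dpp-1": [1, 2, 3, 4], "dpp-2": [5, 6], "dpp-3": [7, 8]}
    completed = {1, 2, 5, 7}  # contents of the shared offset store (assumed)
    plan = redistribute(assignments, failed="dpp-1")
    print(pending_after_failover(plan, completed))
```

Because completed offsets are filtered out only after redistribution, events the failed device had already finished are not repeated, and events it had not finished are picked up by a survivor.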

Although the present invention has been described in detail with reference to preferred embodiments, a person of ordinary skill in the art to which the present invention pertains can implement it in various other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

The scope of the present invention is defined by the claims below rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Claims

A distributed parallel processing system having a data loss prevention function, comprising:
a first distributed parallel processing device that receives first data corresponding to a first event from a data collection system, stores the first data in a big data storage system, and stores an offset of the first event;
a second distributed parallel processing device that receives second data corresponding to a second event from the data collection system, stores the second data in the big data storage system, and stores an offset of the second event; and
a clustering management device that, when an error occurs in either one of the first and second distributed parallel processing devices, instructs the remaining distributed parallel processing device to process data with reference to the offsets of the first and second events,
wherein the first distributed parallel processing device comprises: an event receiving unit that receives the first event from the data collection system; a data fetch unit that receives the first data corresponding to the first event from the data collection system; a file generation unit that generates a file for big data storage corresponding to the first data; and an event offset storage unit that receives and stores the offset of the first event after the file generated by the file generation unit is stored in the big data storage system.

delete

The system of claim 1,
further comprising an event offset storage device that receives the offsets of the first and second events from the first and second distributed parallel processing devices and stores them.

The system of claim 1 or 3,
wherein the clustering management device comprises:
an error detection unit that detects whether an error has occurred in the first and second distributed parallel processing devices; and
a balancing unit that distributes the events to be processed to the first and second distributed parallel processing devices and, when an error occurs in either one of the first and second distributed parallel processing devices, instructs the second distributed parallel processing device to process data with reference to the offsets of the first and second events.

The system of claim 4,
wherein, when at least one of the following occurs in either one of the first and second distributed parallel processing devices, the error detection unit notifies the balancing unit that a partial error corresponding to a second importance has occurred in that distributed parallel processing device: the event reception interval exceeds a preset time; the time from requesting data from the data collection system to receiving it exceeds a preset time; data is stored in the memory in excess of the designated memory size; file generation fails because of a fault in the historical data storage device; or an offset smaller than the offset stored in the event offset storage device arrives.

The system of claim 4,
wherein, when at least one of the following occurs in either one of the first and second distributed parallel processing devices, the error detection unit notifies the balancing unit that a partial error corresponding to a first importance has occurred in that distributed parallel processing device: data of an unexpected type is received from the data collection system; the number of data items in the event differs from the number of data items after splitting; data is not stored in the memory even though the memory is not full; or a file with the same name is generated.

delete

The system of claim 1 or 3,
wherein the first distributed parallel processing device
further comprises an event offset management unit that receives, from the clustering management device, an event offset for an event to be processed.

A distributed parallel processing device having a data loss prevention function, comprising:
an event offset management unit that receives, from a clustering management device, an event offset for an event to be processed;
an event receiving unit that receives the event to be processed from a data collection system on the basis of the event offset;
a data fetch unit that receives data corresponding to the event to be processed from the data collection system;
a file generation unit that generates a file for big data storage corresponding to the data; and
an event offset storage unit that receives and stores the event offset after the file generated by the file generation unit is stored in a big data storage system.

The distributed parallel processing device of claim 9,
wherein the event offset management unit receives, from an event offset storage device, event offsets for events that have already been processed,
and transmits to the event receiving unit those event offsets received from the clustering management device that are not among the event offsets received from the event offset storage device.

The distributed parallel processing device of claim 9 or 10,
wherein the data fetch unit comprises:
a load data fetch unit that receives load data from the data collection system; and
a no-load data fetch unit that receives no-load data from the data collection system.

A clustering management device for managing a plurality of distributed parallel processing devices, comprising:
an error detection unit that establishes sessions with the plurality of distributed parallel processing devices and detects errors in the plurality of distributed parallel processing devices; and
a balancing unit that receives information about events to be processed from a data collection system, distributes the events to be processed to the plurality of distributed parallel processing devices, and, when an error occurs in any one of the plurality of distributed parallel processing devices, redistributes the events assigned to the device in which the error occurred to the remaining distributed parallel processing devices,
wherein at least one of the plurality of distributed parallel processing devices comprises: an event receiving unit that receives a first event from the data collection system; a data fetch unit that receives first data corresponding to the first event from the data collection system; a file generation unit that generates a file for big data storage corresponding to the first data; and an event offset storage unit that receives and stores the offset of the first event after the file generated by the file generation unit is stored in a big data storage system.

The clustering management device of claim 12,
wherein, when at least one of the following occurs in any one of the plurality of distributed parallel processing devices, the error detection unit notifies the balancing unit that a partial error corresponding to a second importance has occurred in that distributed parallel processing device: the event reception interval exceeds a preset time; the time from requesting data from the data collection system to receiving it exceeds a preset time; data is stored in the memory in excess of the designated memory size; file generation fails because of a fault in the historical data storage device; or an offset smaller than the offset stored in the event offset storage device arrives.

The clustering management device of claim 12 or 13,
wherein, when at least one of the following occurs in any one of the plurality of distributed parallel processing devices, the error detection unit notifies the balancing unit that a partial error corresponding to a first importance has occurred in that distributed parallel processing device: data of an unexpected type is received from the data collection system; the number of data items in the event differs from the number of data items after splitting; data is not stored in the memory even though the memory is not full; or a file with the same name is generated.

A distributed parallel processing method having a data loss prevention function, comprising:
a step in which a plurality of distributed parallel processing devices receive data corresponding to events to be processed from a data collection system, store the data corresponding to the events to be processed in a big data storage system, and then store the offsets of the processed events; and
a step in which, when an error occurs in any one of the plurality of distributed parallel processing devices, a clustering management device instructs the remaining distributed parallel processing devices to process data with reference to the offsets of the processed events,
wherein at least one of the plurality of distributed parallel processing devices comprises: an event receiving unit that receives a first event from the data collection system; a data fetch unit that receives first data corresponding to the first event from the data collection system; a file generation unit that generates a file for big data storage corresponding to the first data; and an event offset storage unit that receives and stores the offset of the first event after the file generated by the file generation unit is stored in the big data storage system.

The method of claim 15,
further comprising, prior to the instructing step,
a step in which an event offset storage device receives the offsets of the processed events from the plurality of distributed parallel processing devices.