KR20150061864A

KR20150061864A - Apparatus for providing a framework for processing large-scale data from a business sequence, and data processing method thereof

Info

Publication number: KR20150061864A
Application number: KR1020130146106A
Authority: KR
Inventors: 박주상; 이호성; 황재각; 방효찬
Original assignee: 한국전자통신연구원
Priority date: 2013-11-28
Filing date: 2013-11-28
Publication date: 2015-06-05
Also published as: JP2015106406A; JP6457747B2; KR102075386B1

Abstract

The present invention relates to an apparatus that provides a framework for processing the various kinds of data generated over time or at each process step of a workflow, and for supporting a service that is easy to scale when data volume surges and remains stable even when a failure occurs. The invention provides the flexibility to configure, according to user definitions, functions that collect, store, process, and manage the various kinds of data to be processed, together with a function for effectively managing massive amounts of data whose growth is unpredictable.

Description

Apparatus for providing a framework for processing large-volume sequentially collected data, and data processing method thereof

The present invention relates to a system for processing large volumes of data, and more particularly to a framework providing apparatus, and a data processing method thereof, that offers the flexibility to configure functions for collecting, storing, processing, and managing various kinds of data according to user definitions, together with management functions for processing the data efficiently.

With the recent spread of smartphones and the growing use of sensor-based information technology driven by advances in Internet of Things (IoT) technology, the volume of digital data collected, processed, stored, managed, and used by information systems is exploding. For example, the pharmaceutical manufacturing and distribution industry is moving beyond barcode-based drug management toward identifying and managing individual drug items with RFID technology, in order to prevent illegal distribution and counterfeit drugs. As a result, the amount of data generated by the various business processes in drug manufacturing and distribution is expected to grow explosively.

In addition, as IoT technology spreads, large amounts of sensor data will be generated and a variety of services using that data are expected to emerge. In an environment where services can change at any time because of the diversity of sensor data, the growing number of new sensor devices, and changing user requirements, it is difficult to apply a single uniform processing method.

Against this background, various technologies for efficiently collecting, storing, and processing large amounts of data have emerged. What is still needed, with respect to changing user requirements and data diversity, is a way to flexibly configure user-defined data collection, storage, and processing functions and, with respect to data scale, a way to efficiently manage the large amounts of data generated by many data sources. In particular, where data tied to a specific industry or business domain is generated in large quantities and accumulates over a long period, and the varied requirements of many stakeholders for that data must be reflected, a framework that effectively supports this is needed as a core element.

To solve the problems of the related art described above, an object of the present invention is to provide a framework providing apparatus, and a data processing method thereof, that can respond flexibly to variable user requirements and arbitrary data changes by configuring user-defined functions such as collection, storage, processing, management, and query for various kinds of data.

Another object of the present invention is to provide a framework providing apparatus, and a data processing method thereof, that can store and manage large volumes of data in a distributed store without concentrating the data in a particular region, in an environment where the kinds of data are varied and their growth rate cannot be predicted, and that can satisfy service requests even when a failure occurs.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, a data processing method according to one aspect of the present invention is a data processing method in an apparatus that provides a framework for processing large-volume sequentially collected data, the method comprising:

performing a user-defined preprocessing operation on original data collected from an external data providing apparatus;

storing the original data in a distributed data store based on a user-defined storage rule;

generating processed data by processing the original data, or storage information of the original data, based on a user-defined data processing rule;

registering a data distribution rule for distributing and storing the processed data in the distributed data store; and

distributing and storing the processed data in the distributed data store based on the data distribution rule.
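The five steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed implementation: `ToyDistributedStore`, `run_pipeline`, `parse`, the byte-sum node placement, and the sensor record layout are all hypothetical names and choices introduced here.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class ToyDistributedStore:
    """Hypothetical in-memory stand-in for the distributed data store."""

    def __init__(self, node_count: int) -> None:
        self.nodes: List[Dict[str, Record]] = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> Dict[str, Record]:
        # Deterministic placement by byte sum; a real store would use its own scheme.
        return self.nodes[sum(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key: str, value: Record) -> None:
        self._node_for(key)[key] = value

    def get(self, key: str) -> Record:
        return self._node_for(key)[key]


def run_pipeline(
    raw_inputs: Iterable[str],
    preprocess: Callable[[str], Record],
    storage_rule: Callable[[Record], str],
    processing_rule: Callable[[Record], Record],
    distribution_rule: Callable[[Record], str],
    store: ToyDistributedStore,
) -> List[str]:
    """Apply the five steps to each input and return the processed-data keys."""
    processed_keys: List[str] = []
    for raw in raw_inputs:
        original = preprocess(raw)                   # step 1: user-defined preprocessing
        store.put(storage_rule(original), original)  # step 2: store the original data
        processed = processing_rule(original)        # step 3: generate processed data
        key = distribution_rule(processed)           # step 4: rule supplied (registered) by the caller
        store.put(key, processed)                    # step 5: distribute and store
        processed_keys.append(key)
    return processed_keys


def parse(line: str) -> Record:
    """Assumed input format: '<sensor id>,<temperature in Celsius>'."""
    sensor, value = line.split(",")
    return {"sensor": sensor, "celsius": float(value)}


store = ToyDistributedStore(node_count=3)
keys = run_pipeline(
    raw_inputs=["s1,20.0", "s2,25.0"],
    preprocess=parse,
    storage_rule=lambda rec: f"raw:{rec['sensor']}",
    processing_rule=lambda rec: {"sensor": rec["sensor"], "fahrenheit": rec["celsius"] * 9 / 5 + 32},
    distribution_rule=lambda rec: f"proc:{rec['sensor']}",
    store=store,
)
```

Every rule is passed in as a plain function, which mirrors the text's emphasis on user-defined behavior at each step.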

The data processing method according to one aspect of the present invention may further include, when a user queries the original data or the processed data stored in the distributed data store, transmitting the original data or the processed data to a user terminal by referring to the user-defined storage rule or the data distribution rule.

In one embodiment, performing the user-defined preprocessing operation includes generating a data collection process based on an external system interworking function and a data preprocessing function registered through a user interface.

In one embodiment, storing in the distributed data store includes generating an original data storage process based on a user-defined original data storage rule registered through a user interface.

In one embodiment, generating the processed data includes generating at least one original data processing process based on a user-defined data processing function and the processed data storage rule registered through a user interface.

In one embodiment, registering the data distribution rule includes, when the distributed data store is a key-value distributed data store, registering a key generation rule that allows the processed data to be stored evenly.

A data processing method according to another embodiment of the present invention comprises:

generating a data collection process based on an external system interworking function and a data preprocessing function registered through a user interface;

generating an original data storage process based on a user-defined original data storage rule registered through a user interface;

generating at least one original data processing process based on a user-defined data processing function and the processed data storage rule registered through a user interface;

registering a data distribution rule for distributing and storing the processed data in the distributed data store, and generating a processed data distribution process based on the data distribution rule; and

generating an integrated process for user-defined data processing by setting relationships among the respective processes.

Meanwhile, a framework providing apparatus according to another aspect of the present invention includes a non-volatile memory storing program code that provides a framework for processing large-volume sequentially collected data, and at least one processor that executes the program code.

Here, the framework is implemented to provide: a data collection module that performs a user-defined preprocessing operation on original data collected from an external data providing apparatus;

a data storage module that stores the original data in a distributed data store based on a user-defined storage rule;

a data processing module that generates processed data by processing the original data, or storage information of the original data, based on a user-defined data processing rule; and

a data management policy module that registers a data distribution rule for distributing and storing the processed data in the distributed data store.

In one embodiment, the data collection module provides a user interface for registering an external system interworking function and user-defined preprocessing functions, including data parsing.

In one embodiment, the data storage module provides a user interface for registering a user-defined original data storage rule.

In one embodiment, when data needs to be extracted from the original data or the original data needs to be processed, the data storage module registers the data processing module and, based on the user-defined data processing rule of that data processing module, either forwards the original data to the data processing module or notifies it of the storage information of the original data.

In one embodiment, the data processing module provides a user interface for registering the user-defined data processing rule and the processed data storage rule.

In one embodiment, the data processing module processes the original data, or the storage information of the original data, in an immediate execution mode or a periodic execution mode based on the user-defined data processing rule.

In one embodiment, the data management policy module provides a user interface for registering and querying the data distribution rule, and when the distributed data store is a key-value distributed data store, the data distribution rule is a key generation rule that allows the processed data to be stored evenly.

According to the present invention as described above, the collection, storage, processing, query, and management functions for the various kinds of data that are generated and collected in large quantities along a business sequence involving multiple stakeholders can be configured according to user definitions, and storage and processing capacity can be increased by adding devices when needed. By providing such a framework, the invention responds flexibly and efficiently to user requirements and data that change or grow arbitrarily, and minimizes cost. In particular, by providing a framework with high scalability and availability in an environment where data is expected to surge over time, large amounts of data can be managed efficiently and services that satisfy many stakeholders can be provided.

FIG. 1 is a diagram showing the configuration of the framework provided by an apparatus for providing a framework for processing large-volume sequentially collected data according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of a system using the apparatus for providing a framework for processing large-volume sequentially collected data according to an embodiment of the present invention.
FIG. 3 is a diagram showing another example of a system using the apparatus for providing a framework for processing large-volume sequentially collected data according to an embodiment of the present invention.
FIG. 4 is a diagram showing an example of expanding the data storage space of a distributed store according to an embodiment of the present invention.
FIGS. 5A and 5B are diagrams for explaining a user-defined data storage scheme in a key-value distributed store according to an embodiment of the present invention.
FIGS. 6A and 6B are diagrams comparing situations in which data is distributed and stored in a key-value distributed store.
FIG. 7 is a diagram showing a data processing method in the framework providing apparatus according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and fully conveys the scope of the invention to those of ordinary skill in the art to which the invention pertains; the invention is defined only by the scope of the claims. The terms used in this specification are for describing the embodiments and are not intended to limit the present invention. In this specification, the singular form also includes the plural unless the context specifically states otherwise.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In assigning reference numerals to the components in each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the invention.

The framework providing apparatus according to the present invention includes a non-volatile memory storing program code that provides a framework for processing large-volume sequentially collected data, and at least one processor that executes the program code. The apparatus providing the framework may be a server apparatus that performs roles such as data collection, processing and/or refinement, and query between an external data providing apparatus, a distributed data store, and a user terminal.

In the framework providing apparatus according to the present invention, execution of the program code may be handled in units of a 'task' or 'process', and executing the program code produces the framework for processing large-volume sequentially collected data. FIG. 1 shows the configuration of the framework provided by the apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the framework for processing large volumes of data collected along a business sequence according to an embodiment of the present invention includes a data collection module 101, a data storage module 102, a data processing module 104, a data management policy module 105, a module configuration manager 106, and a data service provider 107.

The data collection module 101 collects data by interworking with external data providing systems that supply various kinds of data, and performs a user-defined preprocessing operation on the collected data (hereinafter, original data). The preprocessed original data is passed to the data storage module 102.

The data collection module 101 provides a user interface for registering an external data providing system interworking function and user-defined preprocessing functions such as data parsing.

As a result, even if the method of interworking with an external data providing system changes, the user can respond to the external change by modifying only the interworking function of the data collection module 101, without modifying the data storage module 102.

The data collection module 101 can operate in a pull mode or a push mode. In pull mode, it collects data by periodically sending data requests to the external data providing system. In push mode, it waits to receive data and collects the data when the external data providing system sends it.
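The difference between the two collection modes can be illustrated with a small sketch. The class `DataCollector` and its method names are hypothetical, and the periodic scheduler that would drive pull mode is left out; the sketch only assumes that both modes feed the same preprocessing function and the same downstream sink.

```python
from typing import Callable, List


class DataCollector:
    """Hypothetical collector supporting both push and pull collection."""

    def __init__(self, preprocess: Callable[[str], str], sink: Callable[[str], None]) -> None:
        self.preprocess = preprocess  # user-defined preprocessing function
        self.sink = sink              # stands in for handing data to the storage module

    def on_push(self, payload: str) -> None:
        """Push mode: the external system calls this whenever it sends data."""
        self.sink(self.preprocess(payload))

    def poll_once(self, fetch: Callable[[], List[str]]) -> int:
        """Pull mode: one request cycle; a scheduler would call this periodically."""
        payloads = fetch()
        for payload in payloads:
            self.sink(self.preprocess(payload))
        return len(payloads)


received: List[str] = []
collector = DataCollector(preprocess=str.strip, sink=received.append)
collector.on_push("  temp=21 ")                                  # external system pushes one item
pulled = collector.poll_once(lambda: ["temp=22 ", " temp=23"])   # collector requests a batch
```

Because the interworking logic lives entirely in `fetch` and the caller of `on_push`, a change on the external side does not touch whatever sits behind `sink`.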

The data storage module 102 stores the original data received from the data collection module 101 in the distributed store 103 and provides an interface for registering a user-defined original data storage rule. By using the data storage module 102 to store, maintain, and manage the original data in the distributed store 103, the user can respond smoothly when a new service must be provided or an existing service must be changed, and can also use the original data to verify errors in data processing results. In addition, when data must be extracted from, or processing must be applied to, the original data passed to the data storage module 102, the user can register a data processing module 104 with the data storage module 102. When a data processing module 104 is registered with it, the data storage module 102 either forwards the original data immediately or sends a notification of the storage information (data provider, data occurrence time, data storage path, and so on) of the original data stored in the distributed store 103, depending on the data processing mode of that data processing module 104.
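The forward-or-notify behavior described here can be sketched as follows. All names (`DataStorageModule`, `StorageInfo`, `register_immediate`, `register_notified`) and the path layout are assumptions made for illustration; the source only states that a registered processing module receives either the original data itself or its storage information.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StorageInfo:
    """Storage information named in the text: provider, occurrence time, path."""
    provider: str
    occurred_at: str
    path: str


class DataStorageModule:
    """Hypothetical storage module with registered processing modules."""

    def __init__(self) -> None:
        self.store: Dict[str, dict] = {}  # stands in for the distributed store
        self._immediate: List[Callable[[dict], None]] = []
        self._notified: List[Callable[[StorageInfo], None]] = []

    def register_immediate(self, handler: Callable[[dict], None]) -> None:
        self._immediate.append(handler)

    def register_notified(self, handler: Callable[[StorageInfo], None]) -> None:
        self._notified.append(handler)

    def save(self, record: dict) -> StorageInfo:
        # Assumed storage rule: path built from provider and occurrence time.
        path = f"raw/{record['provider']}/{record['occurred_at']}"
        self.store[path] = record
        for handler in self._immediate:
            handler(record)  # forward the original data itself
        info = StorageInfo(record["provider"], record["occurred_at"], path)
        for handler in self._notified:
            handler(info)    # send only the storage information
        return info


forwarded: List[dict] = []
notices: List[StorageInfo] = []
module = DataStorageModule()
module.register_immediate(forwarded.append)
module.register_notified(notices.append)
info = module.save({"provider": "sensor-net", "occurred_at": "2013-11-28T10:00", "value": 21})
```

A handler that receives only `StorageInfo` can read the record from the store later, which suits processing that runs on a schedule rather than per record.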

Meanwhile, the distributed store 103 stores and manages the original data that the data storage module 102 requests it to store and the processed data generated by the data processing module 104. The distributed store 103 is built by connecting multiple information devices capable of storing data, either directly or over a network, and its data storage space and processing capacity can be increased simply by adding devices whenever needed.

FIG. 4 relates to a procedure for expanding storage space when the distributed store 401 that forms part of the framework described in the present invention runs short of data storage space. FIG. 4A shows an environment in which a distributed store 401 consisting of multiple distributed nodes (402-1, 402-2, ..., 402-n) has been built. FIG. 4B shows an embodiment in which, when the amount of data surges, the data storage space is expanded by adding a new distributed node (402-n+1) to the distributed store 401.

The data processing module 104 processes the data forwarded or notified by the data storage module 102 based on a user-defined data processing rule, and stores the resulting processed data in the distributed store 103.

The data processing module 104 provides a user interface for registering a user-defined data processing function and a processed data storage rule. A user-defined data processing rule registered with the data processing module 104 can run in a periodic execution mode or an immediate execution mode, depending on the user's setting.

In the periodic execution mode, the user-defined data processing function runs at fixed time intervals; in the immediate execution mode, it runs as soon as data is forwarded or notified.
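The two execution modes can be contrasted with a short sketch. `DataProcessingModule`, `receive`, and `tick` are hypothetical names; in particular, `tick` stands in for whatever timer would fire at each interval in periodic mode.

```python
from typing import Callable, List


class DataProcessingModule:
    """Hypothetical processing module with immediate and periodic modes."""

    def __init__(self, rule: Callable[[int], int], mode: str = "immediate") -> None:
        if mode not in ("immediate", "periodic"):
            raise ValueError(f"unknown mode: {mode}")
        self.rule = rule              # user-defined data processing rule
        self.mode = mode
        self.pending: List[int] = []  # items waiting for the next interval
        self.results: List[int] = []

    def receive(self, item: int) -> None:
        """Called when data is forwarded or notified."""
        if self.mode == "immediate":
            self.results.append(self.rule(item))  # process right away
        else:
            self.pending.append(item)             # defer until the next tick

    def tick(self) -> int:
        """Periodic mode: process everything buffered since the last interval."""
        batch, self.pending = self.pending, []
        self.results.extend(self.rule(item) for item in batch)
        return len(batch)


immediate = DataProcessingModule(rule=lambda x: x * 2, mode="immediate")
immediate.receive(3)

periodic = DataProcessingModule(rule=lambda x: x * 2, mode="periodic")
periodic.receive(1)
periodic.receive(2)
before_tick = list(periodic.results)  # nothing processed yet
processed_count = periodic.tick()
```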

When storing the processed data generated by the user-defined data processing function in the distributed store 103, the data processing module 104 uses the processed data distribution rule information provided by the data management policy module 105 so that data is not concentrated in a particular region of the distributed store 103. The data management policy module 105 is described in detail below.

To prevent the performance of the distributed store 103 from degrading because processed data generated by the data processing module 104 is concentrated in a particular region of the distributed store 103, the data management policy module 105 provides a user interface for registering and querying data distribution rules that store processed data evenly across the entire distributed store 103. In a key-value distributed store, which is one embodiment of the distributed store 103, data is distributed according to its key, so the user can prevent performance degradation of the distributed store 103 by registering with the data management policy module 105 a key generation rule that stores the data evenly.
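One way to picture registering and querying a key generation rule is a small registry keyed by data type. This is only a sketch: `DataManagementPolicy`, its methods, the `|` separator, and the record fields are assumptions, not an interface defined by the source.

```python
from typing import Callable, Dict

KeyRule = Callable[[dict], str]


class DataManagementPolicy:
    """Hypothetical registry of key generation rules, one per data type."""

    def __init__(self) -> None:
        self._rules: Dict[str, KeyRule] = {}

    def register(self, data_type: str, rule: KeyRule) -> None:
        self._rules[data_type] = rule

    def lookup(self, data_type: str) -> KeyRule:
        return self._rules[data_type]

    def make_key(self, data_type: str, record: dict) -> str:
        return self.lookup(data_type)(record)


policy = DataManagementPolicy()
# Assumed rule: event ID first, then occurrence time.
policy.register("distribution_event", lambda r: f"{r['event_id']}|{r['occurred_at']}")
key = policy.make_key("distribution_event", {"event_id": "E01", "occurred_at": "20131128T1000"})
```

Keeping the rule in one place lets both the writer and any later reader of the data derive keys the same way.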

An example of how data is stored according to a key generation rule in a key-value distributed store is described below with reference to FIG. 5. FIG. 5 is a diagram for explaining a user-defined data storage scheme in a key-value distributed store according to an embodiment of the present invention.

Table 501 in FIG. 5A is an example in which the key of a distribution event is composed in the order of occurrence time, then distribution event ID, so the most recently generated distribution events are stored at the end of the table. Table 502 in FIG. 5B is an example in which the key of a distribution event is composed in the order of distribution event ID, then occurrence time, so distribution events are stored clustered by distribution event ID.

FIG. 6 compares how distribution event data is distributed and stored within a key-value distributed store 601 under the example of table 501 in FIG. 5A and under the example of table 502 in FIG. 5B. The key-value distributed store 601 consists of multiple distributed nodes 602, and the large volume of distribution event data is distributed across the data storage regions 603 of those distributed nodes 602.

In the case of FIG. 6A, recently generated distribution event data is concentrated in a particular data storage region 603, so the load of storing distribution events is concentrated on a particular distributed node 602 and the overall performance of the key-value distributed store 601 may degrade. In the case of FIG. 6B, recently generated distribution event data is spread across several distributed nodes 602, so the load of storing distribution events is distributed.
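The contrast between the two key layouts can be reproduced with a toy model. The sketch assumes that the store assigns keys to storage regions by sorted key range, which the source does not state explicitly; the split points, event IDs, and timestamps are made up for illustration.

```python
from bisect import bisect_right
from collections import Counter
from typing import List


def region_of(key: str, split_points: List[str]) -> int:
    """Index of the key-range region that holds the key (assumed range partitioning)."""
    return bisect_right(split_points, key)


# Four events that all occur within a few minutes of each other.
events = [
    ("E01", "20131128T1000"),
    ("E02", "20131128T1001"),
    ("E03", "20131128T1002"),
    ("E01", "20131128T1003"),
]

# Layout of table 501: occurrence time first, then event ID.
time_first_keys = [f"{time}|{event_id}" for event_id, time in events]
# Layout of table 502: event ID first, then occurrence time.
id_first_keys = [f"{event_id}|{time}" for event_id, time in events]

# Hypothetical region boundaries for each layout (three regions each).
time_splits = ["20131110", "20131120"]
id_splits = ["E02", "E03"]

time_first_load = Counter(region_of(k, time_splits) for k in time_first_keys)
id_first_load = Counter(region_of(k, id_splits) for k in id_first_keys)
```

With time-first keys every new write lands in the last region, while ID-first keys spread the same writes over all three regions, which matches the behavior the text attributes to FIG. 6A and FIG. 6B.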

Meanwhile, the module configuration manager 106 provides functions for setting up, creating, querying, and managing a user-defined data processing procedure by setting relationships among the data collection module 101, the data storage module 102, the data processing module 104, and the data management policy module 105.
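Setting relationships among modules can be thought of as building a small directed graph and pushing data through it. The sketch below is one possible reading, not the patented design: `ModuleConfigurationManager`, `add_module`, `connect`, and `run` are hypothetical, modules are reduced to plain functions, and the graph is assumed to be acyclic.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class ModuleConfigurationManager:
    """Hypothetical manager that links modules into a processing procedure."""

    def __init__(self) -> None:
        self.modules: Dict[str, Callable[[Any], Any]] = {}
        self.links: Dict[str, List[str]] = defaultdict(list)

    def add_module(self, name: str, fn: Callable[[Any], Any]) -> None:
        self.modules[name] = fn

    def connect(self, upstream: str, downstream: str) -> None:
        for name in (upstream, downstream):
            if name not in self.modules:
                raise ValueError(f"unknown module: {name}")
        self.links[upstream].append(downstream)

    def run(self, entry: str, data: Any) -> Dict[str, Any]:
        """Feed data into the entry module and propagate results downstream."""
        outputs: Dict[str, Any] = {}
        queue = [(entry, data)]
        while queue:  # assumes the configured relationships contain no cycles
            name, value = queue.pop(0)
            result = self.modules[name](value)
            outputs[name] = result
            queue.extend((nxt, result) for nxt in self.links[name])
        return outputs


manager = ModuleConfigurationManager()
manager.add_module("collect", lambda raw: [float(x) for x in raw.split(",")])
manager.add_module("store", lambda values: values)  # pass-through stand-in for storage
manager.add_module("average", lambda values: sum(values) / len(values))
manager.connect("collect", "store")
manager.connect("store", "average")
outputs = manager.run("collect", "1,2,3")
```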

The data service provider 107 provides data services using the data stored in the distributed store 103 as a result of running the user-defined data processing procedure created by the module configuration manager 106.

For example, the data service provider 107 uses the data distribution rule information registered with the data management policy module 105 to efficiently search for the data needed to provide a data service.
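As an illustration of why knowing the distribution rule helps a query, suppose keys are built as event ID followed by occurrence time and the store keeps keys in sorted order. Both points are assumptions for this sketch, as are the `|` separator and the helper name `scan_prefix`. A service can then find every record for one event with a prefix scan instead of reading the whole table.

```python
from bisect import bisect_left
from typing import List


def scan_prefix(sorted_keys: List[str], prefix: str) -> List[str]:
    """Return all keys starting with prefix, using the sort order to skip ahead."""
    start = bisect_left(sorted_keys, prefix)
    matches: List[str] = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order guarantees no later key can match
        matches.append(key)
    return matches


stored_keys = sorted([
    "E02|20131128T1001",
    "E01|20131128T1003",
    "E03|20131128T1002",
    "E01|20131128T1000",
])
history_of_e01 = scan_prefix(stored_keys, "E01|")
```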

Hereinafter, an example of a system that uses the framework providing apparatus for large-scale sequentially collected data processing according to an embodiment of the present invention is described with reference to FIG. 2.

FIG. 2 is a conceptual diagram showing an example of a system that uses the framework providing apparatus described in the present invention. It covers a sensor information query service that uses sensor data generated by a sensor network 251, and a distribution information query service that uses distribution event data generated by a logistics/distribution network 252.

First, a user who wants to provide a sensor information query service using the framework described in FIG. 1 can create a sensor data processing process 210 and a sensor information query service 220.

Using the framework, the user creates a sensor data collection module 211, a sensor data storage module 212, a sensor data processing module 213, a time-series pattern extraction module 214, a sensor data management policy module 215, and a time-series pattern management policy module 216, and then configures the sensor data processing process 210 by using the module configuration manager 106 of FIG. 1 to establish relationships among the created modules.

Meanwhile, a 'module' as described with reference to FIG. 2 is an intermediate program or intermediate process that the user creates through the framework of FIG. 1. It is packaged as a single unit that performs a specific function and is in a form that other programs can reuse. A final process, such as the sensor data processing process or the distribution event processing process, is created by establishing relationships among a set of such modules.
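The idea of building a final process by establishing relationships among modules can be sketched as follows. This is an illustrative assumption, not the patented implementation: the class `ModuleConfigManager`, its methods, and the linear three-module chain are all hypothetical, and ordering is delegated to the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


class ModuleConfigManager:
    """Hypothetical stand-in for the module configuration manager 106."""

    def __init__(self):
        self.modules = {}    # module name -> callable
        self.upstream = {}   # module name -> set of modules it receives data from

    def register(self, name, func):
        self.modules[name] = func
        self.upstream.setdefault(name, set())

    def connect(self, source, target):
        # Relationship setting: `target` consumes the output of `source`.
        self.upstream[target].add(source)

    def build_process(self):
        # Order the modules so that every module runs after its upstream module.
        order = list(TopologicalSorter(self.upstream).static_order())

        def process(data):
            for name in order:
                data = self.modules[name](data)
            return data

        return order, process


manager = ModuleConfigManager()
manager.register("collect", str.strip)          # toy preprocessing
manager.register("store", lambda value: value)  # toy pass-through storage
manager.register("transform", int)              # toy processing
manager.connect("collect", "store")
manager.connect("store", "transform")
order, process = manager.build_process()
print(order, process("  7 "))
```

The sketch handles only a linear chain. A process in which one module feeds several others would need explicit routing of each module's output.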

Returning to FIG. 2, the user can create the sensor data collection module 211 by registering a sensor network interworking function and a sensor data preprocessing function through the user interface that the data collection module 101 of FIG. 1 provides for registering external data providing system interworking functions and data preprocessing functions.

The user can also create the sensor data storage module 212 by registering a sensor data storage rule through the original data storage rule registration user interface provided by the data storage module 102 of FIG. 1.

The user can also use the user interface provided by the data processing module 104 of FIG. 1, which allows registration of user-defined data processing functions and processed data storage rules, to create the sensor data processing module 213 by registering a sensor data processing function and a sensor data storage rule, and to create the time-series pattern extraction module 214 by registering a time-series pattern extraction function and a time-series pattern storage rule.

Because the sensor data processing function must act immediately on sensor data collected from the sensor network, the user sets the sensor data processing module 213 to the immediate execution mode. Because the time-series pattern extraction function operates on sensor data collected over a certain period, the user can set the time-series pattern extraction module 214 to the periodic execution mode.
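The difference between the immediate and periodic execution modes can be sketched as below. The class `ProcessingModule`, the mode strings, and the `on_data`/`on_tick` hooks are hypothetical names chosen for illustration; how periods are scheduled is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProcessingModule:
    name: str
    func: Callable[[List[dict]], object]
    mode: str  # "immediate" or "periodic"
    buffer: List[dict] = field(default_factory=list)
    results: List[object] = field(default_factory=list)

    def on_data(self, record: dict) -> None:
        """Called whenever a new record arrives."""
        if self.mode == "immediate":
            # Immediate execution: process each record as soon as it arrives.
            self.results.append(self.func([record]))
        else:
            # Periodic execution: only accumulate until the next tick.
            self.buffer.append(record)

    def on_tick(self) -> None:
        """Called by a scheduler at the end of each period."""
        if self.mode == "periodic" and self.buffer:
            self.results.append(self.func(self.buffer))
            self.buffer = []


convert = ProcessingModule("convert", lambda rs: rs[0]["value"] * 2, "immediate")
average = ProcessingModule(
    "average", lambda rs: sum(r["value"] for r in rs) / len(rs), "periodic"
)
for value in (1, 2, 3):
    convert.on_data({"value": value})
    average.on_data({"value": value})
average.on_tick()
print(convert.results, average.results)
```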

The user can also use the data distribution rule registration user interface provided by the data management policy module 105 of FIG. 1 to create the sensor data management policy module 215 by registering a sensor data distribution rule, and to create the time-series pattern management policy module 216 by registering a time-series pattern data distribution rule.

The user can also create a sensor data statistics query service 221 and a time-series pattern query service 222 by using the functions, provided by the data service provider 107 of FIG. 1, for accessing the data management policy module 105 and the distributed store 103.

In response to a sensor data statistics query request, the sensor data statistics query service 221 uses the sensor data distribution rule information registered in the sensor data management policy module 215 to locate the storage area of the sensor data that satisfies the query condition, and then computes statistics over the sensor data in that area, so that the request is handled efficiently.
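A minimal sketch of how a registered distribution rule might let a query service locate the relevant storage areas is given below. The rule itself (one storage area per sensor per day), the `Store` class, and the `average` function are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean

SECONDS_PER_DAY = 86_400


def distribution_rule(sensor_id: str, timestamp: int) -> str:
    """Hypothetical rule: one storage area per sensor per day."""
    return f"{sensor_id}/{timestamp // SECONDS_PER_DAY}"


class Store:
    """Toy store that groups records by storage area and counts area scans."""

    def __init__(self):
        self.areas = defaultdict(list)
        self.areas_scanned = 0

    def put(self, sensor_id: str, timestamp: int, value: float) -> None:
        self.areas[distribution_rule(sensor_id, timestamp)].append((timestamp, value))

    def scan(self, area: str):
        self.areas_scanned += 1
        return self.areas.get(area, [])


def average(store: Store, sensor_id: str, start: int, end: int):
    """Mean value for one sensor over [start, end), scanning only matching areas."""
    values = []
    first_day = start // SECONDS_PER_DAY
    last_day = (end - 1) // SECONDS_PER_DAY
    for day in range(first_day, last_day + 1):
        area = distribution_rule(sensor_id, day * SECONDS_PER_DAY)
        for ts, value in store.scan(area):
            if start <= ts < end:
                values.append(value)
    return mean(values) if values else None
```

Because the query applies the same rule that was used when writing, it can skip every area that cannot contain matching data instead of scanning the whole store.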

In response to a time-series pattern query request, the time-series pattern query service 222 uses the time-series pattern data distribution rule information registered in the time-series pattern management policy module 216 to locate the storage area of the time-series patterns that satisfy the query condition, so that the request is handled efficiently.

If a new kind of sensor data needs to be collected in the future, the user can modify and extend the sensor data processing process 210 by changing the sensor data preprocessing function of the sensor data collection module 211 and the sensor data storage rule of the sensor data storage module 212. If a new information query service using sensor data is needed, the user can extend the sensor information query service 220 by adding a data processing module, a data management policy, and a data service provider for that service to the framework.

On the other hand, a user who wants to provide a distribution information query service using the framework described in FIG. 1 can create a distribution event processing process 230 and a distribution information query service 240.

The user creates a distribution event collection module 231, a distribution event storage module 232, a distribution event processing module 233, a distribution history extraction module 234, a distribution event management policy module 235, and a distribution history management policy module 236, and configures the distribution event processing process 230 by using the module configuration manager 106 of FIG. 1 to establish relationships among the created modules.

Using the interface provided by the data collection module 101 of FIG. 1 for registering external system interworking functions and data preprocessing functions, the user can create the distribution event collection module 231 by registering an HTTP server function for collecting distribution events from the logistics/distribution network and a preprocessing function for distribution event messages in XML format.
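A preprocessing function for XML distribution event messages might look like the following sketch. The message layout and the element names (`productId`, `location`, `action`, `eventTime`) are invented for illustration, since the text does not specify the XML schema, and the HTTP server part is omitted. Parsing uses the standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def preprocess_event(xml_text: str) -> dict:
    """Turn one XML distribution event message into a flat record.

    The element names below are hypothetical; a real deployment would
    follow whatever schema the logistics/distribution network uses.
    """
    root = ET.fromstring(xml_text)
    return {
        "product_id": root.findtext("productId"),
        "location": root.findtext("location"),
        "action": root.findtext("action"),
        "time": root.findtext("eventTime"),
    }


sample = (
    "<event>"
    "<productId>P-1</productId>"
    "<location>WH-7</location>"
    "<action>shipping</action>"
    "<eventTime>2013-11-28T09:00:00</eventTime>"
    "</event>"
)
print(preprocess_event(sample))
```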

The user can also create the distribution event storage module 232 by registering a distribution event storage rule through the original data storage rule registration interface provided by the data storage module 102 of FIG. 1.

The user can also use the interface provided by the data processing module 104 of FIG. 1 for registering user-defined data processing functions and processed data storage rules to create the distribution event processing module 233 by registering a distribution event processing function and a distribution event storage rule, and to create the distribution history extraction module 234 by registering a distribution history extraction function and a distribution history storage rule. Because both the distribution event processing function and the distribution history extraction function require real-time processing, the user sets the data processing mode of the distribution event processing module 233 and the distribution history extraction module 234 to immediate execution.

The user can also use the data distribution rule registration interface provided by the data management policy module 105 of FIG. 1 to create the distribution event management policy module 235 by registering a distribution event distribution rule, and to create the distribution history management policy module 236 by registering a distribution history distribution rule.

The user can also create a distribution event query service 241 and a distribution history query service 242 by using the functions, provided by the data service provider 107 of FIG. 1, for accessing the data management policy module 105 and the distributed store 103. In response to a distribution event query request, the distribution event query service 241 uses the distribution event distribution rule information registered in the distribution event management policy module 235 to locate the storage area of the distribution events that satisfy the condition, and searches for distribution events in that area, so that the request is handled efficiently. In response to a distribution history query request for a particular product, the distribution history query service 242 uses the distribution history distribution rule information registered in the distribution history management policy module 236 to locate the area where that product's history data is stored, so that it can respond to the request quickly.

If an inventory information query service for goods held in each warehouse is needed in the future, the user can provide it by creating a data processing module, a data management policy module, and a data service provider for that service and having them receive distribution events from the existing distribution event storage module 232.

Meanwhile, the framework providing apparatus according to the present invention can, through the framework, provide a way to process data efficiently in a variety of fields.

FIG. 3 shows various embodiments built around a logistics/distribution data sharing system 301, which uses the framework described in the present invention to collect, process, and query the distribution event data generated as goods move through the logistics/distribution network 302.

The logistics/distribution data sharing system 301 provides an environment in which many logistics/distribution participants can share the distribution events generated throughout the whole process from manufacture to sale, which makes it possible to build various kinds of application systems. The pharmaceutical distribution history tracking system 303 of FIG. 3 receives distribution events for pharmaceuticals from the logistics/distribution data sharing system 301 and monitors their distribution status in real time, which helps prevent illegal pharmaceuticals from entering circulation. The inventory management system 304 of FIG. 3 receives the inbound and outbound quantities of goods at each logistics/distribution hub from the logistics/distribution data sharing system 301 and tracks the inventory status of each product, so that an inventory management plan can be drawn up before a shortage or surplus occurs. The expiration date management system 305 of FIG. 3 receives distribution events for food from the logistics/distribution data sharing system 301 and can prevent expired food from being distributed and harming consumers' health. The illegal transaction tracking system 306 of FIG. 3 receives transaction data for goods from the logistics/distribution data sharing system 301 and can track illegal transactions involving those goods.

Hereinafter, the data processing method performed in the framework providing apparatus according to an embodiment of the present invention is described in detail with reference to FIG. 7. Each stage of data processing is carried out by an intermediate process created according to user definitions, and the framework provides the user with a user interface for creating those intermediate processes.

Specifically, the framework providing apparatus according to an embodiment of the present invention creates a data collection process based on the external system interworking function and the data preprocessing function registered through the user interface, and creates an original data storage process based on the user-defined original data storage rule registered through the user interface.

The framework providing apparatus also creates at least one original data processing process based on the user-defined data processing function and the processed data storage rule registered through the user interface, registers a data distribution rule for distributing and storing the processed data in the distributed data store, and creates a processed data distribution process based on that data distribution rule.

Finally, the framework providing apparatus creates an integrated process for user-defined data processing by establishing relationships among these processes, and processes data with it.

Referring to FIG. 7, the data collection module 101 first collects data by interworking with external data providing systems that supply various kinds of data, and performs user-defined preprocessing on the collected data (hereinafter, original data). The preprocessed original data is passed to the data storage module 102 (S10).

The data storage module 102 then stores the original of the data received from the data collection module 101 in the distributed store 103 (S20). The data storage module 102 stores, maintains, and manages the original data according to the user-defined original data storage rule. Keeping the original data makes it possible to respond smoothly when a new service must be offered or an existing service must be changed, and the original data can also be used to verify errors in data processing results.

Next, the data processing module 104 processes the data delivered or notified by the data storage module 102 according to the user-defined data processing rule, and stores the resulting processed data in the distributed store 103 (S30).

Meanwhile, to prevent the performance of the distributed store 103 from degrading because processed data generated by the data processing module 104 is stored disproportionately in a particular area of the distributed store 103, the data management policy module 105 provides a user interface for registering and querying data distribution rules that store the processed data evenly across the whole of the distributed store 103 (S40). Once a data distribution rule for distributed storage of the processed data has been registered through the user interface, the data processing module 104 distributes and stores the processed data in the distributed store 103 according to that rule (S50).

Thereafter, when a query request for the original data or processed data stored in the distributed store 103 is issued from the data service provider 107, the data service provider 107 uses the data distribution rule information registered in the data management policy module 105 to search efficiently for the required data, and provides the data service to the user (S60).
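Steps S10 to S60 can be traced end to end in a small sketch. Everything concrete here is an assumption made for illustration: the JSON message format, the Fahrenheit-to-Celsius processing rule, the key layouts, and the hash-based `DistributedStore` are hypothetical stand-ins for user-registered functions and rules.

```python
import hashlib
import json


class DistributedStore:
    """Toy key-value store that places each key on a node by hashing it."""

    def __init__(self, num_nodes: int = 3):
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node(self, key: str) -> dict:
        index = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(self.nodes)
        return self.nodes[index]

    def put(self, key: str, value: dict) -> None:
        self._node(key)[key] = value

    def get(self, key: str):
        return self._node(key).get(key)


def preprocess(raw: str) -> dict:               # S10: user-defined preprocessing
    return json.loads(raw)


def storage_rule(record: dict) -> str:          # S20: original data storage rule
    return f"raw/{record['id']}"


def process(record: dict) -> dict:              # S30: user-defined processing rule
    return {"id": record["id"], "celsius": (record["fahrenheit"] - 32) * 5 / 9}


def distribution_rule(record_id: str) -> str:   # S40: registered distribution rule
    return f"processed/{record_id}"


def run_pipeline(store: DistributedStore, raw_messages) -> None:
    for raw in raw_messages:
        record = preprocess(raw)
        store.put(storage_rule(record), record)
        processed = process(record)
        store.put(distribution_rule(processed["id"]), processed)  # S50


def query(store: DistributedStore, record_id: str):  # S60: rule-guided lookup
    return store.get(distribution_rule(record_id))
```

The query side reuses `distribution_rule`, which mirrors the way the data service provider 107 consults the registered rule to find where processed data was placed.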

Meanwhile, the data processing method in the framework providing apparatus according to the present invention described above can be implemented as computer-readable code on a computer-readable recording medium. Computer-readable recording media include every kind of recording medium that stores data readable by a computer system, for example ROM (read-only memory), RAM (random-access memory), magnetic tape, magnetic disks, flash memory, and optical data storage devices. The computer-readable recording medium can also be distributed over computer systems connected by a computer network, and stored and executed as code that is readable in a distributed manner.

Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be embodied in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Claims

A method for processing data in an apparatus that provides a framework for large-scale sequentially collected data processing, the method comprising:
Performing a user-defined preprocessing operation on original data collected by an external data providing apparatus;
Storing the original data in a distributed data store based on a storage rule defined by a user;
Processing the original data or the storage information of the original data based on a data processing rule defined by a user to generate processed data;
Registering a data distribution rule for distributing and storing the processed data in the distributed data store; and
Distributing and storing the processed data in the distributed data store based on the data distribution rule.

The method according to claim 1, further comprising:
Transmitting the original data or the processed data to a user terminal, by referring to the storage rule defined by the user or the data distribution rule, when there is a user query for the original data or the processed data stored in the distributed data store.

The method of claim 1, wherein performing the user-defined preprocessing operation comprises:
Generating a data collection process based on an external system interworking function and a data preprocessing function registered through a user interface.

The method of claim 1, wherein storing in the distributed data store comprises:
Generating an original data storage process based on a user-defined original data storage rule registered through a user interface.

The method according to claim 1, wherein generating the processed data comprises:
Generating at least one original data processing process based on a user-defined data processing function and a processed data storage rule registered through a user interface.

The method of claim 1, wherein registering the data distribution rule comprises:
Registering a key-value generation rule capable of storing the processed data evenly when the distributed data store is a key-value distributed data store.

A method for processing data in an apparatus that provides a framework for large-scale sequentially collected data processing, the method comprising:
Generating a data collection process based on an external system interworking function and a data preprocessing function registered through a user interface;
Generating an original data storage process based on a user-defined original data storage rule registered through a user interface;
Generating at least one original data processing process based on a user-defined data processing function and a processed data storage rule registered through a user interface;
Registering a data distribution rule for distributing and storing processed data in a distributed data store, and generating a processed data distribution process based on the data distribution rule; and
Generating an integrated process for user-defined data processing by establishing relationships among the respective processes.

An apparatus for providing a framework for large-scale sequentially collected data processing, the apparatus comprising a nonvolatile memory storing program code for providing the framework and at least one processor for executing the program code,
wherein the framework comprises:
A data collection module for performing a user-defined preprocessing operation on original data collected by an external data provider;
A data storage module for storing the original data in a distributed data store based on a storage rule defined by a user;
A data processing module for processing the original data or the storage information of the original data based on a data processing rule defined by a user to generate processed data; and
A data management policy module for registering a data distribution rule for distributing and storing the processed data in the distributed data store.

The apparatus of claim 8, wherein the data collection module
Provides a user interface for registering user-defined preprocessing functions including external system interworking and data parsing.

The apparatus of claim 8, wherein the data storage module
Provides a user interface for registering an original data storage rule according to user definition.

The apparatus of claim 8, wherein the data storage module,
When an operation of extracting necessary data from the original data or an operation of processing the original data is required based on a data processing rule defined by a user in the data processing module, transfers the original data to the data processing module or notifies the data processing module of the storage information of the original data.

The apparatus of claim 8, wherein the data processing module
Provides a user interface for registering the data processing rule defined by the user and a processed data storage rule.

The apparatus of claim 8, wherein the data processing module
Processes the original data or the storage information of the original data in an immediate or periodic manner based on a data processing rule defined by a user.

The apparatus of claim 8, wherein, when the distributed data store is a key-value distributed data store,
The data distribution rule registered by the data management policy module is a key-value generation rule capable of storing the processed data evenly.