KR102167791B1 - Apparatus for Managing In-memory and Computer-Readable Recording Medium with Program therefor

Info

Publication number: KR102167791B1
Application number: KR1020140150089A
Authority: KR
Inventors: 정유선
Original assignee: 에스케이 텔레콤주식회사
Priority date: 2014-10-31
Filing date: 2014-10-31
Publication date: 2020-10-19
Also published as: KR20160050917A

Abstract

Provided are an in-memory layer management apparatus, and a computer-readable recording medium therefor, for managing an in-memory layer in a distributed processing system that includes one name node storing block mapping information and a plurality of data nodes storing large volumes of data in a distributed manner. The apparatus includes: a request receiving unit that receives an in-memory request, which is a request to store a data block in memory; a mapping information transmission unit that analyzes the in-memory request and obtains mapping information for the data block; and a storage unit that receives, from a client that has received the mapping information, an in-memory command to store the data block in memory, and stores the data block in memory.

Description

In-memory layer management device and computer-readable recording medium therefor {Apparatus for Managing In-memory and Computer-Readable Recording Medium with Program therefor}

The present embodiment relates to an in-memory layer management apparatus and a computer-readable recording medium therefor.

The content described in this section merely provides background information on the present embodiment and does not constitute prior art.

Conventional Hadoop-based data processing depends on the Hadoop Distributed File System (HDFS). When data to be processed by a Hadoop system passes through several processing stages, an intermediate file is written to HDFS at the end of each stage. The resulting disk I/O can degrade the performance of the entire Hadoop system.
[Prior art literature] Korean Patent Application Publication No. 10-2012-0085400 (2012.08.01)

The main object of the present embodiment is to reduce disk I/O and improve processing speed by implementing a distributed in-memory layer in software and storing intermediate files in memory.

A further object is to make it possible to create various data schemas dynamically while providing an environment in which a query language similar to SQL (Structured Query Language) can be used.

According to one aspect of the present embodiment, there is provided an in-memory layer management apparatus for managing an in-memory layer in a distributed processing system that includes one name node storing block mapping information and a plurality of data nodes storing large volumes of data in a distributed manner, the apparatus comprising: a request receiving unit that receives an in-memory request, which is a request to store a data block in memory; a mapping information transmission unit that analyzes the in-memory request and obtains mapping information for the data block; and a storage unit that receives, from a client that has received the mapping information, an in-memory command to store the data block in memory, and stores the data block in memory.

According to another aspect of the present embodiment, there is provided a computer-readable recording medium on which is recorded a program for causing a computer to execute: a process of receiving an in-memory request, which is a request to store a data block in memory; a process of analyzing the in-memory request to obtain mapping information for the data block; and a process of receiving, from a client that has received the mapping information, an in-memory command to store the data block in memory, and storing the data block in memory.

As described above, according to the present embodiment, processing efficiency can be improved by loading frequently accessed small files into memory. Efficiency is further increased by loading into memory the data that every processing job needs in common, such as common code and meta information.

In addition, when an application is implemented so that intermediate files produced during data processing can be loaded into memory, disk I/O is reduced, which improves the overall processing speed of the application.

FIG. 1 is a block diagram illustrating an HDFS system including an in-memory layer management apparatus 100 according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a system including the in-memory layer management apparatus 100 according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating an in-memory layer management method according to another embodiment of the present invention.

Hereinafter, some embodiments of the present invention are described in detail with reference to exemplary drawings. In describing the embodiments, detailed descriptions of related well-known configurations or functions are omitted where they would obscure the gist of the present invention.

FIG. 1 is a block diagram illustrating an HDFS system including an in-memory layer management apparatus 100 according to an embodiment of the present invention.

As shown in FIG. 1, the HDFS system 10 generally includes a plurality of data nodes 11, 12, and 13 that store large volumes of data in a distributed manner, one name node 14 that stores mapping information, which is information on the location of the data nodes 11, 12, and 13 where data blocks are stored, and an app client 130.

The in-memory layer management apparatus 100 may operate within a single data node (e.g., data node 1) among the plurality of data nodes 11, 12, and 13, but is not limited thereto; its components may also be distributed across several nodes.

FIG. 2 is a block diagram illustrating a system including the in-memory layer management apparatus 100 according to an embodiment of the present invention.

The in-memory layer management apparatus 100 according to an embodiment of the present invention may include a request receiving unit 111, a mapping information transmission unit 112, a command transmission unit 130, a storage unit 140, and an in-memory client 150. Depending on the embodiment, some of these components may be omitted, or other components may be added.

The in-memory layer management apparatus 100 is implemented to run in a Hadoop distributed processing system 10 that includes one name node storing block mapping information and a plurality of data nodes storing large volumes of data in a distributed manner.

The request receiving unit 111 receives an in-memory request, which is a request to store a data block in memory. The in-memory request is generated at the command transmission unit 130, for example by user input through a command line interface (CLI).

The mapping information transmission unit 112 analyzes the in-memory request received by the request receiving unit 111, obtains mapping information for the data block to be stored, and transmits it to the command transmission unit 130.

The command transmission unit 130 generates the in-memory request sent to the request receiving unit 111, receives the mapping information from the mapping information transmission unit 112, accesses the data node corresponding to the received mapping information, and transmits to the storage unit 140 a command to store the data block in memory. The command transmission unit 130 may be an app client outside the in-memory layer management apparatus 100, or another application running within the HDFS 10. In the following description it is referred to, by way of example, as the app client 130 rather than the command transmission unit 130.

Here, the request receiving unit 111 and the mapping information transmission unit 112 may be implemented as a single driver 110 module.

The storage unit 140 receives, from the app client 130 that has received the mapping information, the command to store the data block in memory, and stores the data block in memory. The storage unit 140 may reside on a data node different from that of the request receiving unit 111 and the mapping information transmission unit 112.

When the mapping information transmission unit 112 does not hold the relevant meta information, it requests that meta information from the name node 14 to obtain the mapping information.

For reference, when the mapping information transmission unit 112 has obtained the mapping information by requesting the meta information from the name node 14, the driver 110 stores that meta information in memory and uses the stored copy the next time the same meta information is needed.
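The lookup-and-cache behavior of the driver can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `NameNodeStub`, `Driver`, `locate`, and `get_mapping` are hypothetical names, and the name node is modeled as a plain dictionary from block IDs to data node names.

```python
class NameNodeStub:
    """Hypothetical stand-in for the name node 14: maps block IDs to data node names."""

    def __init__(self, block_locations):
        self._block_locations = dict(block_locations)
        self.lookups = 0  # counts how often the name node is actually queried

    def locate(self, block_id):
        self.lookups += 1
        return self._block_locations[block_id]


class Driver:
    """Hypothetical driver 110 that keeps fetched mapping information in memory."""

    def __init__(self, name_node):
        self._name_node = name_node
        self._mapping_cache = {}

    def get_mapping(self, block_id):
        # Ask the name node only when the mapping is not already held in memory.
        if block_id not in self._mapping_cache:
            self._mapping_cache[block_id] = self._name_node.locate(block_id)
        return self._mapping_cache[block_id]


name_node = NameNodeStub({"blk_1": "datanode1", "blk_2": "datanode2"})
driver = Driver(name_node)
first = driver.get_mapping("blk_1")   # fetched from the name node
second = driver.get_mapping("blk_1")  # served from the in-memory copy
```

In this sketch the second call never reaches the name node, which is the effect the paragraph above describes.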

The app client 130 connects to the driver 110 of the in-memory layer management apparatus 100 residing on one node 11 and receives the information needed for in-memory setup. The app client 130 then sends a command to the storage unit 140 residing on one of the data nodes to set up an in-memory area and store data in it. The driver 110 and the storage unit 140 may reside on the same data node or on different data nodes among the data nodes 11, 12, and 13.

In HDFS environments, YARN (Yet Another Resource Negotiator) is generally used as the resource manager. The Hadoop YARN framework allocates resources to each application in Hadoop and distributes them according to rules. Hadoop applications are implemented so that they send commands to YARN to allocate or release resources such as memory or storage space.

The in-memory command generated by the app client 130 may be a command to also store, in a specific in-memory area, specific data that is written during execution of a specific application in the HDFS system 10 and is created as an intermediate file on the disk of a data node in HDFS. In this case, space for the specific in-memory area in which the intermediate file is stored is reserved in advance. The space is reserved by designating that intermediate files generated by that application are to be stored in the specific in-memory area.
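A minimal sketch of such a designation, assuming it is expressed as a path prefix. The class name, the prefix mechanism, and the dictionaries standing in for the disk and the in-memory area are all hypothetical; the text does not specify how the designation is encoded.

```python
class IntermediateFileWriter:
    """Hypothetical writer that mirrors designated intermediate files into memory."""

    def __init__(self, in_memory_prefixes):
        # Path prefixes designated for in-memory storage (an assumed mechanism).
        self._prefixes = tuple(in_memory_prefixes)
        self.disk = {}       # stand-in for files on a data node's disk
        self.in_memory = {}  # stand-in for the reserved in-memory area

    def write(self, path, data):
        self.disk[path] = data  # the ordinary write to disk still happens
        if path.startswith(self._prefixes):
            self.in_memory[path] = data  # designated files are also kept in memory


writer = IntermediateFileWriter(["/jobs/etl/tmp/"])
writer.write("/jobs/etl/tmp/stage1.out", b"partial")
writer.write("/jobs/etl/final/report.out", b"done")
```

Only the file under the designated prefix ends up in the in-memory dictionary; both files are written to the disk stand-in.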

The intermediate file may be generated during the operation of a single application for that application's own later use. Alternatively, when several applications are running, an intermediate file generated by one application may be used by another.

The in-memory client 150 gives an application created in the HDFS distributed processing system access to the in-memory area. For example, one in-memory client 150 is created for each data node that an application accesses.

The in-memory client 150 communicates with the driver 110 to determine which data node holds the in-memory area used by the application, and then communicates directly with that data node to fetch the required in-memory data. If the desired data is not in the in-memory data area, it is read from HDFS, cached, and also stored in memory. Although the in-memory client 150 is described here as communicating with a data node, it actually communicates with an in-memory management worker on that data node. Likewise, wherever this embodiment describes an operation as being performed with a data node, a worker that performs that operation may be provided on the data node.
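The fallback to HDFS on a miss resembles a read-through cache, which can be sketched as below. The class and attribute names are hypothetical, and HDFS and the worker's in-memory area are both modeled as dictionaries rather than real services.

```python
class InMemoryClient:
    """Hypothetical in-memory client 150 talking to one data node's worker."""

    def __init__(self, hdfs_blocks):
        self._hdfs_blocks = hdfs_blocks  # stand-in for blocks readable from HDFS
        self._in_memory = {}             # stand-in for the worker's in-memory area
        self.hdfs_reads = 0              # counts fallbacks to HDFS

    def read(self, block_id):
        if block_id in self._in_memory:
            return self._in_memory[block_id]
        # Miss: fall back to HDFS, then keep a copy in memory for later reads.
        self.hdfs_reads += 1
        data = self._hdfs_blocks[block_id]
        self._in_memory[block_id] = data
        return data


client = InMemoryClient({"blk_7": b"intermediate result"})
first = client.read("blk_7")   # miss: read from the HDFS stand-in
second = client.read("blk_7")  # hit: served from memory
```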

In-memory space on a data node can be secured as follows: when the app client 130 issues a command to secure in-memory space on that data node, the space is secured by making a request to YARN. Such a command may be generated by a user command or by a running application.

One way to load specific data into memory is to configure it so that, as with intermediate files, specific data (e.g., metadata) is stored in the in-memory space when an application generates it. Alternatively, when a user command generates an instruction to load a specific HDFS storage block into the secured in-memory space, the mapping information for that block may be obtained through the driver 110, and a memory load command may be issued to the worker of the data node that manages the block.

Meanwhile, when a data block is stored in memory, it may be stored redundantly on other data nodes in addition to a single data node. When an in-memory area is allocated, the number of replicas and the release period for that area can be set. In-memory areas can be allocated and released on a per-file or per-directory basis.
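One way to picture these per-area settings is a small descriptor holding the replica count and release period. Everything here is an illustrative assumption: the field names, the use of seconds, and the trivial "first N nodes" placement rule are not taken from the text.

```python
from dataclasses import dataclass


@dataclass
class InMemoryRegion:
    """Hypothetical descriptor of an allocated in-memory area."""
    path: str               # file or directory the area is allocated for
    replicas: int           # number of data nodes holding a copy
    release_after_s: float  # release period, in seconds
    allocated_at: float     # allocation time, in seconds

    def is_expired(self, now):
        # The area becomes eligible for release once the period has elapsed.
        return now - self.allocated_at >= self.release_after_s


def choose_replica_nodes(nodes, replicas):
    # Simplest possible placement (an assumption): take the first N nodes.
    if replicas > len(nodes):
        raise ValueError("not enough data nodes for the requested replica count")
    return nodes[:replicas]


region = InMemoryRegion(path="/tmp/job1/", replicas=2,
                        release_after_s=600.0, allocated_at=1000.0)
placement = choose_replica_nodes(["datanode1", "datanode2", "datanode3"],
                                 region.replicas)
```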

Meanwhile, to manage the capacity of the in-memory area, the app client 130 sends a resource monitoring command (or query) to the driver 110, either periodically or aperiodically. On receiving the resource monitoring command, the driver 110 returns the resource monitoring result to the app client 130. If the app client 130 determines that the available in-memory capacity is insufficient, it asks YARN to allocate an additional amount of in-memory space. The command for additional allocation may be received from the user. Alternatively, a dedicated daemon process, the resource management unit 160, may monitor resources and, when they run short (e.g., when the available in-memory capacity is at or below a threshold), automatically send YARN an in-memory request command asking for an additional in-memory allocation of a preset size, so that additional resources are allocated by YARN.
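The threshold rule can be sketched as a single monitoring pass. `ResourceManagerStub` is a hypothetical stand-in that grants every request; it is not the YARN API, and the megabyte figures are made up for illustration.

```python
class ResourceManagerStub:
    """Hypothetical stand-in for YARN; it simply grants every request."""

    def __init__(self):
        self.granted_mb = []

    def request_memory(self, size_mb):
        self.granted_mb.append(size_mb)
        return size_mb


def monitor_once(capacity_mb, used_mb, threshold_mb, increment_mb, resource_manager):
    """Grow the in-memory area by a preset size if free space is at or below the threshold."""
    available_mb = capacity_mb - used_mb
    if available_mb <= threshold_mb:
        capacity_mb += resource_manager.request_memory(increment_mb)
    return capacity_mb


yarn = ResourceManagerStub()
capacity = monitor_once(1024, 500, 128, 256, yarn)      # 524 MB free: no request
capacity = monitor_once(capacity, 950, 128, 256, yarn)  # 74 MB free: request 256 MB
```

A daemon such as the resource management unit 160 would presumably run a pass like this on a schedule; the scheduling itself is left out of the sketch.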

When the driver 110 is first started, it reports its driver ID to YARN and requests resources. The driver 110 sets the number of workers and how much memory to allocate to each worker. Workers may be distributed across several data nodes.

The app client 130 sends the driver ID to YARN, receives the address information of the corresponding driver from YARN, and then communicates with that driver 110 to send it queries.
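The registration and lookup exchange can be sketched with a small registry. `ResourceManagerRegistry` and its methods are hypothetical and do not correspond to real YARN calls; the driver ID, address, and memory sizes are made-up values.

```python
class ResourceManagerRegistry:
    """Hypothetical stand-in for the part of the resource manager that tracks drivers."""

    def __init__(self):
        self.drivers = {}

    def register_driver(self, driver_id, address, workers, memory_per_worker_mb):
        # The driver announces its ID and requests memory for all of its workers.
        self.drivers[driver_id] = {
            "address": address,
            "requested_memory_mb": workers * memory_per_worker_mb,
        }

    def lookup_driver(self, driver_id):
        # The app client exchanges a driver ID for the driver's address.
        return self.drivers[driver_id]["address"]


registry = ResourceManagerRegistry()
registry.register_driver("driver-1", "datanode1:9000",
                         workers=3, memory_per_worker_mb=512)
address = registry.lookup_driver("driver-1")
```

With the address in hand, the app client can contact the driver directly for subsequent queries.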

FIG. 3 is a flowchart illustrating an in-memory layer management method according to another embodiment of the present invention.

The in-memory layer management method according to another embodiment of the present invention includes an in-memory request generation process (S310), an in-memory request reception process (S320), a mapping information generation process (S330), an in-memory command transmission process (S340), and an in-memory storage process (S350).

In the in-memory request generation process (S310), a user or an application generates an in-memory request to store a data block in memory.

In the in-memory request reception process (S320), the in-memory request, which is a request to store the data block in memory, is received.

In the mapping information generation process (S330), the in-memory request is analyzed to obtain mapping information for the data block to be stored, and the mapping information is transmitted to the app client 130.

In the in-memory command transmission process (S340), the app client 130 receives the mapping information, accesses the data node corresponding to the received mapping information, and transmits an in-memory command to store the data block in memory, together with information on the data block to be stored.

In the in-memory storage process (S350), the in-memory command is received and the data block is stored in the in-memory storage space.

The operations in the in-memory request generation process (S310) and the in-memory command transmission process (S340) correspond to the operation of the app client 130, the in-memory request reception process (S320) corresponds to the operation of the request receiving unit 111, the mapping information generation process (S330) corresponds to the operation of the mapping information transmission unit 112, and the in-memory storage process (S350) corresponds to the operation of the storage unit 140. Further detailed description is therefore omitted.
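The five processes S310 to S350 can be traced end to end in a short sketch. All class, method, and variable names are hypothetical, block contents are toy byte strings, and network communication between the components is replaced by direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class StorageUnit:
    """Hypothetical storage unit on one data node."""
    disk_blocks: dict
    in_memory: dict = field(default_factory=dict)

    def handle_in_memory_command(self, block_id):
        # S350: copy the block from the node's disk into its in-memory area.
        self.in_memory[block_id] = self.disk_blocks[block_id]


@dataclass
class Driver:
    """Hypothetical driver 110 (request receiving unit 111 plus mapping unit 112)."""
    block_to_node: dict

    def handle_in_memory_request(self, block_id):
        # S320 and S330: receive the request and return the block's mapping information.
        return self.block_to_node[block_id]


def app_client_store(block_id, driver, storage_units):
    # S310: the app client issues the in-memory request.
    node_name = driver.handle_in_memory_request(block_id)
    # S340: the app client sends the in-memory command to the mapped data node.
    storage_units[node_name].handle_in_memory_command(block_id)
    return node_name


storage_units = {
    "datanode1": StorageUnit(disk_blocks={"blk_1": b"alpha"}),
    "datanode2": StorageUnit(disk_blocks={"blk_2": b"beta"}),
}
driver = Driver(block_to_node={"blk_1": "datanode1", "blk_2": "datanode2"})
target = app_client_store("blk_2", driver, storage_units)
```

Only the data node named in the mapping information loads the block; the other node's in-memory area is left untouched.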

Although FIG. 3 describes processes S310 to S350 as being executed sequentially, this merely illustrates the technical idea of an embodiment of the present invention. A person of ordinary skill in the art to which the embodiment belongs could change the order shown in FIG. 3, or execute one or more of processes S310 to S350 in parallel, without departing from the essential characteristics of the embodiment. FIG. 3 is therefore not limited to a time-series order.

The in-memory layer management apparatus 100 according to an embodiment of the present invention may be any of various devices that include a communication device, such as a communication modem, for communicating with other devices or with wired and wireless networks, a memory for storing the data needed to execute a program, and a microprocessor for executing the program to perform computation and control. According to at least one embodiment, the memory may be a computer-readable recording or storage medium such as random access memory (RAM), read-only memory (ROM), flash memory, an optical disk, a magnetic disk, or a solid state disk (SSD). According to at least one embodiment, the microprocessor may be programmed to selectively perform one or more of the operations and functions described in this specification. According to at least one embodiment, the microprocessor may be implemented, in whole or in part, as hardware such as an application-specific integrated circuit (ASIC) of a particular configuration.

As described above, the in-memory layer management method shown in FIG. 3 may be implemented as a program and recorded on a computer-readable recording medium. The computer-readable recording medium on which a program implementing the in-memory layer management method according to the present embodiment is recorded includes every kind of recording device that stores data readable by a computer system. Examples of such computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed over computer systems connected by a network, so that computer-readable code is stored and executed in a distributed manner. In addition, the functional programs, code, and code segments for implementing the present embodiment can be readily inferred by programmers in the technical field to which the embodiment belongs.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the technical field to which the embodiment belongs can make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be interpreted according to the claims below, and all technical ideas within an equivalent scope should be construed as falling within the scope of rights of the present embodiment.

10: HDFS system
11: data node 1
12: data node 2
13: data node n
14: name node
100: in-memory layer management apparatus
110: driver
111: request receiving unit
112: mapping information transmission unit
130: command transmission unit, app client
140: storage unit
150: in-memory client

Claims

1. An apparatus for managing an in-memory layer in a distributed file system that includes a name node storing mapping information, which is information about the location of the data node where a data block is stored, and a plurality of data nodes storing large volumes of data in a distributed manner, the apparatus comprising:
a request receiving unit that receives an in-memory request, which is a request to store the data block in memory;
a mapping information transmission unit that obtains the mapping information from the name node based on the in-memory request; and
a command transmission unit that receives the mapping information from the mapping information transmission unit, accesses a first data node, which is the data node corresponding to the mapping information among the plurality of data nodes, and transmits to the first data node an in-memory command to store the data block in memory.

2. (Deleted)

3. The apparatus of claim 1,
wherein the first data node includes a storage unit that receives the in-memory command and stores the data block in memory.

4. (Deleted)

5. The apparatus of claim 1,
wherein the in-memory command is a command to store, in an in-memory area, specific data to be written to the distributed file system.

6. The apparatus of claim 5, wherein, when an intermediate file generated by one of the applications operating in the distributed file system is designated to be stored in the in-memory area, the intermediate file is stored in the in-memory area.

7. The apparatus of claim 1,
further comprising an in-memory client that provides access to the in-memory area for an application created in the distributed file system.

8. The apparatus of claim 1,
comprising a driver module that includes the request receiving unit and the mapping information transmission unit,
wherein the driver module receives from a user a request for an in-memory area for storing data in memory, transmits the request for the in-memory area to a resource management device of the distributed file system, and is allocated the in-memory area.

9. The apparatus of claim 8,
further comprising a resource management unit that additionally allocates an in-memory area when the available capacity of the in-memory area is less than or equal to a threshold.

10. (Deleted)

11. A computer-readable recording medium on which is recorded a program for causing a computer to execute a method by which an apparatus for managing an in-memory layer manages the in-memory layer in a distributed file system that includes a name node storing mapping information, which is information about the location of the data node where a data block is stored, and a plurality of data nodes storing large volumes of data in a distributed manner, the method comprising:
receiving, from a user, an in-memory request, which is a request to store the data block in memory;
obtaining the mapping information from the name node based on the in-memory request; and
receiving the mapping information, accessing a first data node corresponding to the mapping information among the plurality of data nodes, and transmitting to the first data node an in-memory command to store the data block in memory.