KR20210106929A

KR20210106929A - Apparatus for distributed processing through remote direct memory access and method for the same

Info

Publication number: KR20210106929A
Application number: KR1020210107834A
Authority: KR
Inventors: 안신영; 임은지; 최용석; 우영춘; 최완
Original assignee: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
Priority date: 2018-01-17
Filing date: 2021-08-17
Publication date: 2021-08-31
Also published as: KR102292389B1; KR20190087783A; KR102372424B1

Abstract

According to one embodiment of the present invention, there is provided a distributed processing apparatus through remote direct memory access, comprising: a communication unit that transmits and receives data by directly accessing remote memories provided in a plurality of shared memory servers constituting a shared memory server cluster; a shared memory access management unit that, for a shared memory buffer composed of shared memory buffer segments allocated from the remote memories, allocates in a memory a local shared memory area of the same size as the shared memory buffer and synchronizes the local shared memory area with the shared memory buffer; a memory mapping table management unit that manages a memory mapping table between the shared memory buffer and the local shared memory area; and a calculation unit that performs a given calculation on the local shared memory area.

Description

Distributed processing device and method through remote direct memory access

The present invention relates to an apparatus and method for distributed processing through remote direct memory access, and more particularly, to a distributed processing apparatus and method that directly access a virtual shared memory buffer composed of the memories of shared memory servers in a distributed processing framework.

Distributed parallel processing performs large-scale data analysis quickly by using multiple computational resources in parallel at the same time. Processes that run in a distributed, parallel manner on multiple computation nodes must share data with one another, and MPI (Message Passing Interface) is a representative data sharing method. However, when all distributed processing processes continuously update and reference the entire set of processing data asynchronously, rather than some processes passing only part of the data to one another as messages, sharing the data in the form of shared memory is more advantageous than the MPI method.

Processes that perform distributed processing on a distributed processing platform must frequently exchange large amounts of shared data with one another, and the resulting communication overhead accounts for a very large share of overall distributed processing performance and processing time. The higher the communication overhead, the longer the computational processors (e.g., CPUs and GPUs) of a computation node sit waiting, which shows up as lower resource utilization. The communication overhead is high because most communication protocol stacks, including TCP/IP, incur protocol processing overhead as messages sent by an application process pass through multiple protocol layers, and because one or more memory copies occur during protocol processing. It is therefore necessary to reduce the communication overhead of distributed processing through remote direct memory access (RDMA).

The background art described above is technical information that the inventors possessed in order to derive the present invention or acquired in the course of deriving it, and it is not necessarily known art disclosed to the general public before the filing of the present application.

Korean Patent Laid-Open Publication No. 10-2006-0009244

An object of the present invention is to provide a distributed processing apparatus and method through remote direct memory access that share distributed processing data through remote direct memory access.

Another object of the present invention is to provide a distributed processing apparatus and method that cluster a plurality of shared memory servers, configure a shared memory buffer by allocating shared memory buffer segments from each of the shared memory servers, and access the shared memory buffer through remote direct memory access.

One embodiment of the present invention provides a distributed processing apparatus through remote direct memory access, comprising: a communication unit that transmits and receives data by directly accessing remote memories provided in a plurality of shared memory servers constituting a shared memory server cluster; a shared memory access management unit that, for a shared memory buffer composed of shared memory buffer segments allocated from the remote memories, allocates in a memory a local shared memory area of the same size as the shared memory buffer and synchronizes the shared memory buffer with the local shared memory area; a memory mapping table management unit that manages a memory mapping table between the shared memory buffer and the local shared memory area; and a calculation unit that performs a given calculation on the local shared memory area.

In this case, the shared memory access management unit may itself create the shared memory buffer and share shared memory buffer information with other distributed processing apparatuses constituting the same distributed processing framework, or may receive shared memory buffer information corresponding to the shared memory buffer created by one of the other distributed processing apparatuses.

In this case, the shared memory access management unit may create the shared memory buffer by calculating the size of the shared memory buffer segment corresponding to each of the shared memory servers and requesting the shared memory servers to create and allocate the shared memory buffer segments.
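The text does not say how the per-server segment sizes are calculated. As a minimal sketch, assuming the simplest policy of an even split with the remainder spread over the first servers, the calculation could look like the following. The function name `segment_sizes` and the even-split policy are illustrative assumptions, not the method described in the patent.

```python
def segment_sizes(total_size: int, num_servers: int) -> list[int]:
    """Split total_size bytes into num_servers segment sizes that sum to total_size.

    Assumption: an even split; the patent only states that a size is
    calculated for each shared memory server.
    """
    if num_servers <= 0:
        raise ValueError("at least one shared memory server is required")
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    base, extra = divmod(total_size, num_servers)
    # The first `extra` servers each take one additional byte so that the
    # segment sizes add up exactly to the requested buffer size.
    return [base + (1 if i < extra else 0) for i in range(num_servers)]
```

Whatever policy is used, the sizes must add up to the size of the shared memory buffer, because the local shared memory area is allocated with that same total size.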

In this case, when data in the local shared memory area has been changed by the calculation, the shared memory access management unit may copy the data of the local shared memory area to the shared memory buffer to synchronize data with the remote memories, and when data in the shared memory buffer has been changed by the other distributed processing apparatuses, it may copy the data of the shared memory buffer to the local shared memory area to synchronize data with the remote memories.

In this case, the shared memory access management unit may perform a data accumulation operation between two or more shared memory buffers.

In this case, the shared memory access management unit may perform the data accumulation operation by requesting an accumulation operation from the shared memory servers and receiving, from the shared memory servers, the results of performing the accumulation operation on the shared memory buffer segments.

In this case, the shared memory access management unit may end use of the shared memory buffer by requesting the shared memory servers to release and delete the shared memory buffer segments and, upon receiving the results of those requests, releasing and deleting the local shared memory area, and the memory mapping table management unit may delete the memory mapping table when use of the shared memory buffer has ended.
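The ordering of the teardown steps can be sketched as follows. This is an illustration only: `FakeSharedMemoryServer` and `close_shared_buffer` are hypothetical names, the servers are simulated in-process, and the choice to free local state only when every server reports success is an assumption (the text says only that the local area is released upon receiving the results).

```python
class FakeSharedMemoryServer:
    """Hypothetical in-process stand-in for one shared memory server."""

    def __init__(self):
        self.segments = {}  # buffer key -> segment bytes

    def create_segment(self, key, size):
        self.segments[key] = bytearray(size)

    def release_segment(self, key):
        # Returns True when a segment existed and was deleted.
        return self.segments.pop(key, None) is not None


def close_shared_buffer(key, servers, local_areas, mapping_tables):
    # Step 1: ask every server to release and delete its segment.
    results = [server.release_segment(key) for server in servers]
    # Step 2: once the results are in, drop the local shared memory area
    # and the mapping table (assumption: only if every release succeeded).
    if all(results):
        local_areas.pop(key, None)
        mapping_tables.pop(key, None)
    return results
```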

Another embodiment of the present invention provides a shared memory server comprising: a communication unit that transmits and receives data to and from a plurality of distributed processing apparatuses through remote direct memory access; a memory that the distributed processing apparatuses can access directly; and a shared memory management unit that configures a shared memory buffer together with other shared memory servers constituting the same shared memory server cluster.

In this case, the shared memory buffer may be synchronized, using a memory mapping table, with a local shared memory area allocated in each of the distributed processing apparatuses with the same size as the shared memory buffer.

In this case, the shared memory management unit may receive, from the distributed processing apparatus, a request to create and allocate a shared memory buffer segment together with size information of the shared memory buffer segment for creating the shared memory buffer, and may configure the shared memory buffer by creating and allocating the shared memory buffer segment.

In this case, when data in a particular local shared memory area has been changed by a calculation, the shared memory buffer may be synchronized with the data of the changed local shared memory area, and the remaining local shared memory areas may then be synchronized with the changed data.

In this case, the shared memory management unit may receive, from the distributed processing apparatus, a request for a data accumulation operation between two or more shared memory buffers, perform the accumulation operation on the shared memory buffer segments targeted by the data accumulation operation, and return the result to the distributed processing apparatus.

In this case, the shared memory management unit may receive a request to release and delete the shared memory buffer segment, sent by the distributed processing apparatus in order to end use of the shared memory buffer, release and delete the shared memory buffer segment, and return the result to the distributed processing apparatus so that the distributed processing apparatus releases and deletes the local shared memory area and deletes the memory mapping table.

Another embodiment of the present invention provides a distributed processing method through remote direct memory access, comprising: allocating in a memory, for a shared memory buffer composed of shared memory buffer segments allocated from remote memories provided in a plurality of shared memory servers constituting a shared memory server cluster, a local shared memory area of the same size as the shared memory buffer; managing a memory mapping table between the shared memory buffer and the local shared memory area; synchronizing the shared memory buffer with the local shared memory area by directly accessing the remote memories to transmit and receive data; and performing a given calculation on the local shared memory area.

In this case, the method may further comprise obtaining shared memory buffer information, either by directly creating the shared memory buffer and sharing the shared memory buffer information with other distributed processing apparatuses constituting the same distributed processing framework, or by receiving shared memory buffer information corresponding to the shared memory buffer created by one of the other distributed processing apparatuses.

In this case, obtaining the shared memory buffer information may comprise: calculating the size of the shared memory buffer segment corresponding to each of the shared memory servers; and creating the shared memory buffer by requesting the shared memory servers to create and allocate the shared memory buffer segments.

In this case, the synchronizing may comprise, when data in the local shared memory area has been changed by the calculation, copying the data of the local shared memory area to the shared memory buffer to synchronize data with the remote memories, and, when data in the shared memory buffer has been changed by the other distributed processing apparatuses, copying the data of the shared memory buffer to the local shared memory area to synchronize data with the remote memories.

In this case, the method may further comprise performing a data accumulation operation between two or more shared memory buffers.

In this case, performing the data accumulation operation may comprise: requesting an accumulation operation from the shared memory servers; and receiving, from the shared memory servers, the results of performing the accumulation operation on the shared memory buffer segments.

In this case, the method may further comprise ending use of the shared memory buffer by requesting the shared memory servers to release and delete the shared memory buffer segments and, upon receiving the results of those requests, releasing and deleting the local shared memory area and deleting the memory mapping table.

According to the present invention, the distributed processing apparatus and method through remote direct memory access share distributed processing data through remote direct memory access, which can effectively reduce the communication overload that occurs during distributed processing.

In addition, according to the present invention, a plurality of shared memory servers are clustered, a shared memory buffer is configured by allocating shared memory buffer segments from each of the shared memory servers, and the distributed processing apparatus accesses the shared memory buffer through remote direct memory access, so that memory data can be managed efficiently without any separate synchronization work between the shared memory servers.

FIG. 1 is a diagram illustrating the configuration of a distributed processing system through remote direct memory access according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an example of the distributed processing apparatus through remote direct memory access shown in FIG. 1.
FIG. 3 is a block diagram illustrating an example of the shared memory server shown in FIG. 1.
FIG. 4 is a diagram illustrating a method of configuring a shared memory buffer according to an embodiment of the present invention.
FIG. 5 is an operation flowchart illustrating a distributed processing method through remote direct memory access according to an embodiment of the present invention.
FIG. 6 is an operation flowchart illustrating an example of the step of creating and allocating the shared memory buffer shown in FIG. 5.
FIG. 7 is an operation flowchart illustrating an example of the step of releasing and deleting the shared memory buffer shown in FIG. 5.
FIG. 8 is a diagram illustrating a data accumulation operation method for shared memory buffers according to an embodiment of the present invention.
FIG. 9 is an operation flowchart illustrating the data accumulation operation method for shared memory buffers shown in FIG. 8.

Since the present invention can be modified in various ways and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail. The effects and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described below in detail together with the drawings. Repeated descriptions, and detailed descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention, are omitted. The embodiments of the present invention are provided to explain the present invention more completely to those of ordinary skill in the art. Accordingly, the shapes and sizes of elements in the drawings may be exaggerated for clearer description.

However, the present invention is not limited to the embodiments disclosed below, and all or some of the embodiments may be selectively combined and implemented in various forms. In the following embodiments, terms such as first and second are used not in a limiting sense but to distinguish one component from another. A singular expression includes the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" mean that the features or components described in the specification are present, and do not exclude in advance the possibility that one or more other features or components may be added.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIG. 1 is a diagram illustrating the configuration of a distributed processing system 100 through remote direct memory access according to an embodiment of the present invention.

Referring to FIG. 1, in the distributed processing system 100 through remote direct memory access according to an embodiment of the present invention, a plurality of distributed processing apparatuses 110 through remote direct memory access are interconnected with a plurality of shared memory servers 120 via a network 130 that supports remote direct memory access (RDMA). Here, the shared memory servers 120 constitute one shared memory server cluster 140.

For a shared memory buffer composed of shared memory buffer segments allocated from the remote memories provided in the plurality of shared memory servers constituting the shared memory server cluster, the distributed processing apparatus 110 through remote direct memory access according to an embodiment of the present invention allocates in its memory a local shared memory area of the same size as the shared memory buffer, and synchronizes the shared memory buffer with the local shared memory area using a memory mapping table between the two. It then processes a given or input calculation on the local shared memory area.

Here, the plurality of distributed processing apparatuses 110 through remote direct memory access may be included in one distributed processing framework. In addition, one or more distributed processing apparatuses 110 may be configured as one computation node to provide a computation function.

In this case, each distributed processing apparatus 110 through remote direct memory access may be classified either as a master distributed processing apparatus, which takes the lead in initializing and controlling distributed processing jobs in the distributed processing framework, or as a slave (worker) distributed processing apparatus, which performs calculations under the control of the master distributed processing apparatus.

In this case, the master distributed processing apparatus creates a shared memory buffer for storing data on the shared memory servers 120 and delivers shared memory buffer information to the slave distributed processing apparatuses, so that all of the distributed processing apparatuses 110 can access the same memory segment areas on the shared memory servers 120. Here, the shared memory buffer information may include the total size of the shared memory buffer, a shared memory buffer creation key, information on the shared memory buffer segment created on each shared memory server, and the like.
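The shared memory buffer information listed above (total size, creation key, per-server segment information) could be represented roughly as below. The class and field names are hypothetical, and the consistency check is an assumption added for illustration; the patent does not define a concrete record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentInfo:
    server: str  # hypothetical identifier of the shared memory server
    size: int    # length in bytes of the segment created on that server


@dataclass(frozen=True)
class SharedBufferInfo:
    key: str                           # shared memory buffer creation key
    total_size: int                    # total size of the shared memory buffer
    segments: tuple[SegmentInfo, ...]  # one entry per shared memory server

    def __post_init__(self):
        # Assumed invariant: the segments together make up the whole buffer.
        if sum(s.size for s in self.segments) != self.total_size:
            raise ValueError("segment sizes must add up to total_size")
```

A master would build one such record after creating the buffer and pass it to each worker, which then has what it needs to allocate a local area of `total_size` bytes.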

In this case, when data in the local shared memory area has been changed by a calculation, the distributed processing apparatus 110 through remote direct memory access may copy the data of the local shared memory area to the shared memory buffer to synchronize data with the remote memories. Likewise, when data in the shared memory buffer has been changed by a calculation of another distributed processing apparatus 110, it may copy the data of the shared memory buffer to the local shared memory area to synchronize data with the remote memories.
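The two synchronization directions can be illustrated with a small in-process simulation. This is a sketch under stated assumptions: the remote segments are plain `bytearray` objects rather than RDMA-registered memory, `push` stands in for RDMA writes and `pull` for RDMA reads, and the class name `LocalSharedRegion` is hypothetical.

```python
class LocalSharedRegion:
    """Local area of the same total size as the simulated remote segments."""

    def __init__(self, remote_segments):
        self.remote = remote_segments  # shared list of bytearrays ("servers")
        self.local = bytearray(sum(len(seg) for seg in remote_segments))

    def _spans(self):
        # Yield (local offset, remote segment) pairs in buffer order.
        offset = 0
        for seg in self.remote:
            yield offset, seg
            offset += len(seg)

    def push(self):
        # Local area -> shared memory buffer (stands in for RDMA write).
        for offset, seg in self._spans():
            seg[:] = self.local[offset:offset + len(seg)]

    def pull(self):
        # Shared memory buffer -> local area (stands in for RDMA read).
        for offset, seg in self._spans():
            self.local[offset:offset + len(seg)] = seg
```

For example, if apparatus A writes bytes into its local area and calls `push()`, apparatus B sees them in its own local area after calling `pull()`, even when the written range straddles two segments.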

The shared memory server 120 is an apparatus that provides shared memory in the distributed processing framework.

Here, the plurality of shared memory servers 120 may constitute one shared memory server cluster. In addition, one or more shared memory servers 120 may be configured as one memory service node to provide a shared memory service.

In this case, each of the plurality of shared memory servers 120 may create and allocate a shared memory buffer segment in response to a request to create and allocate a shared memory buffer, thereby providing a virtual shared memory buffer in which the individual shared memory buffer segments are joined together. Here, the distributed processing apparatus 110 can access the shared memory buffer directly through the remote direct memory access support network 130.

In this case, the shared memory buffer may be synchronized with the local shared memory areas of the distributed processing apparatuses 110. That is, synchronization may be performed through RDMA read and write operations.

In this case, the shared memory server 120 may provide an accumulation operation function between shared memory buffer segments.
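A server-side accumulation between segments might be sketched as follows. The details are assumptions made for illustration: segments hold double-precision values, accumulation means element-wise addition into a destination segment, and `SegmentStore` and `accumulate_buffers` are hypothetical names.

```python
from array import array


class SegmentStore:
    """Stands in for one shared memory server holding one segment per buffer key."""

    def __init__(self):
        self.segments = {}

    def create(self, key, values):
        self.segments[key] = array("d", values)

    def accumulate(self, dst_key, src_keys):
        # Element-wise add each source segment into the destination segment.
        dst = self.segments[dst_key]
        for key in src_keys:
            src = self.segments[key]
            if len(src) != len(dst):
                raise ValueError("segments must have the same length")
            for i, value in enumerate(src):
                dst[i] += value
        return len(dst)  # number of elements accumulated, returned as the result


def accumulate_buffers(servers, dst_key, src_keys):
    # The requester sends only buffer keys; each server works on its own segments.
    return [server.accumulate(dst_key, src_keys) for server in servers]
```

The point of doing this on the servers is that only the request and the result cross the network, not the segment data itself.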

The remote direct memory access support network 130 is a network that provides communication between the plurality of distributed processing apparatuses 110 and the plurality of shared memory servers 120, and it enables the distributed processing apparatuses 110 to access the memory of the shared memory servers 120 directly.

That is, in a high-performance computing cluster environment connected by a high-performance network 130 that supports remote direct memory access, the distributed processing apparatuses 110 can directly access a virtually contiguous shared memory buffer formed by combining physical memory segments of multiple shared memory servers 120, which accelerates data sharing between the distributed processing apparatuses and makes it more efficient.
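Treating several physical segments as one virtually contiguous buffer implies a translation from a buffer offset to a (server, segment offset) pair. A minimal sketch of such a lookup is shown below, assuming the segments are laid out back to back in server order; the function names and the layout are illustrative assumptions, and the patent's memory mapping table may hold different fields.

```python
from bisect import bisect_right
from itertools import accumulate


def segment_starts(sizes):
    # starts[i] is the offset in the virtual buffer where segment i begins.
    return [0] + list(accumulate(sizes))[:-1]


def locate(sizes, offset):
    """Map a virtual buffer offset to (server index, offset inside that segment)."""
    if not 0 <= offset < sum(sizes):
        raise IndexError("offset outside the shared memory buffer")
    starts = segment_starts(sizes)
    index = bisect_right(starts, offset) - 1
    return index, offset - starts[index]


def split_range(sizes, offset, length):
    """Break a virtual range into per-server pieces (server, seg_offset, nbytes)."""
    pieces = []
    while length > 0:
        index, seg_offset = locate(sizes, offset)
        nbytes = min(length, sizes[index] - seg_offset)
        pieces.append((index, seg_offset, nbytes))
        offset += nbytes
        length -= nbytes
    return pieces
```

With segment sizes `[4, 3, 3]`, a 6-byte access starting at offset 2 breaks into three pieces, one per server, each of which would become a separate RDMA read or write.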

Asynchronous deep learning training is a representative distributed processing workload that can share distributed processing data in this shared memory form. Asynchronous deep learning training is a data-parallel deep learning method in which the training data is divided among multiple deep learning processes, each of which trains on its portion and, while training, asynchronously shares what it has learned with the other deep learning processes through a shared data buffer. In this method, each distributed deep learning process updates the deep learning parameters (a collective term for the weights and feature values that are the targets of training) asynchronously, without synchronizing with the other processes, which suits the shared memory structure proposed in the present invention.

The approach proposed in the present invention is partly similar to a parameter server. However, unlike the parameter server approach, in which the server receives gradients from the distributed deep learning processes, actively computes the weight parameters, and updates the deep learning parameters itself, the distributed processing apparatuses here can read and write the shared memory buffer directly through remote direct memory access, without intervention by the shared memory server.

In addition, using only the memory of a single memory server as shared memory limits scalability, whereas configuring a shared memory server cluster provides flexible scalability even for large-scale distributed processing.

FIG. 2 is a block diagram illustrating an example of the distributed processing apparatus 110 through remote direct memory access shown in FIG. 1.

Referring to FIG. 2, the distributed processing apparatus 110 through remote direct memory access according to an embodiment of the present invention includes a control unit 210, a communication unit 220, a memory 230, an operation processing unit 240, a shared memory server access support unit 250, and the like.

In detail, the control unit 210 is a kind of central processing unit and controls the distributed processing through remote direct memory access. That is, the control unit 210 may provide various functions by controlling the memory 230, the operation processing unit 240, the shared memory server access support unit 250, and the like.

Here, the control unit 210 may include any kind of device capable of processing data, such as a processor. A 'processor' may refer to, for example, a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions contained in a program. Examples of such a hardware-embedded data processing device include a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA), but the scope of the present invention is not limited thereto.

The communication unit 220 provides the communication interface needed to transmit and receive signals between the distributed processing apparatus 110 through remote direct memory access and the shared memory servers (see 120 of FIG. 1) over the RDMA-capable network (see 130 of FIG. 1).

Here, the communication unit 220 may be a device that includes the hardware and software needed to transmit and receive signals, such as control signals or data signals, over a wired or wireless connection with other network devices.

The communication unit 220 may directly access the remote memories of the shared memory servers (see 120 of FIG. 1) over the RDMA-capable network (see 130 of FIG. 1) to read and write data.

The memory 230 temporarily or permanently stores data processed by the control unit 210. The memory 230 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.

A local shared memory area of the same size as the shared memory buffer, which is configured from the remote memories of the shared memory servers (see 120 of FIG. 1), may be allocated in the memory 230 and synchronized with the shared memory buffer.

Accordingly, the distributed processing apparatus 110 may share memory with other distributed processing apparatuses by performing operations on the local shared memory area and synchronizing that area with the shared memory buffer.

The operation processing unit 240 performs the operations assigned to the distributed processing apparatus 110 in the distributed processing framework.

The operation processing unit 240 may perform its operations on the local shared memory area.

Through an application programming interface (API), the operation processing unit 240 may explicitly request that the shared memory server access support unit 250 synchronize the local shared memory area with the shared memory buffer, and have the synchronization performed.

Through the API, the operation processing unit 240 may also request that the shared memory server access support unit 250 perform an accumulation operation across a plurality of shared memory buffers.

The shared memory server access support unit 250 supports access to the shared memory servers (see 120 of FIG. 1) through RDMA read and write, and thereby supports synchronization with the shared memory buffer.

The shared memory server access support unit 250 may register a shared memory server cluster to obtain information on the shared memory servers (see 120 of FIG. 1) that constitute the cluster. Here, registering a shared memory server cluster may refer to an initialization process that uses the access information of the shared memory servers entered by the user to establish connections with those servers, so that shared memory buffer segments can be created and allocated. The shared memory server access information may include an IP address, a port number, and the like. In particular, when registering the shared memory server cluster, all distributed processing apparatuses 110 may register all shared memory servers in the same order.

The shared memory server access support unit 250 may calculate, for each shared memory server (see 120 of FIG. 1) registered in the shared memory server cluster, the size of the shared memory buffer segment that contributes to the shared memory buffer. For example, when one shared memory server cluster includes five shared memory servers and the shared memory buffer is configured with a size of 5 GB, the buffer may be divided into five shared memory buffer segments of 1 GB each. The sizes of the shared memory buffer segments need not be identical.
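The size calculation described above can be illustrated with a minimal sketch. The description does not state how the buffer is divided when the size is not a multiple of the number of servers, so the near-even split below, the function name `segment_sizes`, and the treatment of 1 GB as 2**30 bytes are assumptions made only for illustration.

```python
def segment_sizes(buffer_size: int, num_servers: int) -> list[int]:
    """Split buffer_size bytes into one segment size per shared memory server."""
    if num_servers <= 0:
        raise ValueError("at least one shared memory server is required")
    if buffer_size < 0:
        raise ValueError("buffer size must not be negative")
    base, extra = divmod(buffer_size, num_servers)
    # Assumed policy: the first `extra` servers take one more byte each,
    # so the sizes always add up to buffer_size.
    return [base + (1 if index < extra else 0) for index in range(num_servers)]


GB = 1024 ** 3                      # 1 GB treated as 2**30 bytes in this sketch
sizes = segment_sizes(5 * GB, 5)    # the 5 GB, five-server example from the text
```

With five servers and a 5 GB buffer, every entry of `sizes` is 1 GB, matching the example in the description. An uneven input such as `segment_sizes(10, 3)` yields segments of different sizes, which the description also allows.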

The shared memory server access support unit 250 may configure the shared memory buffer by sending the shared memory servers (see 120 of FIG. 1) the calculated shared memory buffer segment sizes together with a request to create and allocate the shared memory buffer segments.

Instead of creating and allocating a shared memory buffer itself, the shared memory server access support unit 250 may obtain the right to access a shared memory buffer that another distributed processing apparatus has configured.

To configure the shared memory buffer, the shared memory server access support unit 250 may send a shared memory buffer creation key to the shared memory servers (see 120 of FIG. 1) together with the request. The creation key may be used to prevent the same shared memory buffer from being created more than once, to check whether a request is valid, or to identify a shared memory buffer segment.

Once the shared memory buffer segments have been created and allocated on all shared memory servers (see 120 of FIG. 1), the shared memory server access support unit 250 may allocate, in the memory 230, a local shared memory area of the same size as the shared memory buffer. Here, the local shared memory area is allocated in actual physical memory, and address information (for example, a virtual address) is returned when the area is allocated.

The shared memory server access support unit 250 may synchronize the local shared memory area with the shared memory buffer. The synchronization may be performed using a memory mapping table.

When data in the local shared memory area has been changed by the operation processing unit 240, the shared memory server access support unit 250 may synchronize by copying the data of the local shared memory area to the shared memory buffer through RDMA. Alternatively, it may copy only the changed data.

When data in the shared memory buffer has been changed by an operation of another distributed processing apparatus, the shared memory server access support unit 250 may synchronize by copying the data of the shared memory buffer to the local shared memory area through RDMA. Alternatively, it may copy only the changed data.
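The two synchronization directions, and the option of copying only changed data, might be sketched as follows. This is an in-process simulation under stated assumptions: plain `bytearray` objects stand in for the local shared memory area and the remote shared memory buffer, an ordinary slice copy stands in for the RDMA write or read, and the helper `copy_ranges` and its list of (offset, length) ranges are hypothetical. The description does not say how changed regions are tracked.

```python
def copy_ranges(src: bytearray, dst: bytearray, ranges) -> int:
    """Copy the given (offset, length) ranges from src to dst; return bytes copied."""
    if len(src) != len(dst):
        raise ValueError("local area and shared buffer must have the same size")
    copied = 0
    for offset, length in ranges:
        if offset < 0 or length < 0 or offset + length > len(src):
            raise IndexError("range lies outside the buffer")
        dst[offset:offset + length] = src[offset:offset + length]
        copied += length
    return copied


local_area = bytearray(16)      # local shared memory area
shared_buffer = bytearray(16)   # stand-in for the remote shared memory buffer

# The local computation changes bytes 4..7; push only that range
# (an RDMA write in the description).
local_area[4:8] = b"\x01\x02\x03\x04"
pushed = copy_ranges(local_area, shared_buffer, [(4, 4)])

# Another apparatus changes bytes 12..13; pull only that range
# (an RDMA read in the description).
shared_buffer[12:14] = b"\xff\xff"
pulled = copy_ranges(shared_buffer, local_area, [(12, 2)])
```

Copying the whole area corresponds to passing a single range that covers the entire buffer.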

When use of the shared memory buffer ends, the shared memory server access support unit 250 may request the shared memory servers (see 120 of FIG. 1) to release and delete the shared memory buffer segments.

Once all shared memory buffer segments have been released and deleted, the shared memory server access support unit 250 may end use of the shared memory buffer by closing the connections with the shared memory servers (see 120 of FIG. 1), deregistering the shared memory server cluster, and deleting its information.

The shared memory server access support unit 250 may provide a data accumulation function across a plurality of shared memory buffers. For example, to accumulate a first shared memory buffer into a second shared memory buffer, it synchronizes the data of the first shared memory buffer, requests each shared memory server (see 120 of FIG. 1) to accumulate its first shared memory buffer segment into its second shared memory buffer segment, and returns the result once the accumulation has completed on all shared memory servers. Each shared memory server may lock its second shared memory buffer segment for the accumulation and then accumulate from the first shared memory buffer segment into the second.

The memory mapping table management unit 260 manages a memory mapping table between the local shared memory area and the shared memory buffer.

The memory mapping table management unit 260 may include its own storage and keep the memory mapping table there directly, or it may store and manage the memory mapping table in a separate storage or in the memory 230.

When a shared memory buffer is created, the memory mapping table management unit 260 may create a memory mapping table between the shared memory buffer and the local shared memory area.
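The description does not specify the layout of the memory mapping table. As one possible reading, the sketch below assumes each entry records where a segment starts in the local shared memory area, which server holds it (by its position in the registered cluster order), and its length. The names `MappingEntry`, `build_mapping_table`, and `locate` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingEntry:
    local_offset: int   # where this segment starts inside the local shared memory area
    server_index: int   # position of the server in the registered cluster order
    length: int         # size of the shared memory buffer segment in bytes


def build_mapping_table(segment_sizes):
    """Lay the segments out back to back, in cluster registration order."""
    table, offset = [], 0
    for index, size in enumerate(segment_sizes):
        table.append(MappingEntry(offset, index, size))
        offset += size
    return table


def locate(table, local_offset):
    """Return (server_index, offset_in_segment) for one byte of the local area."""
    if local_offset < 0:
        raise IndexError("offset must not be negative")
    starts = [entry.local_offset for entry in table]
    entry = table[bisect_right(starts, local_offset) - 1]
    if local_offset >= entry.local_offset + entry.length:
        raise IndexError("offset lies beyond the shared memory buffer")
    return entry.server_index, local_offset - entry.local_offset


table = build_mapping_table([4, 3, 3])   # three servers, 10-byte buffer
target = locate(table, 5)                # second byte of the second segment
```

Because the table is indexed by registration order, this layout only resolves an offset to the same server on every apparatus if all of them register the servers in the same order, as the description requires.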

When use of the shared memory buffer ends, the memory mapping table management unit 260 may delete the memory mapping table.

FIG. 3 is a block diagram illustrating an example of the shared memory server 120 shown in FIG. 1.

Referring to FIG. 3, the shared memory server 120 according to an embodiment of the present invention includes a control unit 310, a communication unit 320, a memory 330, a shared memory management unit 340, and the like.

In detail, the control unit 310 is a kind of central processing unit and controls the distributed processing through remote direct memory access. That is, the control unit 310 may provide various functions by controlling the memory 330, the shared memory management unit 340, and the like.

Here, the control unit 310 may include any kind of device capable of processing data, such as a processor. A 'processor' may refer to, for example, a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions contained in a program. Examples of such a hardware-embedded data processing device include a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA), but the scope of the present invention is not limited thereto.

The communication unit 320 provides the communication interface needed to transmit and receive signals between the shared memory server 120 and the distributed processing apparatuses through remote direct memory access (see 110 of FIG. 1) over the RDMA-capable network (see 130 of FIG. 1).

Here, the communication unit 320 may be a device that includes the hardware and software needed to transmit and receive signals, such as control signals or data signals, over a wired or wireless connection with other network devices.

The communication unit 320 may allow the distributed processing apparatuses through remote direct memory access (see 110 of FIG. 1) to directly access the memory 330 over the RDMA-capable network (see 130 of FIG. 1) to read and write data.

The memory 330 temporarily or permanently stores data processed by the control unit 310. The memory 330 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.

All or part of the memory 330 may be allocated as a shared memory buffer segment, which forms the shared memory buffer together with the shared memory buffer segments of the other shared memory servers. Here, the shared memory buffer is a virtual memory buffer that links the shared memory buffer segments together; physically, it consists of the shared memory buffer segment areas allocated in the memory 330 of each shared memory server 120.

When data in the local shared memory area of a distributed processing apparatus (see 110 of FIG. 1) changes, the changed data may be synchronized to the shared memory buffer segment allocated in the memory 330.

The shared memory management unit 340 allocates and manages the shared memory buffer segment used to configure the shared memory buffer.

When the shared memory management unit 340 receives a request from a distributed processing apparatus (see 110 of FIG. 1) to create and allocate a shared memory buffer, it may create and allocate a shared memory buffer segment of the given size in the memory 330 and return the allocation information.

The shared memory management unit 340 may receive a shared memory buffer creation key with a request to create a shared memory buffer segment, and may create and allocate the segment and return the allocation information only when the received creation key is not already in use.

The shared memory management unit 340 may also receive a shared memory buffer creation key with a request to access a shared memory buffer segment, and may allow access to that segment when the received creation key matches the information of the requested segment. For example, the shared memory buffer creation key may represent the size of the shared memory buffer segment.
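The key checks on creation and access can be sketched as below. This is a simplified model, not the claimed implementation: the class `SharedMemoryManager`, its method names, the string-valued key, and the choice to compare both the key and the segment size on access are assumptions made for illustration.

```python
class SharedMemoryManager:
    """Server-side bookkeeping of segments, indexed by creation key (hypothetical)."""

    def __init__(self) -> None:
        self._segments: dict[str, bytearray] = {}

    def create_segment(self, key: str, size: int) -> dict:
        # Refuse to create a second segment under a key that is already in use.
        if key in self._segments:
            raise KeyError(f"creation key {key!r} is already in use")
        self._segments[key] = bytearray(size)
        return {"key": key, "size": size}   # allocation information for the caller

    def open_segment(self, key: str, expected_size: int) -> bytearray:
        # Allow access only when the key matches the stored segment information.
        segment = self._segments.get(key)
        if segment is None or len(segment) != expected_size:
            raise PermissionError("creation key does not match an existing segment")
        return segment


manager = SharedMemoryManager()
info = manager.create_segment("job-42", 1024)     # first request creates the segment
segment = manager.open_segment("job-42", 1024)    # a later request gains access
```

A second `create_segment("job-42", ...)` call raises `KeyError`, which models the duplicate-creation check, and an `open_segment` call with an unknown key raises `PermissionError`.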

The shared memory management unit 340 may perform a data accumulation operation between shared memory buffer segments. For example, data may be accumulated from a first shared memory buffer segment into a second shared memory buffer segment by locking the second segment and accumulating the data of the first segment into it. When the accumulation is complete, a result indicating completion may be returned.
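A minimal sketch of the lock-then-accumulate step follows. The description does not define the accumulation itself, so element-wise addition of numeric values is an assumption here, as are the `Segment` class and the use of `threading.Lock` to model locking of the second segment.

```python
import threading


class Segment:
    """A shared memory buffer segment holding numeric values (hypothetical model)."""

    def __init__(self, values):
        self.values = list(values)
        self.lock = threading.Lock()


def accumulate(source: Segment, target: Segment) -> bool:
    """Add source into target element by element while target is locked."""
    if len(source.values) != len(target.values):
        raise ValueError("segments must hold the same number of elements")
    with target.lock:   # lock the second segment for the duration of the update
        for index, value in enumerate(source.values):
            target.values[index] += value
    return True         # result reported back to the requesting apparatus


gradients = Segment([1.0, 2.0, 3.0])        # first shared memory buffer segment
parameters = Segment([10.0, 20.0, 30.0])    # second shared memory buffer segment
done = accumulate(gradients, parameters)
```

Holding the lock only on the target segment mirrors the description, in which the second segment is locked and the first is only read.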

FIG. 4 is a diagram illustrating a method of configuring a shared memory buffer according to an embodiment of the present invention.

Referring to FIG. 4, in the method of configuring a shared memory buffer according to an embodiment of the present invention, a master distributed processing apparatus 410, which leads the creation and allocation of the shared memory buffer, requests the shared memory servers 440, 450, 460, and 470 to create and allocate the shared memory buffer segments that make up the shared memory buffer.

In response, shared memory server 1 (440) creates and allocates shared memory buffer segment 1 (441) and returns its information, shared memory server 2 (450) creates and allocates shared memory buffer segment 2 (451) and returns its information, shared memory server 3 (460) creates and allocates shared memory buffer segment 3 (461) and returns its information, and shared memory server 4 (470) creates and allocates shared memory buffer segment 4 (471) and returns its information.

Once the shared memory buffer segments 441, 451, 461, and 471 have been created and allocated on the shared memory servers 440, 450, 460, and 470, they are linked together to form the virtual shared memory buffer 430. That is, the shared memory buffer 430 physically consists of the shared memory buffer segments 441, 451, 461, and 471 allocated on the shared memory servers 440, 450, 460, and 470. Accordingly, data input to and output from the shared memory buffer 430 means data input to and output from the shared memory buffer segments 441, 451, 461, and 471.
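To illustrate how input and output on the virtual shared memory buffer resolve to input and output on the individual segments, the sketch below links four in-memory segments and lets a single read or write span segment boundaries. The class `VirtualSharedBuffer` is hypothetical, and local `bytearray` objects stand in for segments that, in the description, reside on separate servers and are accessed through RDMA.

```python
class VirtualSharedBuffer:
    """Presents a list of segments as one contiguous buffer (illustrative only)."""

    def __init__(self, segments):
        self.segments = segments   # one bytearray per shared memory server

    def __len__(self):
        return sum(len(segment) for segment in self.segments)

    def _spans(self, offset, length):
        """Yield (segment, offset_in_segment, chunk_length) pieces covering the range."""
        if offset < 0 or length < 0 or offset + length > len(self):
            raise IndexError("access outside the shared memory buffer")
        for segment in self.segments:
            if length == 0:
                break
            if offset >= len(segment):
                offset -= len(segment)   # range starts in a later segment
                continue
            chunk = min(length, len(segment) - offset)
            yield segment, offset, chunk
            offset, length = 0, length - chunk

    def write(self, offset, data):
        position = 0
        for segment, start, chunk in self._spans(offset, len(data)):
            segment[start:start + chunk] = data[position:position + chunk]
            position += chunk

    def read(self, offset, length):
        return b"".join(bytes(segment[start:start + chunk])
                        for segment, start, chunk in self._spans(offset, length))


# Four 4-byte segments, as with segments 441, 451, 461 and 471 in FIG. 4.
buffer_430 = VirtualSharedBuffer([bytearray(4) for _ in range(4)])
buffer_430.write(2, b"abcdef")      # crosses from the first segment into the second
payload = buffer_430.read(2, 6)
```

After the write, the first segment holds the first two bytes of the payload and the second segment holds the remaining four, while the caller sees only one 16-byte buffer.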

The other distributed processing apparatuses that belong to the same distributed processing framework as the master distributed processing apparatus 410 are classified as slave distributed processing apparatuses 420. A slave distributed processing apparatus 420 may obtain the information of the shared memory buffer configured by the master distributed processing apparatus 410 and use the same shared memory buffer.

The master distributed processing apparatus 410 and the slave distributed processing apparatus 420 may each allocate, in their own memory, a local shared memory area (411 and 421, respectively) of the same size as the shared memory buffer. They may then carry out distributed processing in the distributed processing framework by performing input and output on the local shared memory areas 411 and 421.

Each of the local shared memory areas 411 and 421 is kept synchronized with the shared memory buffer 430, so that the distributed processing apparatuses 410 and 420 can share distributed processing data through the shared memory buffer 430. In particular, the distributed processing apparatuses 410 and 420 may perform input and output directly on the shared memory buffer 430 through RDMA to synchronize the local shared memory areas 411 and 421 with the shared memory buffer 430.

FIG. 5 is a flowchart illustrating a distributed processing method through remote direct memory access according to an embodiment of the present invention.

Referring to FIG. 5, in the distributed processing method through remote direct memory access according to an embodiment of the present invention, the distributed processing apparatus through remote direct memory access (see 110 of FIG. 1) registers a shared memory server cluster (S501). The shared memory server cluster may consist of a plurality of shared memory servers (see 120 of FIG. 1).

Next, in the method, the distributed processing apparatus (see 110 of FIG. 1) creates and allocates a shared memory buffer (S503).

The distributed processing apparatus (see 110 of FIG. 1) then performs the given distributed operation and shares distributed processing data by reading and writing data directly in the shared memory buffer through RDMA (S505).

When the distributed processing apparatus (see 110 of FIG. 1) finishes using the shared memory buffer, it releases and deletes the shared memory buffer (S507).

Finally, the distributed processing apparatus (see 110 of FIG. 1) deregisters the shared memory server cluster (S509).
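The order of steps S501 to S509 can be summarized with a small sketch. Everything here is a stand-in: the context manager `shared_memory_session`, the `trace` list used to record the steps, the example server addresses, and the local `bytearray` that replaces the remote segments are all hypothetical. Using a context manager is a design choice of this sketch, not something the description prescribes.

```python
from contextlib import contextmanager


@contextmanager
def shared_memory_session(server_addresses, buffer_size, trace):
    """Walk through S501, S503, S507 and S509 around the caller's work (S505)."""
    cluster = list(server_addresses)            # S501: register the server cluster
    trace.append("S501")
    shared_buffer = bytearray(buffer_size)      # S503: create and allocate the buffer
    trace.append("S503")
    try:
        yield shared_buffer                     # S505 runs in the caller's with-body
    finally:
        shared_buffer.clear()                   # S507: release and delete the buffer
        trace.append("S507")
        cluster.clear()                         # S509: deregister the cluster
        trace.append("S509")


trace = []
# The addresses below are placeholders for user-supplied IP address and port pairs.
with shared_memory_session(["10.0.0.1:7000", "10.0.0.2:7000"], 8, trace) as buffer:
    trace.append("S505")                        # compute, then read/write shared data
    buffer[0:4] = b"\x01\x02\x03\x04"
```

Placing S507 and S509 in the `finally` clause means the buffer is released and the cluster deregistered even if the computation in S505 raises an exception.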

FIG. 6 is a flowchart illustrating an example of the step of creating and allocating the shared memory buffer (S503) shown in FIG. 5.

Referring to FIG. 6, in the step of creating and allocating the shared memory buffer (S503) shown in FIG. 5, the distributed processing apparatus through remote direct memory access (see 110 of FIG. 1) calculates the shared memory buffer segment size for each shared memory server (S601).

Next, the distributed processing apparatus (see 110 of FIG. 1) requests the shared memory servers (see 120 of FIG. 1) to create and allocate the shared memory buffer segments (S603). Here, a request to create and allocate shared memory buffer segments may be used with the same meaning as a request to create and allocate a shared memory buffer.

The distributed processing apparatus (see 110 of FIG. 1) then obtains the information that the shared memory servers (see 120 of FIG. 1) return once the shared memory buffer segments have been created and allocated (S605).

The distributed processing apparatus (see 110 of FIG. 1) checks whether all shared memory buffer segments have been created and allocated (S607).

When all shared memory buffer segments have been created and allocated and the shared memory buffer is thereby configured, the distributed processing apparatus (see 110 of FIG. 1) allocates in its memory a local shared memory area of the same size as the shared memory buffer and updates the memory mapping table between the shared memory buffer and the local shared memory area (S609).
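Steps S601 to S609 might be strung together as in the following sketch. It is only a model of the control flow: `FakeServer` replaces a real shared memory server, the near-even size split and the tuple-based mapping table are assumptions, and no RDMA or network call is made.

```python
class FakeServer:
    """In-memory stand-in for a shared memory server (hypothetical)."""

    def __init__(self):
        self.segments = {}

    def create_segment(self, key, size):
        if key in self.segments:
            return None                          # key already in use: no allocation
        self.segments[key] = bytearray(size)
        return {"key": key, "size": size}        # allocation information


def create_shared_buffer(servers, key, buffer_size):
    # S601: compute one segment size per server (assumed near-even split).
    base, extra = divmod(buffer_size, len(servers))
    sizes = [base + (1 if index < extra else 0) for index in range(len(servers))]
    # S603 and S605: request creation on every server and collect the replies.
    replies = [server.create_segment(key, size)
               for server, size in zip(servers, sizes)]
    # S607: proceed only if every segment was created and allocated.
    if any(reply is None for reply in replies):
        raise RuntimeError("not all shared memory buffer segments were allocated")
    # S609: allocate the local area and record (local offset, server index, size).
    local_area = bytearray(buffer_size)
    mapping_table, offset = [], 0
    for index, reply in enumerate(replies):
        mapping_table.append((offset, index, reply["size"]))
        offset += reply["size"]
    return local_area, mapping_table


servers = [FakeServer() for _ in range(4)]
local_area, mapping_table = create_shared_buffer(servers, "job-1", 10)
```

A real implementation would also need to undo partial allocations when the check in S607 fails; the description does not cover that case, so the sketch simply raises an error.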

FIG. 7 is an operation flowchart illustrating an example of the step of releasing and deleting the shared memory buffer shown in FIG. 5 (S507).

Referring to FIG. 7, in the step of releasing and deleting the shared memory buffer shown in FIG. 5 (S507), the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) requests the shared memory servers (see 120 in FIG. 1) to release and delete the shared memory buffer segments (S701). Here, a request to release and delete shared memory buffer segments may be used with the same meaning as a request to release and delete a shared memory buffer.

In addition, in the step of releasing and deleting the shared memory buffer shown in FIG. 5 (S507), the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) obtains the information that is returned once the shared memory servers (see 120 in FIG. 1) have released and deleted the shared memory buffer segments (S703).

In addition, in the step of releasing and deleting the shared memory buffer shown in FIG. 5 (S507), the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) checks whether all of the shared memory buffer segments have been released and deleted (S705).

In addition, in the step of releasing and deleting the shared memory buffer shown in FIG. 5 (S507), once all of the shared memory buffer segments have been released and deleted so that use of the shared memory buffer has ended, the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) releases the allocated local shared memory area and deletes the corresponding memory mapping table (S707).
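The release and deletion steps (S701 to S707) mirror the creation flow and can be sketched in the same hedged way. The class and function names below, the boolean return convention, and the dictionary-based bookkeeping are assumptions made for illustration only; no real RDMA call is involved.

```python
class FakeSharedMemoryServer:
    """In-process stand-in for a shared memory server (no real RDMA)."""

    def __init__(self, segments):
        self.segments = dict(segments)  # buffer_id -> bytearray

    def delete_segment(self, buffer_id):
        # Returns True when the segment existed and was removed.
        return self.segments.pop(buffer_id, None) is not None


def release_shared_buffer(buffer_id, servers, local_areas, mapping_table):
    # S701 / S703: request deletion on every server and collect the results.
    results = [srv.delete_segment(buffer_id) for srv in servers]

    # S705: proceed only if every segment was released and deleted.
    if not all(results):
        return False

    # S707: free the local shared memory area and drop the mapping entry.
    local_areas.pop(buffer_id, None)
    mapping_table.pop(buffer_id, None)
    return True
```

Keeping the local area and mapping entry until every server has confirmed deletion is a design choice of this sketch; it avoids losing the mapping while a remote segment may still exist.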

FIG. 8 illustrates the operation of a data accumulation method for shared memory buffers according to an embodiment of the present invention.

Referring to FIG. 8, the data accumulation method for shared memory buffers according to an embodiment of the present invention consists of performing a data accumulation operation on the shared memory buffer segments allocated in each of the shared memory servers (810, 820, and 830).

A first shared memory buffer segment 1 (841) and a second shared memory buffer segment 1 (851) are allocated in the first shared memory server (810); a first shared memory buffer segment 2 (842) and a second shared memory buffer segment 2 (852) are allocated in the second shared memory server (820); and a first shared memory buffer segment n (843) and a second shared memory buffer segment n (853) are allocated in the n-th shared memory server (830). The first shared memory buffer segment 1 (841), the first shared memory buffer segment 2 (842), the first shared memory buffer segment n (843), and so on constitute the first shared memory buffer (840). Likewise, the second shared memory buffer segment 1 (851), the second shared memory buffer segment 2 (852), the second shared memory buffer segment n (853), and so on constitute the second shared memory buffer (850).

Each of the shared memory servers (810, 820, and 830) may accumulate the data of its first shared memory buffer segment (841, 842, or 843) of the first shared memory buffer (840) into its second shared memory buffer segment (851, 852, or 853) of the second shared memory buffer (850), thereby performing a data accumulation operation between the shared memory buffers.
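As a rough illustration of the per-server accumulation described above, the sketch below treats "accumulate" as an element-wise, in-place addition of one segment into another. That interpretation, the helper name `accumulate_segment`, and the sample values are assumptions; the specification does not fix the arithmetic or the data type.

```python
def accumulate_segment(src_segment, dst_segment):
    """Add src into dst element-wise, in place (assumed meaning of 'accumulate')."""
    if len(src_segment) != len(dst_segment):
        raise ValueError("segments must have the same length")
    for i, value in enumerate(src_segment):
        dst_segment[i] += value


# Three servers, each holding one segment of buffer 840 and one of buffer 850.
first_buffer = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]         # segments 841, 842, 843
second_buffer = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]  # segments 851, 852, 853

# Each server works only on its own pair of segments, so no data has to
# move between servers to accumulate buffer 840 into buffer 850.
for src, dst in zip(first_buffer, second_buffer):
    accumulate_segment(src, dst)
```

Because the i-th segments of both buffers live on the same server, the whole accumulation decomposes into independent local additions, one per server.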

FIG. 9 is an operation flowchart illustrating the data accumulation method for the shared memory buffers shown in FIG. 8.

Referring to FIG. 9, in the data accumulation method for the shared memory buffers shown in FIG. 8, the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) synchronizes the data of the first shared memory buffer with the local shared memory area (S901).

In addition, in the data accumulation method for the shared memory buffers shown in FIG. 8, the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) requests the shared memory servers (see 120 in FIG. 1) to perform a data accumulation operation from the first shared memory buffer to the second shared memory buffer (S903).

In addition, in the data accumulation method for the shared memory buffers shown in FIG. 8, the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) receives, from the shared memory servers (see 120 in FIG. 1), the result obtained by locking the second shared memory buffer segment and then accumulating the data of the first shared memory buffer segment into the second shared memory buffer segment (S905).

In addition, in the data accumulation method for the shared memory buffers shown in FIG. 8, the distributed processing apparatus through remote direct memory access (see 110 in FIG. 1) checks whether the accumulation operation has been completed for all of the shared memory buffer segments and returns the result of the data accumulation operation (S907).
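The flow of FIG. 9 (S901 to S907) can be modeled in-process as below. Several points are assumptions rather than statements from the specification: synchronization in S901 is taken to mean pushing the local area to the first buffer's segments, accumulation is taken to be element-wise addition, the lock is a `threading.Lock`, and all class and function names are hypothetical. A real implementation would replace `write_first` with an RDMA write.

```python
import threading


class FakeSharedMemoryServer:
    """In-process stand-in for a shared memory server (no real RDMA)."""

    def __init__(self, first_segment, second_segment):
        self.first = list(first_segment)    # segment of the first buffer
        self.second = list(second_segment)  # segment of the second buffer
        self._second_lock = threading.Lock()

    def write_first(self, values):
        # Stand-in for an RDMA write into the first buffer's segment.
        self.first[:] = values

    def accumulate(self):
        # Lock the destination segment, then add the source into it.
        with self._second_lock:
            for i, value in enumerate(self.first):
                self.second[i] += value
        return True


def accumulate_buffers(local_area, servers):
    # S901: synchronize the local area with the first shared memory buffer.
    offset = 0
    for srv in servers:
        n = len(srv.first)
        srv.write_first(local_area[offset:offset + n])
        offset += n

    # S903 / S905: request accumulation and receive each server's result.
    results = [srv.accumulate() for srv in servers]

    # S907: report success only if every segment finished accumulating.
    return all(results)
```

Locking only the destination segment lets several distributed processing apparatuses accumulate into the same second buffer without their additions interleaving on a given server.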

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity of the specification, descriptions of conventional electronic components, control systems, software, and other functional aspects of the systems may be omitted. In addition, the line connections or connecting members between the components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example, and in an actual device they may be represented as various replaceable or additional functional connections, physical connections, or circuit connections. In addition, unless a component is specifically described with terms such as "essential" or "important", it may not be a component that is necessary for applying the present invention.

Therefore, the spirit of the present invention should not be limited to the embodiments described above; not only the claims set forth below but also all scopes equivalent to, or equivalently modified from, those claims fall within the scope of the spirit of the present invention.

100: distributed processing system through remote direct memory access
110: distributed processing apparatus through remote direct memory access
120: shared memory server 130: RDMA-capable network
210: control unit 220: communication unit
230: memory 240: arithmetic processing unit
250: shared memory server access management unit
260: memory mapping table management unit
310: control unit 320: communication unit
330: memory 340: shared memory management unit

Claims

A distributed processing apparatus through remote direct memory access, comprising:
a communication unit that transmits and receives data by accessing remote memories provided in a plurality of shared memory servers; and
a shared memory access management unit that, for a shared memory buffer composed of shared memory buffer segments allocated from the remote memories, allocates in a memory a local shared memory area of a size corresponding to that of the shared memory buffer, and synchronizes the shared memory buffer with the local shared memory area.

The apparatus according to claim 1,
wherein the shared memory access management unit
synchronizes the shared memory buffer and the local shared memory area using a memory mapping table.

The apparatus according to claim 1,
wherein the shared memory buffer
corresponds to a virtual contiguous buffer shared by a plurality of distributed processing apparatuses.

The apparatus according to claim 1,
wherein the shared memory buffer segments
correspond to physical memory segments of the plurality of shared memory servers.

The apparatus according to claim 1,
wherein the shared memory access management unit
calculates the size of the shared memory buffer segment corresponding to each of the shared memory servers, and requests the shared memory servers to create and allocate the shared memory buffer segments.

The apparatus according to claim 1,
wherein the shared memory access management unit
performs a data accumulation operation between a plurality of shared memory buffers.

A shared memory server, comprising:
a communication unit that transmits and receives data to and from a plurality of distributed processing apparatuses through remote direct memory access; and
a memory directly accessible by the distributed processing apparatuses,
wherein a shared memory buffer segment is allocated from the memory, and
the shared memory buffer segment constitutes a shared memory buffer together with shared memory buffer segments of other shared memory servers.

A distributed processing method through remote direct memory access, comprising:
allocating, in a memory, a local shared memory area of a size corresponding to that of a shared memory buffer composed of shared memory buffer segments allocated from remote memories provided in a plurality of shared memory servers; and
synchronizing the shared memory buffer and the local shared memory area by accessing the remote memories to transmit and receive data.