KR101797929B1

KR101797929B1 - Method for assigning processes to cores in a many-core platform and method for communication between core processes

Info

Publication number: KR101797929B1
Application number: KR1020160101024A
Authority: KR
Inventors: 송병권; 신인재; 이원형
Original assignee: 서경대학교 산학협력단
Priority date: 2015-08-26
Filing date: 2016-08-09
Publication date: 2017-11-15
Also published as: KR20170026130A; WO2017034200A1

Abstract

The present invention relates to a method for assigning processes to cores in a many-core platform and a method for communication between core processes.
The present invention provides a many-core platform comprising a plurality of cores, the platform including: a communication bus connecting the cores; a user dynamic network that connects the cores and transfers data stored in a register of a first core directly to a register of a second core; and a shared memory accessible by each core, wherein each core includes a core processor, a core bus switch that switches between the communication bus and the user dynamic network, a demultiplexing buffer that stores data received through the user dynamic network, and a cache memory.

Description

The present invention relates to a method for assigning processes to cores in a many-core platform and a method for communication between core processes. More particularly, it relates to such methods that optimize inter-core communication and process assignment when processes are executed on a many-core platform made up of a large number of cores.

Processor development is currently moving away from improving the raw performance of a single core and toward building a processor from multiple cores and optimizing their use. Even small portable devices now carry multi-core CPUs (Central Processing Units).

In line with this trend, processors for high-performance computing are built from dozens of cores. Such a processor is called a many-core processor or a many-core platform.

As the number of cores in a single processor has grown and such processors have become commonplace, interest in parallel processing methods that use a large number of cores efficiently, and the importance of those methods, has increased.

To get the most out of high-performance computing, processes must be distributed efficiently across the cores, and communication between cores must be carried out in a way that does not delay the processing performed by the cores.

Conventional parallel processing methods are increasingly found to have the drawback that they cannot fully draw out the maximum available computing performance, and given current trends this drawback will become more pronounced over time. A new parallel processing method that solves this problem is therefore needed.

Korean Patent Laid-Open Publication No. 10-2012-0066189 (published June 22, 2012)

As many-core platforms have developed, the difficulties of parallel programming have grown steadily, because the parallel structure of the hardware differs from the parallel structure of the software. The present invention is intended to solve this problem. Its object is to provide a method for assigning processes to cores in a many-core platform and a method for communication between core processes that enable efficient parallel processing by assigning cores efficiently and creating a direct communication path between cores.

The above object of the present invention is achieved by a many-core platform in which a plurality of cores, including a first core and a second core, are arranged in rows and columns and are interconnected by a mesh-type communication bus structure, the platform comprising: a user dynamic network that connects the cores and transfers data stored in a register of the first core directly to a register of the second core; and a shared memory accessible by each core, wherein each core includes a core processor, a core bus switch that switches between the communication bus and the user dynamic network, and a cache memory, and wherein the core processor includes a plurality of FIFO demultiplexing queues that store data received through the user dynamic network and a demultiplexing buffer for changing the order of the data stored in the demultiplexing queues.

Another object of the present invention is achieved by a method for communication between core processes in a many-core platform in which a plurality of cores, including a first core and a second core, are arranged in rows and columns and are interconnected by a mesh-type communication bus structure, wherein the first core and the second core are each provided with a plurality of demultiplexing queues in which data is stored, and data stored in a selected one of the demultiplexing queues of the first core is sent directly, over the user dynamic network, to a selected one of the demultiplexing queues of the second core.

As a technical means for achieving the above technical objective, an apparatus for inter-core communication in a many-core platform made up of a large number of cores includes a communication bus arranged as a mesh between the cores, a demultiplexing buffer (demux buffer) in the core that receives the inter-core communication, and demultiplexing queues (demux queues) connected to the demultiplexing buffer.

The communication bus is a mesh-type physical network that connects all cores of the many-core platform in a two-dimensional square grid, and it can carry communication information between cores.

The inter-core communication information is data with a size of one or more core words, and may contain data that a process running on one core passes to a process running on another core. The core word has the word size defined by the architecture of the cores that make up the many-core platform.

The demultiplexing buffer is a device provided individually in each core of the many-core platform. Its output is connected to a plurality of demultiplexing queues, so that communication information received from another core can be passed to a demultiplexing queue.

A demultiplexing queue is a special register connected to the demultiplexing buffer. A core that has received inter-core communication information can read the data it contains by reading the special registers to which the demultiplexing queues are mapped.

In the present invention, this method of direct communication between cores is called the user dynamic network. Inter-core communication is performed primarily for system operation, but the user dynamic network goes a step further: by letting communication among multiple cores be controlled directly, it makes parallel programming on a many-core platform more efficient.

According to the method for assigning processes to cores in a many-core platform and the method for communication between core processes of the present invention, processes are assigned to cores efficiently, so that resources can be put to use through efficient parallel processing in a mesh-structured real-time computing environment. In addition, data communication between processes running on individual cores of the many-core platform can be handled within a short response time.

FIG. 1 is a conceptual diagram showing the mesh structure of a 4x4 many-core platform according to the present invention.
FIG. 2 is a conceptual diagram showing the structure of each core of the mesh-structured many-core platform according to the present invention.
FIG. 3 is a block diagram of a core bus switch and a core processor according to an embodiment of the present invention.
FIG. 4 is a structural diagram of the system stack.
FIG. 5 is a conceptual diagram explaining how threads are assigned in the many-core platform shown in FIG. 1.
FIG. 6 is an exemplary diagram explaining how memory is shared in the many-core platform.
FIG. 7 is a conceptual diagram explaining the dynamic network between core processes over the core bus.
FIG. 8 is an exemplary diagram, as an embodiment of the present invention, of communication in which processes are assigned to cores in the many-core platform and applications running on individual cores use shared memory and the user dynamic network.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail below.

This is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

The inter-core communication bus according to the present invention is a mesh-type physical network that connects all cores of the many-core platform in a two-dimensional square grid, and it can carry communication information between cores.

The inter-core communication information according to the present invention is data with a size of one or more core words, and may contain data that a process running on one core passes to a process running on another core. The core word is designed to have the word size defined by the architecture of the cores that make up the many-core platform.

The demultiplexing buffer according to the present invention is a device provided individually in each core of the many-core platform. Its output is connected to a plurality of demultiplexing queues, so that communication information received from another core can be passed to a demultiplexing queue. A demultiplexing queue is a special register connected to the demultiplexing buffer. A core that has received inter-core communication information can read the data it contains by reading the special registers to which the demultiplexing queues are mapped.

A many-core platform is a system that can provide a parallel processing environment with a large number of homogeneous cores. In a many-core platform, cores that can each execute their own process are connected through a physical communication bus, and each core can send data to other cores. FIG. 1 is a conceptual diagram showing the mesh structure of a 4x4 many-core platform according to the present invention. Each core can communicate with other cores over the communication bus. Core 0 runs the hypervisor and the supervisor and serves as the main core that handles tasks related to system operation; cores 1 through 15 receive work from core 0 over the communication bus and execute it.
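As an illustrative sketch only, the 4x4 mesh described above can be modeled in a few lines of Python. The row-by-row core numbering is an assumption; it is inferred from the later statement that cores 1, 4, and 5 lie close to core 0. The names `core_position` and `mesh_neighbors` are hypothetical and do not come from the specification.

```python
# Assumption: cores are numbered row by row (row-major) on a 4x4 mesh.
MESH_WIDTH = 4
MESH_HEIGHT = 4


def core_position(core_id):
    """Return the (row, column) of a core on the mesh."""
    return divmod(core_id, MESH_WIDTH)


def mesh_neighbors(core_id):
    """Return the IDs of the cores directly linked to core_id by the mesh bus."""
    row, col = core_position(core_id)
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return sorted(
        r * MESH_WIDTH + c
        for r, c in candidates
        if 0 <= r < MESH_HEIGHT and 0 <= c < MESH_WIDTH
    )
```

Under this numbering, a corner core such as core 0 has two direct links, while an interior core such as core 5 has four.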

In the many-core platform, the communication bus that provides data exchange between cores is implemented as a user dynamic network (User Dynamic Network), which takes priority over the shared memory (Shared Memory) implemented on the many-core platform. Each core can independently execute processes such as information processing.

The user dynamic network is optimized for point-to-point data communication between cores that requires minimal latency. Because it is based on a hardware communication bus built for this purpose, data moving from one core to another goes directly from register to register without passing through the cache, which improves performance over conventional inter-core communication methods. Core processors can communicate directly and quickly over the user dynamic network, which is particularly useful for parallel programming.

FIG. 2 is a conceptual diagram showing the structure of each core of the mesh-structured many-core platform according to the present invention. Core 0 includes a core bus switch that connects it to the inter-core communication bus (Bus), a core processor, and a cache memory. Ordinarily, information delivered through the core bus switch is stored in the cache memory, but in the present invention the user dynamic network is used so that core processors communicate with each other in real time.

FIG. 3 is a block diagram of a core bus switch and a core processor according to an embodiment of the present invention. Specifically, it shows the core bus switch and the core 0 processor in detail. The demultiplexing queues store information delivered over the user dynamic network; a core that has received inter-core communication information reads the data it contains by accessing a demultiplexing queue through the register in which that queue's address is recorded. Each core on the two-dimensional mesh creates a number of queues to receive information delivered over the user dynamic network. These queues are called demultiplexing queues, and the address of each queue is recorded in a designated register. Software running on the core's processor can access a demultiplexing queue through the address recorded in the register.

Information sent from core to core over the user dynamic network always carries a specific identification value. When a core receives the information, the identification value is used to decide which demultiplexing queue the information is stored in, and it is removed once the information has been stored in the queue. Therefore, when software accesses a demultiplexing queue through the register, it obtains the information with the header, including the identification value, already removed.
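The receive path described above can be sketched as follows. This is a software model under stated assumptions, not the hardware design: the packet layout `(tag, payload)`, the use of the identification value directly as a queue index, and the names `DemuxReceiver`, `receive`, and `read` are all hypothetical.

```python
from collections import deque


class DemuxReceiver:
    """Toy model of one core's receive side on the user dynamic network."""

    def __init__(self, num_queues):
        # One FIFO per demultiplexing queue.
        self.queues = [deque() for _ in range(num_queues)]

    def receive(self, packet):
        """Route an incoming packet to a queue and drop its header."""
        tag, payload = packet  # assumed layout: (identification value, data)
        if not 0 <= tag < len(self.queues):
            raise ValueError(f"no demultiplexing queue for tag {tag}")
        # Only the payload is stored; the identification value is discarded.
        self.queues[tag].append(payload)

    def read(self, queue_index):
        """Return the oldest payload in a queue, or None if it is empty."""
        queue = self.queues[queue_index]
        return queue.popleft() if queue else None
```

Software calling `read` sees only payloads, which mirrors the statement that the header is gone by the time the queue is accessed.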

When several pieces of information are stored in one demultiplexing queue, the piece stored first may go unused by the software, leaving the pieces that arrived later stuck in the queue and inaccessible. To prevent this, each core creates a demultiplexing buffer. The size of the demultiplexing buffer depends on the implementation. Rather than information always being stored in a demultiplexing queue in the order in which it arrived at the core, the demultiplexing buffer allows the order of the information held in the queue to be changed.
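The specification does not say how the demultiplexing buffer reorders a queue, so the following is only one possible interpretation. It assumes the buffer temporarily holds entries that are skipped over while searching for a wanted entry, and that its capacity limits how far into the queue the search can reach. The function name `take_out_of_order` and its parameters are hypothetical.

```python
from collections import deque


def take_out_of_order(queue, wanted, buffer_size):
    """Remove and return the first entry matching `wanted`, skipping at most
    `buffer_size` older entries; return None if it is not reachable."""
    buffer = []  # stands in for the demultiplexing buffer
    found = None
    while queue and len(buffer) < buffer_size:
        item = queue.popleft()
        if wanted(item):
            found = item
            break
        buffer.append(item)  # park the older entry in the buffer
    # Put the skipped entries back at the head in their original order.
    queue.extendleft(reversed(buffer))
    return found
```

With a queue holding `a, b, c` and a buffer of four entries, asking for `c` returns it and leaves `a, b` queued in order; with a buffer of two entries the search stops before reaching `c` and the queue is left unchanged.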

To control the cores of the many-core platform, there is a hypervisor layer, the layer of the system stack that controls the hardware configuration. FIG. 4 is a structural diagram of the system stack. The hypervisor abstracts and manages not only the many-core platform but all of the hardware in the system. Above the hypervisor is the supervisor layer, which provides applications with input/output devices and with libraries for controlling the system.

When an application runs on the many-core platform, it is efficient to assign threads to cores as close as possible to the main processor. When a new thread is created on the main processor, the thread is assigned to a core located close to core 0 so as to share the work and reduce the load on the core to which the main processor is assigned. FIG. 5 is a conceptual diagram explaining how threads are assigned in the many-core platform shown in FIG. 1. Threads are assigned to cores close to core 0, such as core 1, core 4, and core 5, to share the work. These cores share information through the shared memory implemented on the many-core platform.
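A minimal sketch of this nearest-core policy is shown below. The specification does not define "close", so the distance measure is an assumption: the sketch uses the larger of the row and column offsets, chosen because it ranks cores 1, 4, and 5 as equally near to core 0, matching the example. A count of mesh hops would instead rank cores 2 and 8 level with core 5. Row-major numbering and the names `ring_distance` and `allocate_threads` are likewise hypothetical.

```python
MESH_WIDTH = 4   # assumed 4x4 mesh with row-major core numbering
NUM_CORES = 16


def ring_distance(a, b):
    """Distance between two cores as the larger of the row/column offsets."""
    a_row, a_col = divmod(a, MESH_WIDTH)
    b_row, b_col = divmod(b, MESH_WIDTH)
    return max(abs(a_row - b_row), abs(a_col - b_col))


def allocate_threads(num_threads, main_core=0, busy=()):
    """Pick cores for new threads, nearest to the main core first."""
    free = [c for c in range(NUM_CORES) if c != main_core and c not in busy]
    # Ties are broken by core ID so the result is deterministic.
    free.sort(key=lambda c: (ring_distance(main_core, c), c))
    return free[:num_threads]
```

Calling `allocate_threads(3)` selects cores 1, 4, and 5; marking core 1 as busy shifts the choice to the remaining near cores.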

When an application on the many-core platform uses two or more cores, the cores share information through the shared memory implemented on the many-core platform. FIG. 6 is an exemplary diagram explaining how memory is shared in the many-core platform. Core 0 shares memory with core 1 through the shared memory, and core 2 shares memory with core 3. Such sharing can be created automatically by the system through thread assignment within a single application, but a user can also create shared memory between two or more applications assigned to different cores. All shared memory information is carried over the inter-core communication bus.

For direct communication between cores in the many-core platform, a path that carries the user dynamic network is designated on the core bus. Through this path, core processes can exchange information with each other directly and immediately, without passing through the shared memory or the cache memory. FIG. 7 is a conceptual diagram explaining the dynamic network between core processes over the core bus. In FIG. 7, when core 0 sends information to core 1, core 4, and core 5 over the user dynamic network, core 1, core 4, and core 5 can pass the information on to core 6 either after processing it or unchanged. Information received between cores is stored in demultiplexing queues before being processed, and the queues operate in FIFO order. With the user dynamic network, urgent data can be delivered efficiently and quickly by the system or by a user's application, which makes parallel programming more effective.
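The FIG. 7 flow can be traced with a small simulation. It is a simplification under assumptions: each core is given a single FIFO queue, the payload values are invented, "processing" is modeled as adding 1, and the function name `run_fig7_example` is hypothetical.

```python
from collections import deque


def run_fig7_example():
    # One FIFO per core stands in for a demultiplexing queue (simplification).
    queues = {core: deque() for core in (1, 4, 5, 6)}

    # Core 0 sends one word to each of cores 1, 4, and 5.
    for core in (1, 4, 5):
        queues[core].append(10 * core)

    # Cores 1 and 4 process the word (here: add 1); core 5 forwards it unchanged.
    for core in (1, 4, 5):
        word = queues[core].popleft()
        if core != 5:
            word += 1
        queues[6].append(word)

    # Core 6 drains its queue in arrival (FIFO) order.
    return [queues[6].popleft() for _ in range(len(queues[6]))]
```

Core 6 receives the three words in the order in which they were forwarded, which is the FIFO behavior the description attributes to the queues.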

FIG. 8 is an exemplary diagram, as an embodiment of the present invention, of communication in which processes are assigned to cores in the many-core platform and applications running on individual cores use shared memory and the user dynamic network. Application 0 runs its main process on core 0. The main process assigns the threads it creates to cores 1, 2, 3, and 4 as threads 1, 2, 3, and 4, and together they run as a single application. In this way, application 0 runs on a total of five cores, while applications 1, 2, 3, and 4 run individually on cores 5, 6, 7, and 8, respectively. Application 1 shares memory, using the shared memory scheme, with core 1, to which thread 1 of application 0 is assigned. Likewise, applications 2, 3, and 4 share memory with threads 2, 3, and 4 of application 0, which enables parallel programming. At the same time, applications 1, 2, 3, and 4 can exchange urgent data with the main process of application 0 over the user dynamic network. Information handled this way is processed with priority over the shared memory. Thus, with the present invention, users can freely perform real-time processing and make efficient use of limited memory resources.

The present invention provides a method for communication between core processes in a many-core platform in which a plurality of cores, including a first core and a second core, are arranged in rows and columns and are interconnected by a mesh-type communication bus structure, wherein the first core and the second core are each provided with a plurality of demultiplexing queues in which data is stored, and data stored in a selected one of the demultiplexing queues of the first core is sent directly, over the user dynamic network, to a selected one of the demultiplexing queues of the second core, and wherein the first core is provided with a demultiplexing buffer, the method further comprising, before the data stored in the first core is sent to the second core, fetching the data to be sent to the second core from among a plurality of pieces of data stored in the selected demultiplexing queue of the first core.

In describing the embodiments of this specification, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of the specification.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component. The term "and/or" includes any combination of a plurality of related listed items, or any one of those items.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

The terminology used in this application is used only to describe specific embodiments and is not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The components shown in the embodiments of the present invention are illustrated independently to indicate their different characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. That is, the components are listed separately for convenience of description. At least two components may be combined into one, or one component may be divided into several components that perform its function, and such combined and divided embodiments are also included in the scope of the present invention as long as they do not depart from its essence.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

Claims

1. (Deleted)

2. A many-core platform in which a plurality of cores, including a first core and a second core, are arranged in rows and columns and are interconnected by a mesh-type communication bus structure, the platform comprising:
a user dynamic network that connects the cores of the plurality of cores and transfers data stored in a register of the first core directly to a register of the second core; and
a shared memory accessible by each core of the plurality of cores,
wherein each core of the plurality of cores includes:
a core processor;
a core bus switch that switches between the communication bus and the user dynamic network; and
a cache memory,
wherein the core processor includes:
a plurality of FIFO demultiplexing queues that store data received through the user dynamic network; and
a demultiplexing buffer for changing the order of the data stored in the demultiplexing queues, and
wherein the core processor further includes a register that stores the addresses of the plurality of demultiplexing queues.

3. The many-core platform of claim 2,
wherein the data received through the user dynamic network includes an identification value for determining in which of the plurality of demultiplexing queues the data is to be stored, and the identification value is removed after the data is stored.

4. (Deleted)

5. A method for communication between core processes in a many-core platform in which a plurality of cores, including a first core and a second core, are arranged in rows and columns and are interconnected by a mesh-type communication bus structure,
wherein the first core and the second core are each provided with a plurality of demultiplexing queues in which data is stored, and data stored in a selected one of the demultiplexing queues of the first core is sent directly, over a user dynamic network, to a selected one of the demultiplexing queues of the second core, and
wherein the first core is provided with a demultiplexing buffer, the method further comprising, before the data stored in the first core is sent to the second core, fetching the data to be sent to the second core from among a plurality of pieces of data stored in the selected demultiplexing queue of the first core.