KR102419015B1

KR102419015B1 - Interrupt distribution method for NUMA-based equipment, recording medium storing program for executing the same, and computer program stored in medium for executing the same

Info

Publication number: KR102419015B1
Application number: KR1020210185158A
Authority: KR
Inventors: 함형석
Original assignee: 한화시스템 주식회사
Priority date: 2021-12-22
Filing date: 2021-12-22
Publication date: 2022-07-08

Abstract

The present invention relates to a method for efficiently distributing (allocating) interrupt requests (IRQs) of non-uniform memory access (NUMA)-based equipment and, specifically, to an interrupt distribution method for NUMA-based equipment, a recording medium in which a program for performing the same is stored, and a computer program stored in a medium to perform the same, which can improve system performance by modifying a source code in a Linux kernel and efficiently allocate IRQs in a vector space of each core in a central processing unit (CPU).

Description

Interrupt distribution method of NUMA-based equipment, a recording medium storing a program for implementing the same, and a computer program stored in the medium for implementing the same

The present invention relates to a method for efficiently distributing (allocating) interrupt requests (IRQs) in NUMA (Non-Uniform Memory Access)-based equipment. More specifically, it relates to an interrupt distribution method for NUMA-based equipment that can improve system performance by modifying source code within the Linux kernel so that interrupts are distributed efficiently across the vector space of each core in a central processing unit (CPU), to a recording medium storing a program for implementing the method, and to a computer program stored in a medium for implementing the method.

Recently, multi-core processors for high-performance computing have come into wide use. In a multi-core system equipped with such a processor, how to handle the interrupt requests generated between peripheral devices and cores has emerged as one of the key issues for improving performance.

In general, the Linux kernel allocates interrupts in a space called the vector of each core of the CPU. When an interrupt occurs during system operation, the vector space of each core is used according to the assigned IRQ number.

For example, two IRQ allocation methods are known for interrupt requests in a conventional Ethernet driver. In the first, each port creates as many IRQs as there are cores during system boot and fills them into the core vectors sequentially, from 'Core (0)' to 'Core (K)'. In the second, the system operator directly modifies the 'smp_affinity' value to assign a desired IRQ to a desired core.
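To make the second method concrete: on Linux, the per-IRQ file `/proc/irq/<n>/smp_affinity` takes a hexadecimal CPU bitmask, written as comma-separated 32-bit groups on machines with many cores. The following is a minimal sketch that only builds that mask string. The helper name `affinity_mask` is hypothetical and is not part of the patent or of the kernel; actually writing the value requires root privileges and is not shown.

```python
def affinity_mask(cores):
    """Return the hex bitmask string for a set of core indices.

    Bit i of the mask is set when core i may service the IRQ. Masks wider
    than 32 bits are split into comma-separated 32-bit groups, most
    significant group first.
    """
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    groups = []
    while True:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))
```

For instance, `affinity_mask([1, 3])` yields the string an operator would write to pin an IRQ to cores 1 and 3.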

FIG. 1 is a flowchart illustrating an example of the allocation method used when an interrupt is requested in a conventional Ethernet driver. Here, N, M, and K are natural numbers.

Referring to FIG. 1, the conventional Ethernet driver allocates IRQs on an interrupt request by stepping sequentially through IRQs (0 to N) of Ethernet ports 'Port (0)' through 'Port (M)' and assigning them to 'Core (0) Vector' through 'Core (K) Vector' (S1 to S9).

Thus, the Linux kernel's method of allocating IRQs to the vector space of each core simply assigns IRQs to the core vectors sequentially, without regard to efficiency.

FIG. 2 is an exemplary diagram of interrupt allocation according to the prior art.

As shown in FIG. 2, for example, all IRQs in the list of Ethernet port 'Port (0)' are allocated to 'Core (0)'. The IRQs of 'Port (1)' are allocated to 'Core (0)' and 'Core (1)', and the IRQs of 'Port (2)' are likewise allocated sequentially.
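The sequential fill of FIG. 2 can be sketched as follows. This is an illustration only, not kernel code: the function name and the fixed per-core capacity `slots_per_core` are assumptions, since the source does not state how many vectors a core holds.

```python
def sequential_fill(num_ports, irqs_per_port, num_cores, slots_per_core):
    """Fill core vectors in order: core 0 until full, then core 1, and so on.

    Returns a list with one entry per core; each entry lists the
    (port, irq) pairs placed in that core's vector space.
    """
    cores = [[] for _ in range(num_cores)]
    core = 0
    for port in range(num_ports):
        for irq in range(irqs_per_port):
            # Advance to the next core once the current one is full.
            while core < num_cores and len(cores[core]) >= slots_per_core:
                core += 1
            if core == num_cores:
                raise RuntimeError("no free vector left on any core")
            cores[core].append((port, irq))
    return cores
```

With three ports of four IRQs each, four cores, and an assumed six slots per core, the first two cores end up full and the last two stay empty, which mirrors the lopsided layout the text describes.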

However, in the prior-art interrupt allocation method, an attempt to redistribute IRQs after the IRQ lists of each port have already been allocated results in a failure.

FIG. 3 is an exemplary diagram of the core vectors for the interrupt allocation shown in FIG. 2.

As shown in FIG. 3, in the prior-art method, when a user attempts to redistribute, for example, the IRQs of 'Port (0)' to improve performance, 'Core (1)' has no free vector, so a failure occurs while moving the IRQ that is destined for the vector of 'Core (1)'. As a result, IRQs can be moved smoothly only to cores that have spare capacity.
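The failure mode can be illustrated with a small sketch. The data layout, the function name `move_irq`, and the capacity of two slots per core are all hypothetical and chosen only to keep the example short.

```python
def move_irq(cores, irq, src, dst, slots_per_core):
    """Move one IRQ between cores; return False if the target is full."""
    if len(cores[dst]) >= slots_per_core:
        return False  # no empty vector on the destination core
    cores[src].remove(irq)
    cores[dst].append(irq)
    return True


# Hypothetical layout after a sequential fill with 2 slots per core:
# cores 0 and 1 are full, core 2 is empty.
cores = [[("p0", 0), ("p0", 1)], [("p1", 0), ("p1", 1)], []]
moved_to_full = move_irq(cores, ("p0", 1), src=0, dst=1, slots_per_core=2)
moved_to_free = move_irq(cores, ("p0", 1), src=0, dst=2, slots_per_core=2)
```

The first move is rejected because the destination core is already full, while the second succeeds because the destination still has room.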

Thus, in the prior-art interrupt allocation method, a failure may occur when IRQs are redistributed. The problem may be somewhat less pronounced when the number of cores is small, but it becomes more severe as the numbers of cores and ports increase. It creates a bottleneck in the system and increases interrupt latency, so system resources are not used effectively; as a result, even high-performance equipment delivers only ordinary performance or operates below its specification.

KR 10-2013-0049110 A, 2013. 05. 13., "Method and device for allocating interrupts"
KR 10-1717494 B1, 2017. 03. 13., "Interrupt handling apparatus and method"

The problems of the prior-art interrupt allocation method described above arise in equipment that uses a multi-port Ethernet card with a multi-core CPU, and they arise especially readily when server-class equipment combines a multi-core CPU, a NUMA architecture, and a high-performance Ethernet chipset (ixgbe, i40e, Mellanox, etc.).

Accordingly, an object of the present invention is to provide an interrupt distribution method for NUMA-based equipment, a recording medium storing a program for implementing the method, and a computer program stored in a medium for implementing the method. Rather than simply assigning IRQs to the core vectors sequentially as in the prior art, the method distributes IRQs across the cores at initial IRQ allocation, which prevents the problems identified in the prior-art method, and then improves performance by designating the receive/transmit queues (Rx/Tx queues) in the Ethernet driver.

To achieve the above object, the present invention provides an interrupt distribution method for NUMA-based equipment that allocates a plurality of IRQs (interrupt requests) of each port within NUMA (Non-Uniform Memory Access) equipment to a plurality of core vectors configured in each of a plurality of central processing units (CPUs), the method comprising: (a) allocating the first IRQ in the IRQ list of a port that is connected to a given CPU according to the NUMA environment to the first core vector of that CPU; (b) incrementing the IRQ number of the port by '1'; (c) incrementing the core vector number of the CPU by '1'; and (d) allocating the IRQ with the number incremented in step (b) to the core vector with the number incremented in step (c).

The method may further comprise: (e) comparing the core vector number incremented in step (c) with the number of the last core vector of the CPU; when the incremented number does not exceed the last core vector number, repeating steps (b) through (d) until the current IRQ number incremented in step (b) exceeds the last IRQ number, thereby distributing the IRQs of the port; and when the incremented core vector number exceeds the last core vector number, returning to the first core vector of the CPU and then repeating steps (b) through (d) until the current IRQ number incremented in step (b) exceeds the last IRQ number, thereby distributing the IRQs of the port.
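Steps (a) through (e) for a single port can be sketched as a short loop. This is an illustrative model, not the kernel modification itself; the function name `distribute_port` and the use of plain integer indices for IRQs and core vectors are assumptions.

```python
def distribute_port(num_irqs, num_cores):
    """Assign IRQ 0..num_irqs-1 of one port to cores 0..num_cores-1.

    Returns a list where element i is the core index chosen for IRQ i.
    """
    if num_irqs <= 0 or num_cores <= 0:
        raise ValueError("need at least one IRQ and one core")
    assignment = [0]          # (a) first IRQ goes to the first core vector
    irq, core = 0, 0
    while True:
        irq += 1              # (b) next IRQ number
        if irq > num_irqs - 1:
            break             # every IRQ of the port has been placed
        core += 1             # (c) next core vector number
        if core > num_cores - 1:
            core = 0          # (e) past the last core vector: wrap to the first
        assignment.append(core)  # (d) place the IRQ on that core vector
    return assignment
```

For a port with six IRQs on a four-core CPU, the fifth and sixth IRQs wrap around to the first and second cores.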

To achieve the above object, the present invention also provides an interrupt distribution method for NUMA-based equipment that allocates a plurality of IRQs (interrupt requests) of each port within NUMA (Non-Uniform Memory Access) equipment to a plurality of core vectors configured in each of a plurality of central processing units (CPUs), the method comprising: (a) allocating the first IRQ in the IRQ list of a port that is connected to a given CPU according to the NUMA environment to the first core vector of that CPU; (b) incrementing the IRQ number of the port by '1'; (c) comparing the current IRQ number incremented in step (b) with the last IRQ number of the port; (d) when the comparison in step (c) shows that the IRQ number incremented in step (b) does not exceed the last IRQ number, incrementing the core vector number by '1'; (e) comparing the core vector number incremented in step (d) with the last core vector number; and (f) when the comparison in step (e) shows that the core vector number incremented in step (d) does not exceed the last core vector number, allocating the current IRQ incremented in step (b) to the current core vector incremented in step (d), and when the core vector number incremented in step (d) exceeds the last core vector number, returning to the first core vector of the CPU and allocating the current IRQ incremented in step (b) to that first core vector.

The method may further comprise: (g) after step (f), returning to step (b) and repeating steps (b) through (f) until the comparison in step (c) shows that the current IRQ number incremented in step (b) exceeds the last IRQ number.

The method may further comprise: (h) when, in step (d), the comparison of step (c) shows that the IRQ number incremented in step (b) exceeds the last IRQ number, incrementing the port number by '1' and performing steps (a) through (f) in the same manner for the IRQs of the newly selected port.

To achieve the above object, the present invention also provides a computer-readable recording medium storing a program for implementing the above interrupt distribution method for NUMA-based equipment.

To achieve the above object, the present invention also provides a computer program stored in a computer-readable recording medium for implementing the above interrupt distribution method for NUMA-based equipment.

As described above, the interrupt distribution method for NUMA-based equipment according to the present invention does not simply assign IRQs to the core vectors sequentially. Instead, it distributes IRQs across the cores at initial IRQ allocation, which prevents the problems identified in the prior-art allocation method and thereby improves performance through efficient memory use.

In addition, at IRQ initialization the method allocates the IRQs of each port one at a time to successive cores. This makes efficient use of the core vector space of the CPU for IRQ distribution, allows interrupts to be handled and managed smoothly, frees capacity for other work by minimizing the resources needed for traffic management, and permits flexible interrupt handling.

FIG. 1 is a flowchart illustrating an allocation method used when an interrupt is requested in a conventional Ethernet driver.
FIG. 2 is an exemplary diagram of interrupt allocation according to the prior art.
FIG. 3 is an exemplary diagram of the core vectors for the interrupt allocation shown in FIG. 2.
FIG. 4 is an exemplary diagram of the environment of NUMA-based equipment to which an interrupt distribution method according to an embodiment of the present invention is applied.
FIG. 5 is a flowchart of an interrupt distribution method according to an embodiment of the present invention.
FIG. 6 is an exemplary diagram of an interrupt allocation process according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The present invention is not limited to the embodiments disclosed below and may be implemented in various other forms; the embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art. In the drawings, like reference numerals refer to like elements.

FIG. 4 is an exemplary diagram of the environment of NUMA-based equipment to which an interrupt distribution method according to an embodiment of the present invention is applied. FIG. 5 is a flowchart of an interrupt distribution method according to an embodiment of the present invention.

As shown in FIG. 4, the interrupt distribution method according to an embodiment of the present invention can be applied, for example, to equipment that uses a multi-port Ethernet card with a multi-core CPU, and to server-class equipment that combines a multi-core CPU, a NUMA architecture, and a high-performance Ethernet chipset (ixgbe, i40e, Mellanox, etc.).

For example, the interrupt distribution method according to the embodiment is based on an environment with a NUMA architecture, a 64-core CPU (with Hyper-Threading), and multi-port i40e, running Fedora as the OS. Note that the names of the files and functions containing the relevant code may differ between versions.

FIG. 5 is a flowchart of the interrupt distribution method for NUMA-based equipment according to an embodiment of the present invention, and FIG. 6 is an exemplary diagram of the interrupt allocation process. Here, N, M, and K are natural numbers.

As shown in FIGS. 5 and 6, the interrupt distribution method according to the embodiment allocates the IRQs of each port in order, one per core vector. The IRQs of each port are allocated sequentially across all core vectors, but only across the core vectors of the CPU that corresponds to the port in the NUMA environment.

The interrupt distribution method according to the embodiment is described in detail below.

Referring to FIGS. 5 and 6, each IRQ list (IRQ (0~N)) of the Ethernet ports (Port (0~M)) that correspond to the central processing unit 'CPU (0)' is allocated core by core (Core (0~K) Vector).

Specifically, 'IRQ (0)', the first entry in the IRQ list of 'Port (0)', which is the first of the Ethernet ports corresponding to the first central processing unit 'CPU (0)', is first allocated to 'Core (0) Vector', the first core vector of 'CPU (0)' (S11).

Once 'IRQ (0)' has been allocated, the IRQ number within the first port 'Port (0)' is incremented by '1' (S12), and the incremented IRQ number is compared with 'IRQ (N)', the last IRQ number in the IRQ list, to check whether it exceeds it (S13). If the incremented IRQ number (IRQ (x)) satisfies 'IRQ (0) < IRQ (x) ≤ IRQ (N)', the core vector is incremented by '1' (S14).

Next, the incremented core vector is compared with 'Core (K) Vector', the last core vector in 'CPU (0)', to check whether it exceeds it (S15). If the incremented core vector (Core (x) Vector) satisfies 'Core (0) Vector < Core (x) Vector ≤ Core (K) Vector', the IRQ incremented in step 'S12' is allocated to the core vector incremented in step 'S14' (S16).

The sequence 'S16' → 'S12' → 'S13' → 'S14' → 'S15' is repeated until the core vector (Core (x) Vector) incremented in 'S14' exceeds the last core vector, 'Core (K) Vector'. Through this process, the IRQs of each port are allocated one by one to the first core vector 'Core (0) Vector' through the last core vector 'Core (K) Vector'.

If the core vector (Core (x) Vector) incremented in step 'S14' exceeds the last core vector 'Core (K) Vector', that is, if 'Core (x) Vector > Core (K) Vector' (x = K+1), the process returns to the first core vector 'Core (0)' and allocates the current IRQ to it (S17, S16).

In step 'S13', if the incremented IRQ number (IRQ (x)) exceeds the last IRQ number 'IRQ (N)' (that is, x = N+1), the port is incremented by '1' (S18), and 'S11' → 'S12' → 'S13' → 'S18' → 'S19' is repeated until the incremented port (Port (x)) exceeds the last port 'Port (M)', that is, until 'Port (x) > Port (M)' (x = M+1).
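The flow S11 through S19 described above can be sketched as two nested loops. This is a model of the flowchart under stated assumptions, not the modified kernel source: the function name `distribute`, the dictionary return type, and the `cpu_cores` argument (the core ids of the CPU the ports are attached to) are hypothetical.

```python
def distribute(num_ports, irqs_per_port, cpu_cores):
    """Walk the S11-S19 flow for the ports attached to one CPU.

    cpu_cores lists the core ids of that CPU (hypothetical input; the
    source only says ports map to their own CPU's core vectors).
    Returns {(port, irq): core_id}.
    """
    if num_ports <= 0 or irqs_per_port <= 0 or not cpu_cores:
        raise ValueError("need at least one port, one IRQ and one core")
    placement = {}
    port = 0
    while port <= num_ports - 1:                 # S19: stop after Port (M)
        irq, idx = 0, 0
        placement[(port, irq)] = cpu_cores[idx]  # S11: IRQ (0) -> Core (0)
        while True:
            irq += 1                             # S12
            if irq > irqs_per_port - 1:          # S13: past IRQ (N)?
                break
            idx += 1                             # S14
            if idx > len(cpu_cores) - 1:         # S15: past Core (K)?
                idx = 0                          # S17: back to Core (0)
            placement[(port, irq)] = cpu_cores[idx]  # S16
        port += 1                                # S18
    return placement
```

Each port restarts at the first core vector, as in step S11, so IRQ (i) of every port lands on the same core of that CPU.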

Thus, unlike the conventional method shown in FIG. 2, the interrupt distribution method according to the embodiment allocates 'Port (0) IRQ (0~N)' through 'Port (M) IRQ (0~N)' of the Ethernet ports sequentially across 'Core (0) Vector' through 'Core (K) Vector'. In doing so, and in accordance with the NUMA architecture, an Ethernet port that corresponds to 'CPU (0)' has its IRQs allocated only to the core vectors of 'CPU (0)'.

That is, as shown in FIG. 6, 'IRQ (0)' of the first port 'Port (0)' is allocated to the first core vector 'Core (0) Vector', the second IRQ of 'Port (0)', 'IRQ (1)', is allocated to the second core vector 'Core (1) Vector', and the third, 'IRQ (2)', is allocated to the third core vector 'Core (2) Vector'. Allocation continues sequentially in this manner.

If the number of IRQs of 'Port (0)' exceeds the number of core vectors of the corresponding CPU (IRQ (N) > Core (K) Vector), allocation wraps around, and the excess IRQs are allocated starting again from the first core vector 'Core (0) Vector'. Once the IRQs of the first port 'Port (0)' have been allocated, the next port is allocated in the same way.

Thus, unlike the prior-art method, the interrupt distribution method according to the embodiment allocates 'Port (0) IRQ (0~N)' through 'Port (M) IRQ (0~N)' sequentially across 'Core (0) Vector' through 'Core (K) Vector', so the IRQs of each port are spread evenly over the core vectors. Compared with the prior-art approach of filling the core vectors completely starting from the first one, this makes redistribution easier.

Because the IRQs of each port are distributed evenly over the core vectors, every core retains free space. Redistribution can therefore be performed by mapping cores and IRQs 1:1. For example, the first IRQ of the first port 'Port (0)', 'IRQ (0)', can be redistributed to the first core 'Core (0)', and the second, 'IRQ (1)', to the second core 'Core (1)'.
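The claim that every core keeps free space can be checked with a small count. The sketch below assumes a fixed per-core capacity `slots_per_core`, which the source does not specify, and uses `irq % num_cores` as a compact equivalent of the wrap-around in steps S15 and S17.

```python
from collections import Counter


def free_slots(num_ports, irqs_per_port, num_cores, slots_per_core):
    """Free vector slots per core after round-robin initial distribution."""
    used = Counter()
    for _port in range(num_ports):
        for irq in range(irqs_per_port):
            # IRQ i of every port lands on core (i mod num_cores).
            used[irq % num_cores] += 1
    return [slots_per_core - used[core] for core in range(num_cores)]
```

With three ports of four IRQs each on four cores and an assumed six slots per core, every core keeps three free slots, whereas a sequential fill of the same twelve IRQs would leave the first two cores with none. A negative entry in the result would mean the assumed capacity is too small for the workload.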

In keeping with the NUMA architecture, the IRQs of each port of the Ethernet card that corresponds to the first CPU, 'CPU (0)', are allocated to the cores of 'CPU (0)'. This is because, when processing crosses CPUs, data arriving at the Ethernet port must be transferred over the bus from the memory of 'CPU (0)' to the memory space of 'CPU (1)', which degrades performance.
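For context, Linux exposes the NUMA node of a network device in `/sys/class/net/<iface>/device/numa_node` and the cores of a node in `/sys/devices/system/node/node<N>/cpulist`, the latter as a string such as "0-15,32-47". The sketch below only parses that string into core ids so that allocation can be restricted to the port's own node; the helper name `parse_cpulist` is hypothetical, it handles plain ranges and single ids only, and it is not taken from the patent.

```python
def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8,10-11" into core ids."""
    cores = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cores.extend(range(int(first), int(last) + 1))
        else:
            cores.append(int(part))
    return cores
```

The resulting list is the kind of per-CPU core set that a port attached to that node would cycle through.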

In addition, in the interrupt distribution method according to the embodiment, the receive queue (Rx queue) and the transmit queue (Tx queue) within the Ethernet driver are made to use the same queue, so that the same data is handled through a single queue and performance degradation is prevented.

The interrupt distribution method according to the embodiment can be implemented at the development stage by modifying source code within the Linux kernel. A computer-readable recording medium storing a program for implementing the method, and a computer program stored in a computer-readable recording medium, can of course also be implemented.

That is, the interrupt distribution method according to the embodiment may be implemented as computer-readable programming language code recorded on a computer-readable recording medium. The computer-readable recording medium may be any device that can be read by a computer and can store data, for example a ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical disk, hard disk drive, flash memory, or solid state disk (SSD). The computer-readable code or program stored on the recording medium may also be transmitted over a network connecting computers.

As described above, although preferred embodiments of the present invention have been described and illustrated using specific terms, such terms are used only to describe the present invention clearly. It is obvious that various changes and modifications can be made to the embodiments and the terms used without departing from the technical spirit and scope of the following claims. Such modified embodiments should not be understood separately from the spirit and scope of the present invention, and should be regarded as falling within the scope of the claims of the present invention.

CPU (0), CPU (1): Central processing unit
Core (0) Vector ~ Core (K) Vector: Core vector
Port (0) ~ Port (M): Ethernet port
IRQ (0) ~ IRQ (N): Interrupt request

Claims

1. An interrupt distribution method of a NUMA-based device for allocating a plurality of IRQs (interrupt requests) of each port in a NUMA (Non-Uniform Memory Access)-based device to a plurality of core vectors configured in each of a plurality of central processing units, the method comprising:
(a) allocating, by the NUMA-based device, the first IRQ in the IRQ list of a given port to the first core vector of the central processing unit to which that port is connected according to the NUMA environment;
(b) increasing, by the NUMA-based device, the IRQ number of the port by '1';
(c) increasing, by the NUMA-based device, the core vector number of the central processing unit by '1'; and
(d) allocating, by the NUMA-based device, the IRQ of the next number obtained in step (b) to the core vector of the next number obtained in step (c).

2. The method of claim 1, further comprising:
(e) comparing, by the NUMA-based device, the core vector number increased by '1' in step (c) with the number of the last core vector of the central processing unit; when the core vector number increased by '1' in step (c) does not exceed the number of the last core vector of the central processing unit, repeating steps (b) through (d) until the current IRQ number increased by '1' in step (b) exceeds the number of the last IRQ, thereby distributing the IRQs of the port; and, when the core vector number increased by '1' in step (c) exceeds the number of the last core vector of the central processing unit, returning to the first core vector of the central processing unit and then repeating steps (b) through (d) until the current IRQ number increased by '1' in step (b) exceeds the number of the last IRQ, thereby distributing the IRQs of the port.

3. An interrupt distribution method of a NUMA-based device for allocating a plurality of IRQs (interrupt requests) of each port in a NUMA (Non-Uniform Memory Access)-based device to a plurality of core vectors configured in each of a plurality of central processing units, the method comprising:
(a) allocating, by the NUMA-based device, the first IRQ in the IRQ list of a given port to the first core vector of the central processing unit to which that port is connected according to the NUMA environment;
(b) increasing, by the NUMA-based device, the IRQ number of the port by '1';
(c) comparing, by the NUMA-based device, the current IRQ number increased by '1' in step (b) with the number of the last IRQ of the port;
(d) increasing, by the NUMA-based device, the core vector number by '1' when the comparison of step (c) shows that the IRQ number increased by '1' in step (b) does not exceed the number of the last IRQ;
(e) comparing, by the NUMA-based device, the core vector number increased by '1' in step (d) with the number of the last core vector; and
(f) when the comparison of step (e) shows that the core vector number increased by '1' in step (d) does not exceed the number of the last core vector, allocating, by the NUMA-based device, the current IRQ obtained in step (b) to the current core vector obtained in step (d), and, when the core vector number increased by '1' in step (d) exceeds the number of the last core vector, returning to the first core vector of the central processing unit and allocating the current IRQ obtained in step (b) to that first core vector.

4. The method of claim 3, further comprising:
(g) returning, by the NUMA-based device, to step (b) after step (f) and repeating steps (b) through (f) until the comparison of step (c) shows that the current IRQ number increased by '1' in step (b) exceeds the number of the last IRQ.

5. The method of claim 4, further comprising:
(h) when the comparison of step (c) shows in step (d) that the IRQ number increased by '1' in step (b) exceeds the number of the last IRQ, increasing, by the NUMA-based device, the port number by '1' and processing the IRQs of the port thus selected in the same manner through steps (a) to (f).

6. A computer-readable recording medium storing a program for implementing the interrupt distribution method of a NUMA-based device according to any one of claims 1 to 5.

7. A computer program stored in a computer-readable recording medium for implementing the interrupt distribution method of a NUMA-based device according to any one of claims 1 to 5.