WO2023155785A1 - Network card configuration method, apparatus, device, and storage medium - Google Patents

Network card configuration method, apparatus, device, and storage medium

Info

Publication number: WO2023155785A1
Application number: PCT/CN2023/076039 (CN2023076039W)
Authority: WIPO (PCT)
Prior art keywords: die, network card, bound, queue, cpu
Other languages: English (en), French (fr)
Inventors: 梁颖欣, 叶志勇
Original assignee: 北京火山引擎科技有限公司
Priority date: 2022-02-16 (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Filing date: 2023-02-15
Publication date: 2023-08-24
Application filed by 北京火山引擎科技有限公司
Publication of WO2023155785A1

Classifications

    • G06F 9/5027 — Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, servers, terminals (G PHYSICS; G06 COMPUTING, CALCULATING OR COUNTING; G06F ELECTRIC DIGITAL DATA PROCESSING; G06F 9/00 Arrangements for program control, e.g. control units; G06F 9/06 using stored programs; G06F 9/46 Multiprogramming arrangements; G06F 9/50 Allocation of resources; G06F 9/5005 to service a request)
    • G06F 9/544 — Buffers; Shared memory; Pipes (G06F 9/46 Multiprogramming arrangements; G06F 9/54 Interprogram communication)
    • H04L 41/0803 — Configuration setting (H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION; H04L 41/00 Arrangements for maintenance, administration or management of data switching networks; H04L 41/08 Configuration management of networks or network elements)

Definitions

  • the present disclosure relates to the technical field of data storage, and in particular to a network card configuration method, apparatus, device, and storage medium.
  • the Linux interrupt-binding scheme on multi-core processor systems mainly uses the Irqbalance service. Specifically, the CPU that handles an interrupt request is selected from the central processing units (Central Processing Unit, CPU for short) that are currently idle in the system; that is, interrupt requests are randomly assigned to idle CPUs. The assignment of interrupt requests therefore has no directionality, which results in a low hit rate for cached data and degrades network card performance.
  • the embodiment of the present disclosure provides a network card configuration method that performs directed core binding for network card queues and binds the RPS/XPS of the interrupt number corresponding to a network card queue to the same Die as the queue itself. Since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
  • the present disclosure provides a network card configuration method, the method comprising:
  • determining a to-be-bound network card queue in a target network card, where the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
  • determining the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache;
  • binding the first to-be-bound network card queue to a CPU on the first Die in the Die set, where the first Die is any Die in the Die set;
  • in an optional implementation, before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
  • the binding of the first to-be-bound network card queue to the CPU on the first Die in the Die set includes:
  • before binding the receive-queue packet steering (RPS) of the first interrupt number and/or the transmit-queue packet steering (XPS) of the first interrupt number to the CPU on the first Die, the method further includes:
  • each CPU is divided into at least two hyper-threads HT, and the binding of the first to-be-bound network card queue to the CPU on the first Die in the Die set includes:
  • the binding of the RPS of the first interrupt number and/or the XPS of the first interrupt number to the CPU on the first Die includes:
  • before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
  • determining, as the first Die, the Die in the Die set on which the application program APP corresponding to the first to-be-bound network card queue runs.
  • before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
  • detecting whether the target network card has the Irqbalance service enabled, where the Irqbalance service is used to distribute interrupt requests based on interrupt load balancing across CPUs;
  • the present disclosure provides a network card configuration apparatus, the apparatus comprising:
  • a first determining module, configured to determine a to-be-bound network card queue in a target network card, where the to-be-bound network card queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of an interrupt request;
  • a second determining module, configured to determine the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node, where the die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache;
  • a first binding module, configured to bind the first to-be-bound network card queue to a CPU on the first Die in the Die set, where the first Die is any Die in the Die set;
  • a third determining module, configured to determine the interrupt number corresponding to the first to-be-bound network card queue as the first interrupt number;
  • a second binding module, configured to bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
  • the present disclosure provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium; when the instructions are run on a terminal device, the terminal device is caused to implement the above method.
  • the present disclosure provides a network card configuration device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor; when executing the computer program, the processor implements the above method.
  • the present disclosure provides a computer program product, where the computer program product includes a computer program/instructions, and when the computer program/instructions are executed by a processor, the above method is implemented.
  • An embodiment of the present disclosure provides a network card configuration method. First, the to-be-bound network card queues in a target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of an interrupt request. Then, the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache. The first to-be-bound network card queue is bound to a CPU on the first Die in the Die set.
  • the first Die can be any Die in the Die set.
  • further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to the CPU on the first Die. It can be seen that the embodiment of the present disclosure performs directed core binding for network card queues and binds the RPS/XPS of the interrupt number corresponding to a queue to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
  • FIG. 1 is a flowchart of a network card configuration method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a Die topology provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a network card configuration apparatus provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a network card configuration device provided by an embodiment of the present disclosure.
  • to improve the hit rate of cached data and thereby improve network card performance, the embodiment of the present disclosure provides a network card configuration method: first, determine the to-be-bound network card queues in the target network card, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request; then, determine the NUMA node corresponding to the target network card and the Die set corresponding to that node, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache; bind the first to-be-bound queue to a CPU on the first Die in the Die set, where the first Die can be any Die in the set; and finally determine the interrupt number corresponding to the first to-be-bound queue as the first interrupt number and bind the RPS and/or XPS of the first interrupt number to the CPU on the first Die.
  • the embodiment of the present disclosure thus performs directed core binding for network card queues and binds the RPS/XPS of the corresponding interrupt number to the same Die as the queue. Since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
  • an embodiment of the present disclosure provides a method for configuring a network card.
  • referring to FIG. 1, which is a flowchart of a network card configuration method provided by an embodiment of the present disclosure, the method includes:
  • S101: Determine the to-be-bound network card queues in the target network card.
  • the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of the interrupt request.
  • a hard interrupt is an electrical signal. When some event occurs on a device, an interrupt is generated and the electrical signal is sent to the interrupt controller over the bus; the interrupt controller then sends the signal to the central processing unit CPU, and the CPU immediately stops the running task, jumps to the entry address of the interrupt handler, and performs interrupt processing.
  • the interrupt sources that generate interrupt requests may include hardware devices, for example network cards, disks, keyboards, clocks, and the like.
  • the interrupt request signal generated by an interrupt source contains a specific identifier that enables the computer to determine which device issued the interrupt request; this identifier is the interrupt number, the code the system assigns to each interrupt source. The CPU uses the interrupt number to find the entry address of the interrupt handler and carry out interrupt processing.
  • a network card is a piece of computer hardware designed to allow computers to communicate on a computer network; network cards include single-queue and multi-queue cards, where a multi-queue network card is one that contains multiple network card queues.
  • the target network card is a multi-queue network card with a supported maximum number of queues, and the number of queues currently in use by the target network card may be smaller than that maximum.
  • the to-be-bound network card queues of the target network card may include all queues corresponding to the target network card, the queues currently in use, or a pre-specified subset of the queues currently in use.
  • the to-be-bound network card queues are determined from among the target network card's queues; for example, the 16 queues currently in use (such as queue0-queue15) can be determined as the to-be-bound network card queues of the target network card.
  • a to-be-bound network card queue may correspond to one interrupt number or to multiple interrupt numbers; the interrupt number is used to identify the type of the interrupt request IRQ.
  • if the number of to-be-bound queues is less than or equal to the number of available interrupt numbers, one to-be-bound queue can correspond to one interrupt number; if the number of to-be-bound queues is greater than the number of available interrupt numbers, one to-be-bound queue can correspond to multiple interrupt numbers.
  • for example, if the computer system includes 16 available interrupt numbers (such as IRQ0-IRQ15) and 14 to-be-bound queues (such as queue0-queue13) are determined in the target network card, then each to-be-bound queue can correspond to one interrupt number (for example, queue0 corresponds to IRQ0, queue13 corresponds to IRQ13, and so on).
  • S102: Determine the non-uniform memory access NUMA node corresponding to the target network card, and the die set corresponding to the NUMA node.
  • the Die set includes at least one Die, each Die includes multiple CPUs, and multiple CPUs share the same cache.
  • non-uniform memory access can divide a computer into multiple nodes, each node corresponding to one set of dies, and a set of dies can contain one or more Dies.
  • each Die can internally contain multiple CPUs; the Dies communicate with one another through an I/O Die, and the NUMA nodes are connected and exchange information through interconnect modules.
  • the CPUs on the same Die share the same cache (Cache), which is also called the L3 cache.
  • referring to FIG. 2, a schematic diagram of a Die topology provided by an embodiment of the present disclosure: NUMA divides the computer into two nodes (such as NUMA0 and NUMA1), and the Die set corresponding to each node contains 4 Dies.
  • the Die set corresponding to NUMA0 includes Die0-Die3,
  • and the Die set corresponding to NUMA1 includes Die4-Die7.
  • each Die contains 8 CPUs, the 8 CPUs in the same Die share one L3 cache, and the Dies communicate with one another through the I/O Die.
  • a target network card corresponds to one NUMA node, and after the NUMA node corresponding to the target network card is determined, the Die set corresponding to the NUMA node may be determined.
  • on a Linux system, the /sys/class/net/<dev>/device/numa_node interface can be used to determine the NUMA node corresponding to the target network card.
  • for example, if each Die set contains 4 Dies (the first Die set includes Die0-Die3; the second Die set includes Die4-Die7) and the above command determines that the NUMA node corresponding to the target network card is NUMA1, then the second Die set corresponding to NUMA1, which includes Die4-Die7, can be determined.
  • S103: Bind the first to-be-bound network card queue to a CPU on the first Die in the Die set.
  • the first Die is any Die in the Die set.
  • each to-be-bound network card queue of the target network card is bound to a CPU on a Die corresponding to the target network card, so that all interrupt requests corresponding to the same to-be-bound queue in the target network card can be dispatched to the CPU bound to that queue for processing.
  • the first to-be-bound network card queue may be any to-be-bound queue in the target network card,
  • and the first Die may be any Die in the Die set. According to the following steps A1-A3, the first to-be-bound network card queue can be bound to a CPU on the first Die in the Die set.
  • Step A1: determine the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the die set corresponding to the NUMA node.
  • for the target network card, the correspondence between the to-be-bound queues and the interrupt numbers is determined, from which the total number of interrupt numbers corresponding to those queues follows. For example, based on the example above, the to-be-bound queues of the target network card include 16 queues (such as queue0-queue15) and each corresponds to one interrupt number,
  • so the total number of interrupt numbers corresponding to the queues can be determined to be 16.
  • likewise, the NUMA node corresponding to the target network card and the Die set corresponding to that node can be determined, along with the number of Dies in the set. For example, based on the example above,
  • the NUMA node corresponding to the target network card is NUMA1,
  • and the Die set corresponding to NUMA1 includes 4 Dies (such as Die4-Die7).
  • Step A2: based on the total number of interrupt numbers and the number of Dies, group the to-be-bound network card queues in the target network card to obtain at least one group.
  • specifically, the quotient of the total number of interrupt numbers divided by the number of Dies can be taken as the number of interrupt numbers per group, and the to-be-bound queues corresponding to each group's interrupt numbers are placed in one group. For example, based on the example above, with a total of 16 interrupt numbers and 4 Dies, the to-be-bound queues of the target network card can be divided into 4 groups, each containing the to-be-bound queues corresponding to 4 interrupt numbers.
  • Step A3: Bind the to-be-bound network card queues included in the first group of the at least one group to CPUs on the first Die in the Die set.
  • the first group may be any group of to-be-bound queues, and the first group includes the first to-be-bound network card queue.
  • the to-be-bound queues in the same group are bound to CPUs on the same Die in the Die set; preferably, to-be-bound queues in different groups can be bound to different Dies in the Die set to achieve load balancing.
  • for example: bind the first group of to-be-bound queues (including queue0-queue3) to CPUs on Die4; bind the second group (including queue4-queue7) to CPUs on Die5; bind the third group (including queue8-queue11) to CPUs on Die6; bind the fourth group (including queue12-queue15) to CPUs on Die7; and so on.
  • the interrupt-request balancing Irqbalance service is used to optimize interrupt distribution: it periodically gathers statistics on the interrupt load balance of each CPU and redistributes interrupts accordingly.
  • before binding, it is first necessary to detect whether the Irqbalance service is enabled for the target network card; if it is determined that the target network card has the Irqbalance service enabled, the service needs to be shut down. On a Linux system, the Irqbalance service can be stopped with the systemctl stop irqbalance command. If it is not stopped, the Irqbalance service will automatically overwrite the core-binding parameters of the embodiments of the present disclosure, invalidating the configuration that binds the to-be-bound queues to the CPUs on the Dies in the Die set.
  • S104: Determine the interrupt number corresponding to the first to-be-bound network card queue as the first interrupt number.
  • the one or more interrupt numbers corresponding to the first to-be-bound queue can be determined, and the interrupt number corresponding to the first to-be-bound queue is taken as the first interrupt number.
  • for example, if the first to-be-bound queue is queue0 and queue0 corresponds to interrupt number IRQ0, IRQ0 can be used as the first interrupt number;
  • if the first to-be-bound queue is queue7 and queue7 corresponds to interrupt number IRQ7, IRQ7 can be used as the first interrupt number; and so on.
  • if the first to-be-bound queue corresponds to multiple interrupt numbers, the first interrupt number may include multiple numbers. For example, if the first to-be-bound queue is queue1 and queue1 corresponds to both interrupt number IRQ1 and interrupt number IRQ2, then IRQ1 and IRQ2 can both be used as the first interrupt number.
  • S105: Bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
  • one or both of the RPS (Receive Packet Steering) of the first interrupt number and the XPS (Transmit Packet Steering) of the first interrupt number are bound to the CPU on the first Die that was bound to the first to-be-bound queue in S103 above.
  • for example, if the first to-be-bound queue is queue0
  • and the first interrupt number is IRQ0,
  • and queue0's bound CPU resides on Die4, the first Die can be determined to be Die4;
  • the RPS and/or XPS of interrupt number IRQ0 is then bound to any CPU on Die4.
  • when binding, the choice of CPU on the first Die can be restricted further to ensure that the CPU that processes a send request and the CPU that sends the data packet out are the same CPU. First, the CPU bound to the first to-be-bound queue is determined as the first CPU, and then the RPS and/or XPS of the first interrupt number is bound to the first CPU.
  • binding the RPS and/or XPS of the first interrupt number to the first CPU ensures that the CPU that processes the send request and the CPU that sends the packet out are the same CPU.
  • each CPU can be divided into at least two hyper-threads HT, and the receive-queue packet steering RPS and/or transmit-queue packet steering XPS of the first interrupt number can then be bound to the CPU on the first Die according to the following steps B1-B2.
  • Step B1: bind the first to-be-bound network card queue to the first HT on the first Die in the Die set.
  • each CPU can be divided into at least two hyper-threads (HT, Hyper-Threading), where two or more hyper-threads HT in one CPU can run simultaneously.
  • for example, CPU0 on Die4 can be divided into two hyper-threads HT, namely CPU32 and CPU160,
  • and CPU1 on Die4 can likewise be divided into two hyper-threads HT, namely CPU34 and CPU162; and so on.
  • the first to-be-bound network card queue is bound to one of the hyper-threads HT (assumed to be the first HT) of some CPU on the first Die in the Die set.
  • for example, if the first to-be-bound queue is queue0
  • and the Die in the Die set corresponding to queue0 is Die4, where Die4 may include multiple CPUs (such as CPU0-CPU7),
  • then queue0 can be bound to one of the hyper-threads HT of CPU0 on Die4 (such as CPU32 or CPU160).
  • Step B2: bind the receive-queue packet steering RPS and/or transmit-queue packet steering XPS of the first interrupt number to the CPU where the first HT is located.
  • one or both of the RPS of the first interrupt number and the XPS of the first interrupt number are bound to the CPU where the first hyper-thread HT determined in step B1 is located.
  • preferably, the RPS and/or XPS of the first interrupt number may be bound to the other hyper-thread HT of the CPU where the first hyper-thread HT determined in step B1 is located.
  • for example, if the first to-be-bound queue queue0 is bound to one of the hyper-threads HT of CPU0 on Die4 (such as CPU32), the RPS and/or XPS of the first interrupt number can be bound to the other hyper-thread HT of CPU0 (such as CPU160). In this way, the to-be-bound network card queue and the RPS and/or XPS of its corresponding interrupt number are bound to the same CPU, thereby improving network card performance.
  • in the network card configuration method, first, the to-be-bound network card queues in the target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request; then, the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache; the first to-be-bound queue is bound to a CPU on the first Die in the Die set.
  • the first Die can be any Die in the Die set.
  • further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to the CPU on the first Die. It can be seen that the embodiment of the present disclosure performs directed core binding for network card queues and binds the RPS/XPS of the corresponding interrupt number to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
  • in an optional implementation, the to-be-bound network card queue can be bound to the CPU on the first Die in the Die set according to the following steps C1-C2.
  • Step C1: determine the Die in the Die set running the APP corresponding to the first to-be-bound network card queue as the first Die.
  • the Dies running the APP corresponding to the first to-be-bound queue may include one or more Dies.
  • for example, the determined NUMA node corresponding to the target network card is NUMA1,
  • and the determined Die set corresponding to NUMA1 is the second Die set, where the second Die set includes Die4-Die7.
  • RFS (Receive Flow Steering), the receive-side flow control mechanism,
  • can record in tables the CPUs running the APPs corresponding to the to-be-bound network card queues; based on the global socket flow table (rps_sock_flow_table) and the device flow table (rps_dev_flow_table),
  • the CPU used by the current flow may be determined among the one or more CPUs running the APP corresponding to the first to-be-bound queue, and the Die where that CPU is located is then determined to be the first Die.
  • for example, among the CPUs running the APP corresponding to the first to-be-bound queue (such as CPU8 and CPU16),
  • the flow tables determine that the CPU used by the current flow is CPU8; assuming the Die where CPU8 is located is Die5 and the Die where CPU16 is located is Die6, it can be determined that the first Die in the Die set running the APP corresponding to the first to-be-bound queue is Die5.
  • Step C2: bind the first to-be-bound network card queue to the CPU on the first Die.
  • in this step, after the first Die running the APP corresponding to the first to-be-bound queue is determined in the Die set, the first to-be-bound queue (such as queue0) is bound to the CPU (CPU8) on the first Die (such as Die5), and then the receive-queue packet steering RPS of the interrupt number corresponding to the first to-be-bound queue is bound to the CPU on the first Die.
  • for example, if the interrupt number corresponding to the first to-be-bound queue queue0 is IRQ0,
  • the receive-queue packet steering RPS corresponding to interrupt number IRQ0 is bound to CPU8 on Die5.
  • the embodiments of the present disclosure can ensure that the CPU where the APP corresponding to the to-be-bound queue runs and the CPU that performs the subsequent interrupt data processing after the hard interrupt are the same CPU, thereby making full use of the CPU cache and improving the hit rate of cached data.
  • the present disclosure also provides a network card configuration apparatus.
  • referring to FIG. 3, a schematic structural diagram of a network card configuration apparatus provided by an embodiment of the present disclosure,
  • the network card configuration apparatus 300 includes:
  • the first determining module 301, configured to determine a to-be-bound network card queue in the target network card, where the to-be-bound network card queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request;
  • the second determining module 302, configured to determine the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node, where the die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache;
  • the first binding module 303, configured to bind the first to-be-bound network card queue to a CPU on the first Die in the Die set, where the first Die is any Die in the Die set;
  • the third determining module 304, configured to determine the interrupt number corresponding to the first to-be-bound network card queue as the first interrupt number;
  • the second binding module 305, configured to bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
  • in an optional implementation, the apparatus further includes:
  • a third determining module, used to determine the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the die set corresponding to the NUMA node;
  • a grouping module, configured to group the to-be-bound network card queues in the target network card based on the total number of interrupt numbers and the number of Dies, to obtain at least one group;
  • correspondingly, the first binding module 303 includes:
  • a first binding submodule, configured to bind the to-be-bound network card queues included in the first group of the at least one group to CPUs on the first Die in the Die set; the first group includes the first to-be-bound network card queue.
  • in an optional implementation, the apparatus further includes:
  • a fourth determining module, used to determine the CPU on the first Die bound to the first to-be-bound network card queue as the first CPU;
  • a third binding module, configured to bind the RPS and/or XPS of the first interrupt number to the first CPU.
  • in an optional implementation, each CPU is divided into at least two hyper-threads HT, and the first binding module 303 includes:
  • a second binding submodule, configured to bind the first to-be-bound network card queue to the first HT on the first Die in the Die set;
  • correspondingly, the second binding module 305 includes:
  • a third binding submodule, configured to bind the RPS and/or XPS of the first interrupt number to the CPU where the first HT is located.
  • in an optional implementation, the apparatus further includes:
  • a fifth determining module, configured to determine, as the first Die, the Die in the Die set running the application program APP corresponding to the first to-be-bound network card queue.
  • in an optional implementation, the apparatus further includes:
  • a detection module, used to detect whether the target network card has the interrupt-request balancing Irqbalance service enabled, where the Irqbalance service is used to distribute interrupt requests based on interrupt load balancing across CPUs;
  • a shutdown module, configured to shut down the Irqbalance service if it is determined that the target network card has the Irqbalance service enabled.
  • in the network card configuration apparatus, first, the to-be-bound network card queues in the target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request; then, the non-uniform memory access NUMA node corresponding to the target network card and the die set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache; the first to-be-bound queue is bound to a CPU on the first Die in the Die set.
  • the first Die can be any Die in the Die set.
  • further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to the CPU on the first Die. It can be seen that the embodiment of the present disclosure performs directed core binding for network card queues and binds the RPS/XPS of the corresponding interrupt number to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
  • an embodiment of the present disclosure also provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium; when the instructions are run on a terminal device, the terminal device is caused to implement
  • the network card configuration method described in the embodiments of the present disclosure.
  • the embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program/instructions, and when the computer program/instructions are executed by a processor, the network card configuration method described in the embodiments of the present disclosure is implemented.
  • an embodiment of the present disclosure also provides a network card configuration device 400, as shown in FIG. 4, which may include:
  • a processor 401, a memory 402, an input device 403, and an output device 404.
  • the number of processors 401 in the network card configuration device may be one or more; one processor is taken as an example in FIG. 4.
  • the processor 401, the memory 402, the input device 403, and the output device 404 may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 4.
  • the memory 402 can be used to store software programs and modules, and the processor 401 executes the various functional applications and data processing of the network card configuration device by running the software programs and modules stored in the memory 402.
  • the memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, the application programs required by at least one function, and the like.
  • the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device or flash memory device, or other volatile solid-state storage devices.
  • the input device 403 can be used to receive input digit or character information and to generate signal inputs related to the user settings and function control of the network card configuration device.
  • the processor 401 loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to instructions, and runs the application programs stored in the memory 402, thereby implementing the various functions of the above network card configuration device.

Abstract

The present disclosure provides a network card configuration method, apparatus, device, and storage medium. The method includes: determining the to-be-bound network card queues in a target network card, where each to-be-bound queue corresponds to at least one interrupt number; determining the NUMA node corresponding to the target network card and the Die set corresponding to that node, where the multiple CPUs contained in each Die of the Die set share the same cache; binding a first to-be-bound network card queue to a CPU on a first Die in the Die set; and binding the RPS/XPS of the first interrupt number corresponding to the first to-be-bound queue to a CPU on the first Die. It can be seen that the present disclosure performs directed core binding for network card queues and binds the RPS/XPS of the interrupt number corresponding to a queue to the same Die as the queue itself; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.

Description

Network card configuration method, apparatus, device, and storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202210142679.2, entitled "Network card configuration method, apparatus, device, and storage medium" and filed on February 16, 2022, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the technical field of data storage, and in particular to a network card configuration method, apparatus, device, and storage medium.
BACKGROUND
With the rapid development of network technology, the application environments of network cards have become increasingly complex, and enterprise applications place ever higher demands on server network card performance. Network card performance determines overall server performance to a considerable extent, so providing enterprises with simple and effective means of tuning network card performance is a key focus of future network services.
In the current related art, the Linux interrupt-binding scheme on multi-core processor systems mainly uses the Irqbalance service. Specifically, the CPU that handles an interrupt request is selected from the central processing units (Central Processing Unit, CPU for short) that are currently idle in the system; that is, interrupt requests are randomly assigned to idle CPUs. The assignment of interrupt requests therefore has no directionality, which results in a low hit rate for cached data and degrades network card performance.
SUMMARY
To solve the above technical problem, embodiments of the present disclosure provide a network card configuration method that performs directed core binding for network card queues and binds the RPS/XPS of the interrupt number corresponding to a network card queue to the same Die as the queue itself. Since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
In a first aspect, the present disclosure provides a network card configuration method, the method including:
determining a to-be-bound network card queue in a target network card, where the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
determining the non-uniform memory access (NUMA) node corresponding to the target network card, and the die (Die) set corresponding to the NUMA node, where the Die set includes at least one Die, each Die contains multiple central processing units (CPUs), and the multiple CPUs share the same cache;
binding a first to-be-bound network card queue to a CPU on a first Die in the Die set, where the first Die is any Die in the Die set;
determining the interrupt number corresponding to the first to-be-bound network card queue as a first interrupt number; and
binding the receive-queue packet steering (RPS) of the first interrupt number and/or the transmit-queue packet steering (XPS) of the first interrupt number to a CPU on the first Die.
In an optional implementation, before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
determining the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the Die set corresponding to the NUMA node; and
grouping the to-be-bound network card queues in the target network card based on the total number of interrupt numbers and the number of Dies, to obtain at least one group;
correspondingly, binding the first to-be-bound network card queue to the CPU on the first Die in the Die set includes:
binding the to-be-bound network card queues included in a first group of the at least one group to CPUs on the first Die in the Die set, where the first group includes the first to-be-bound network card queue.
In an optional implementation, before binding the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die, the method further includes:
determining the CPU on the first Die bound to the first to-be-bound network card queue as a first CPU; and
binding the RPS and/or XPS of the first interrupt number to the first CPU.
In an optional implementation, each CPU is divided into at least two hyper-threads (HTs), and binding the first to-be-bound network card queue to the CPU on the first Die in the Die set includes:
binding the first to-be-bound network card queue to a first HT on the first Die in the Die set;
correspondingly, binding the RPS of the first interrupt number and/or the XPS of the first interrupt number to the CPU on the first Die includes:
binding the RPS and/or XPS of the first interrupt number to the CPU where the first HT is located.
In an optional implementation, before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
determining, as the first Die, the Die in the Die set running the application program APP corresponding to the first to-be-bound network card queue.
In an optional implementation, before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further includes:
detecting whether the target network card has the interrupt-request balancing Irqbalance service enabled, where the Irqbalance service is used to distribute interrupt requests based on interrupt load balancing across CPUs; and
if it is determined that the target network card has the Irqbalance service enabled, shutting down the Irqbalance service.
In a second aspect, the present disclosure provides a network card configuration apparatus, the apparatus including:
a first determining module, configured to determine a to-be-bound network card queue in a target network card, where the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
a second determining module, configured to determine the non-uniform memory access NUMA node corresponding to the target network card, and the die (Die) set corresponding to the NUMA node, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache;
a first binding module, configured to bind a first to-be-bound network card queue to a CPU on a first Die in the Die set, where the first Die is any Die in the Die set;
a third determining module, configured to determine the interrupt number corresponding to the first to-be-bound network card queue as a first interrupt number; and
a second binding module, configured to bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
In a third aspect, the present disclosure provides a computer-readable storage medium storing instructions that, when run on a terminal device, cause the terminal device to implement the above method.
In a fourth aspect, the present disclosure provides a network card configuration device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor; when executing the computer program, the processor implements the above method.
In a fifth aspect, the present disclosure provides a computer program product, where the computer program product includes a computer program/instructions, and when the computer program/instructions are executed by a processor, the above method is implemented.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have at least the following advantages:
An embodiment of the present disclosure provides a network card configuration method. First, the to-be-bound network card queues in a target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of an interrupt request. Then, the non-uniform memory access NUMA node corresponding to the target network card and the die (Die) set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache. A first to-be-bound network card queue is bound to a CPU on a first Die in the Die set, where the first Die may be any Die in the Die set. Further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to a CPU on the first Die. It can be seen that the embodiments of the present disclosure perform directed core binding for network card queues and bind the RPS/XPS of the corresponding interrupt number to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
To describe the technical solutions in the embodiments of the present disclosure or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Evidently, a person of ordinary skill in the art can derive other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a network card configuration method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a Die topology provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a network card configuration apparatus provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a network card configuration device provided by an embodiment of the present disclosure.
DETAILED DESCRIPTION
To make the above objects, features, and advantages of the present disclosure clearer, the solutions of the present disclosure are further described below. It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure may also be implemented in ways other than those described here; evidently, the embodiments in this specification are only some, not all, of the embodiments of the present disclosure.
With the rapid development of network technology, the application environments of network cards have become increasingly complex, and enterprise applications place ever higher demands on server network card performance. Network card performance determines overall server performance to a considerable extent, so providing enterprises with simple and effective means of tuning network card performance is a key focus of future network services.
In the current related art, the Linux interrupt-binding scheme on multi-core processor systems mainly uses the Irqbalance service. Specifically, the CPU that handles an interrupt request is selected from the central processing units (Central Processing Unit, CPU for short) that are currently idle in the system; that is, interrupt requests are randomly assigned to idle CPUs. The assignment of interrupt requests therefore has no directionality, which results in a low hit rate for cached data and degrades network card performance.
To improve the hit rate of cached data and thereby improve network card performance, an embodiment of the present disclosure provides a network card configuration method. First, the to-be-bound network card queues in a target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of an interrupt request. Then, the NUMA node corresponding to the target network card and the die (Die) set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache. A first to-be-bound network card queue is bound to a CPU on a first Die in the Die set, where the first Die may be any Die in the Die set. Further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to a CPU on the first Die. It can be seen that the embodiments of the present disclosure perform directed core binding for network card queues and bind the RPS/XPS of the interrupt number corresponding to a network card queue to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
On this basis, an embodiment of the present disclosure provides a network card configuration method. Referring to FIG. 1, a flowchart of a network card configuration method provided by an embodiment of the present disclosure, the method includes:
S101: Determine the to-be-bound network card queues in the target network card.
The to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of the interrupt request.
In practice, a hard interrupt is an electrical signal. When some event occurs on a device, the device generates an interrupt and sends the electrical signal to the interrupt controller over the bus; the interrupt controller then sends the signal to the central processing unit CPU, and the CPU immediately stops the running task, jumps to the entry address of the interrupt handler, and performs interrupt processing. The interrupt sources that generate interrupt requests (Interrupt ReQuest, IRQ for short) can include hardware devices, for example network cards, disks, keyboards, clocks, and the like. The interrupt request signal generated by an interrupt source contains a specific identifier that enables the computer to determine which device issued the interrupt request; this identifier is the interrupt number, the code the system assigns to each interrupt source. The CPU uses the interrupt number to find the entry address of the interrupt handler and carry out interrupt processing.
In practice, a network card is a piece of computer hardware designed to allow computers to communicate on a computer network; network cards include single-queue and multi-queue cards, where a multi-queue network card is one that contains multiple network card queues.
In the embodiments of the present disclosure, the target network card is a multi-queue network card with a supported maximum number of queues, and the number of queues currently in use by the target network card may be smaller than that maximum. The to-be-bound network card queues of the target network card may include all queues corresponding to the target network card, the queues currently in use, or a pre-specified subset of the queues currently in use.
In an optional implementation, the to-be-bound network card queues are determined from among the target network card's queues. On a Linux system, the ethtool command can show the maximum number of queues the target network card supports as well as the number currently in use; for example, the ethtool -l <dev> command (queue counts can be changed with ethtool -L <dev> combined <queue#>) may show that the target network card supports up to 63 queues, indicating that it is a multi-queue card, and that 16 queues are currently in use. The 16 queues currently in use (such as queue0-queue15) can then be determined as the to-be-bound network card queues of the target network card.
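As an illustrative sketch that is not part of the original disclosure, the queue inspection described above could be performed on a Linux host roughly as follows; the device name eth0 and the queue counts are assumptions:

    # Hypothetical device name; substitute the target network card.
    DEV=eth0
    # Show the maximum supported and currently used queue counts.
    ethtool -l "$DEV"
    # Optionally set the number of combined queues in use
    # (e.g. 16 queues, queue0-queue15, as in the example above).
    ethtool -L "$DEV" combined 16
    # List the interrupt numbers associated with the card's queues.
    grep "$DEV" /proc/interrupts

The last command also surfaces the queue-to-interrupt-number correspondence discussed in the following paragraphs.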
In the embodiments of the present disclosure, one to-be-bound network card queue may correspond to one interrupt number or to multiple interrupt numbers; the interrupt number is used to identify the type of the interrupt request IRQ. If the number of to-be-bound queues is less than or equal to the number of available interrupt numbers, one to-be-bound queue can correspond to one interrupt number; if the number of to-be-bound queues is greater than the number of available interrupt numbers, one to-be-bound queue can correspond to multiple interrupt numbers. For example, suppose the computer system includes 16 available interrupt numbers (such as IRQ0-IRQ15); if 14 to-be-bound queues (such as queue0-queue13) are determined in the target network card, then each to-be-bound queue can correspond to one interrupt number (for example, queue0 corresponds to IRQ0, queue13 corresponds to IRQ13, and so on).
S102: Determine the non-uniform memory access NUMA node corresponding to the target network card, and the die (Die) set corresponding to that NUMA node.
The Die set includes at least one Die, each Die contains multiple CPUs, and the multiple CPUs share the same cache.
In the embodiments of the present disclosure, non-uniform memory access (Non Uniform Memory Access, NUMA for short) can divide a computer into multiple nodes, each node corresponding to one Die set; a Die set can contain one or more Dies, and each Die can internally contain multiple CPUs. The Dies communicate with one another through an I/O Die, and the NUMA nodes are connected and exchange information through interconnect modules.
In the embodiments of the present disclosure, the CPUs on the same Die share the same cache (Cache), also called the L3 cache.
FIG. 2 is a schematic diagram of a Die topology provided by an embodiment of the present disclosure, in which NUMA divides the computer into two nodes (such as NUMA0 and NUMA1), and the Die set corresponding to each node contains 4 Dies: the Die set corresponding to NUMA0 contains Die0-Die3, and the Die set corresponding to NUMA1 contains Die4-Die7. Each Die internally contains 8 CPUs, the 8 CPUs within the same Die share one L3 cache, and the Dies communicate with one another through the I/O Die.
In the embodiments of the present disclosure, one target network card corresponds to one NUMA node, and after the NUMA node corresponding to the target network card is determined, the Die set corresponding to that NUMA node can be determined. On a Linux system, the NUMA node corresponding to the target network card can be determined through the /sys/class/net/<dev>/device/numa_node interface. For example, suppose the computer contains two NUMA nodes (such as NUMA0 and NUMA1), each corresponding to one Die set (such as a first and a second Die set) of 4 Dies each (the first Die set contains Die0-Die3; the second contains Die4-Die7); if the above command determines that the NUMA node corresponding to the target network card is NUMA1, then the second Die set corresponding to NUMA1, which includes Die4-Die7, can be determined.
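As a hedged sketch of this topology discovery (not part of the original disclosure; the device name and node number are assumptions):

    DEV=eth0   # hypothetical device name
    # NUMA node the network card is attached to (-1 means none reported).
    cat /sys/class/net/$DEV/device/numa_node
    # CPUs belonging to that NUMA node (node1 assumed per the example).
    cat /sys/devices/system/node/node1/cpulist
    # CPUs sharing one L3 cache with cpu0; on processors organized as in
    # FIG. 2, each such group corresponds to one Die of the Die set.
    cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list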
S103: Bind the first to-be-bound network card queue to a CPU on the first Die in the Die set.
The first Die is any Die in the Die set.
In the embodiments of the present disclosure, based on the to-be-bound queues determined in S101 and the Die set determined in S102, each to-be-bound queue of the target network card is bound to a CPU on a Die corresponding to the target network card, so that all interrupt requests corresponding to the same to-be-bound queue in the target network card can be dispatched to the CPU bound to that queue for processing.
In an optional implementation, the first to-be-bound network card queue may be any to-be-bound queue in the target network card, and the first Die may be any Die in the Die set. According to the following steps A1-A3, the first to-be-bound queue can be bound to a CPU on the first Die in the Die set.
Step A1: Determine the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the Die set corresponding to the NUMA node.
In this step, based on S101, the to-be-bound queues in the target network card and their correspondence with interrupt numbers can be determined, from which the total number of interrupt numbers corresponding to those queues follows. For example, based on the example above, the to-be-bound queues of the target network card include 16 queues (such as queue0-queue15) and each corresponds to one interrupt number, so the total number of interrupt numbers corresponding to the queues is 16.
In this step, based on S102, the NUMA node corresponding to the target network card and its Die set can be determined, along with the number of Dies in that set. For example, based on the example above, the NUMA node corresponding to the target network card is NUMA1, and the Die set corresponding to NUMA1 contains 4 Dies (such as Die4-Die7).
Step A2: Based on the total number of interrupt numbers and the number of Dies, group the to-be-bound network card queues in the target network card to obtain at least one group.
In this step, the to-be-bound queues in the target network card are grouped based on the total number of interrupt numbers and the number of Dies determined in step A1. Specifically, the quotient of the total number of interrupt numbers divided by the number of Dies can be taken as the number of interrupt numbers per group, and the to-be-bound queues corresponding to each group's interrupt numbers are placed in one group. For example, based on the example above, with a total of 16 interrupt numbers and 4 Dies, the to-be-bound queues of the target network card can be divided into 4 groups, each containing the to-be-bound queues corresponding to 4 interrupt numbers.
Step A3: Bind the to-be-bound network card queues included in the first group of the at least one group to CPUs on the first Die in the Die set.
The first group may be any group of to-be-bound queues, and the first group includes the first to-be-bound network card queue.
In this step, based on the grouping in step A2, the to-be-bound queues in the same group are bound to CPUs on the same Die in the Die set; preferably, queues in different groups can be bound to different Dies in the Die set to achieve load balancing. For example, based on the example above: the first group of to-be-bound queues (including queue0-queue3) is bound to CPUs on Die4; the second group (including queue4-queue7) to CPUs on Die5; the third group (including queue8-queue11) to CPUs on Die6; the fourth group (including queue12-queue15) to CPUs on Die7; and so on.
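A minimal sketch of such a group-to-Die binding, assuming hypothetical interrupt numbers 120-123 for queue0-queue3 and CPU0-CPU7 on Die4 as in the example (the real values would be read from /proc/interrupts and the cache topology files shown earlier):

    IRQS_GROUP1="120 121 122 123"   # assumed IRQs of queue0..queue3
    DIE4_CPUS="0-7"                 # assumed CPUs on Die4
    for irq in $IRQS_GROUP1; do
      # Steer each queue's hard interrupt to the CPUs of Die4.
      echo "$DIE4_CPUS" > /proc/irq/$irq/smp_affinity_list
    done

The remaining groups would be bound to Die5, Die6, and Die7 in the same way.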
In an optional implementation, before binding the to-be-bound network card queues to CPUs on the Dies in the Die set, it is first necessary to detect whether the interrupt-request balancing Irqbalance service is enabled for the target network card; if it is determined that the target network card has the Irqbalance service enabled, the Irqbalance service needs to be shut down so that the network card configuration provided by the embodiments of the present disclosure is not invalidated.
In practice, the interrupt-request balancing Irqbalance service is used to optimize interrupt distribution: it periodically gathers statistics on the interrupt load balance of each CPU and redistributes interrupts accordingly.
In the embodiments of the present disclosure, it must first be detected whether the Irqbalance service is enabled for the target network card, and if so, the service must be shut down. On a Linux system, the Irqbalance service can be stopped with the systemctl stop irqbalance command. If it is not stopped, the Irqbalance service will automatically overwrite the core-binding parameters of the embodiments of the present disclosure, invalidating the configuration that binds the to-be-bound queues to the CPUs on the Dies in the Die set.
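For illustration, a guarded shutdown could look as follows (a sketch assuming a systemd-based system, not part of the original disclosure):

    # Stop irqbalance first, or it will periodically overwrite the
    # manually configured affinities.
    if systemctl is-active --quiet irqbalance; then
      systemctl stop irqbalance
    fi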
S104: Determine the interrupt number corresponding to the first to-be-bound network card queue as the first interrupt number.
In the embodiments of the present disclosure, based on the first to-be-bound queue determined in S103 and the correspondence between to-be-bound queues and interrupt numbers, the one or more interrupt numbers corresponding to the first to-be-bound queue can be determined and taken as the first interrupt number. For example, based on the example above, if the first to-be-bound queue is queue0 and queue0 corresponds to interrupt number IRQ0, IRQ0 can be used as the first interrupt number; if the first to-be-bound queue is queue7 and queue7 corresponds to interrupt number IRQ7, IRQ7 can be used as the first interrupt number; and so on.
In the embodiments of the present disclosure, if the first to-be-bound queue corresponds to multiple interrupt numbers, the first interrupt number may include multiple numbers. For example, if the first to-be-bound queue is queue1 and queue1 corresponds to both interrupt number IRQ1 and interrupt number IRQ2, then IRQ1 and IRQ2 can both be used as the first interrupt number. When the RPS of the first interrupt number and/or the XPS of the first interrupt number is subsequently bound to the CPU on the first Die, the RPS and/or XPS of interrupt number IRQ1 must be bound to the CPU on the first Die, and the RPS and/or XPS of interrupt number IRQ2 must likewise be bound to the CPU on the first Die.
S105: Bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
In the embodiments of the present disclosure, one or both of the RPS (Receive Packet Steering) of the first interrupt number and the XPS (Transmit Packet Steering) of the first interrupt number are bound to the CPU on the first Die that was bound to the first to-be-bound queue in S103. For example, based on the example above, if the first to-be-bound queue is queue0, the first interrupt number is determined to be IRQ0; and since S103 determined that the first Die holding queue0's bound CPU is Die4, the RPS and/or XPS of the first interrupt number IRQ0 can be bound to any CPU on Die4. On a Linux system, RPS core binding can be performed by editing the /sys/class/net/<dev>/queues/rx-<n>/rps_cpus file, and XPS core binding by editing the /sys/class/net/<dev>/queues/tx-<n>/xps_cpus file.
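As an illustrative sketch of the RPS/XPS binding (not part of the original disclosure; the device name, queue index, and CPU mask are assumptions):

    DEV=eth0
    # Hex bitmask selecting CPU0-CPU7 of Die4 (bits 0..7 set).
    MASK=ff
    # Steer receive-side protocol processing (RPS) of rx queue 0 to Die4.
    echo "$MASK" > /sys/class/net/$DEV/queues/rx-0/rps_cpus
    # Steer transmit-side queue selection (XPS) of tx queue 0 to Die4.
    echo "$MASK" > /sys/class/net/$DEV/queues/tx-0/xps_cpus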
In an optional implementation, when the RPS of the first interrupt number and/or the XPS of the first interrupt number is bound to the CPU on the first Die, the choice of CPU on the first Die can be restricted further to ensure that the CPU that processes a send request and the CPU that sends the data packet out are the same CPU. First, the CPU bound to the first to-be-bound network card queue is determined as the first CPU; then, the RPS and/or XPS of the first interrupt number is bound to that first CPU.
In the embodiments of the present disclosure, after the first CPU bound to the first to-be-bound queue is determined, the RPS and/or XPS of the first interrupt number can be bound to that first CPU, ensuring that the CPU that processes the send request and the CPU that sends the packet out are the same CPU.
In an optional implementation, each CPU can be divided into at least two hyper-threads HT; the receive-queue packet steering RPS and/or transmit-queue packet steering XPS of the first interrupt number can then be bound to the CPU on the first Die according to the following steps B1-B2.
Step B1: Bind the first to-be-bound network card queue to the first HT on the first Die in the Die set.
In this step, each CPU can be divided into at least two hyper-threads (HT, Hyper-Threading), where two or more hyper-threads HT in one CPU can run simultaneously. For example, CPU0 on Die4 can be divided into two hyper-threads HT, namely CPU32 and CPU160; CPU1 on Die4 can likewise be divided into two hyper-threads HT, namely CPU34 and CPU162; and so on.
In this step, the first to-be-bound network card queue is bound to one of the hyper-threads HT (assumed to be the first HT) of some CPU on the first Die in the Die set. For example, based on the example above, if the first to-be-bound queue is queue0 and the Die in the Die set corresponding to queue0 is Die4, where Die4 may include multiple CPUs (such as CPU0-CPU7), then queue0 can be bound to one of the hyper-threads HT of CPU0 on Die4 (such as CPU32 or CPU160).
Step B2: Bind the receive-queue packet steering RPS and/or transmit-queue packet steering XPS of the first interrupt number to the CPU where the first HT is located.
In this step, one or both of the RPS of the first interrupt number and the XPS of the first interrupt number are bound to the CPU where the first hyper-thread HT determined in step B1 is located; preferably, if a CPU corresponds to two hyper-threads HT, the RPS and/or XPS of the first interrupt number can be bound to the other hyper-thread HT of that CPU. For example, based on the example above, if the first to-be-bound queue queue0 is bound to one hyper-thread HT of CPU0 on Die4 (such as CPU32), the RPS and/or XPS of the first interrupt number can be bound to the other hyper-thread HT of CPU0 (such as CPU160), so that the to-be-bound network card queue and the RPS and/or XPS of its corresponding interrupt number are bound to the same CPU, thereby improving network card performance.
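A sketch of this sibling-hyper-thread split under the numbering assumed in the example (CPU32 and CPU160 as the two HTs of CPU0 on Die4; the IRQ number 120 is hypothetical):

    # Verify the sibling pair; prints "32,160" under this assumption.
    cat /sys/devices/system/cpu/cpu32/topology/thread_siblings_list
    # Pin queue0's interrupt to HT cpu32 ...
    echo 32 > /proc/irq/120/smp_affinity_list
    # ... and its RPS to the sibling HT cpu160: bit 160 falls in the
    # leftmost of six 32-bit words in the comma-separated mask format.
    echo "1,00000000,00000000,00000000,00000000,00000000" \
      > /sys/class/net/eth0/queues/rx-0/rps_cpus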
In the network card configuration method provided by the embodiments of the present disclosure, first, the to-be-bound network card queues in the target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request; then, the non-uniform memory access NUMA node corresponding to the target network card and the die (Die) set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache; the first to-be-bound queue is bound to a CPU on the first Die in the Die set, where the first Die may be any Die in the Die set; further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to the CPU on the first Die. It can be seen that the embodiments of the present disclosure perform directed core binding for network card queues and bind the RPS/XPS of the corresponding interrupt number to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
When the to-be-bound network card queue is bound to the CPU on the first Die in the Die set as in the above embodiments, no account is taken of which CPU runs the application program APP corresponding to the queue; the CPU bound to the to-be-bound queue may therefore not be the same CPU as the one where the queue's APP runs, which prevents full use of the cache, lowers the cache hit rate, and in turn affects network card performance.
In an optional implementation, the to-be-bound network card queue can be bound to the CPU on the first Die in the Die set according to the following steps C1-C2.
Step C1: Determine the Die in the Die set running the APP corresponding to the first to-be-bound network card queue as the first Die.
In this step, after the NUMA node corresponding to the target network card and the Die set corresponding to that node are determined per S102, the Die in that set holding the CPU that runs the APP corresponding to the first to-be-bound queue is determined as the first Die; the Dies running the APP corresponding to the first to-be-bound queue may include one or more Dies.
For example, suppose that per S102 the determined NUMA node corresponding to the target network card is NUMA1 and the determined Die set corresponding to NUMA1 is the second Die set, which contains Die4-Die7. On a Linux system, RFS (Receive Flow Steering) can record in tables the CPUs running the APPs corresponding to the to-be-bound queues; based on the global socket flow table (rps_sock_flow_table) and the device flow table (rps_dev_flow_table), the CPU used by the current flow can be determined among the one or more CPUs running the APP corresponding to the first to-be-bound queue, and the Die where that CPU is located is then determined to be the first Die. For example, if among the CPUs running the APP corresponding to the first to-be-bound queue (such as CPU8 and CPU16) the flow tables determine that the CPU used by the current flow is CPU8, and the Die where CPU8 is located is Die5 while the Die where CPU16 is located is Die6, then the first Die in the Die set running the APP corresponding to the first to-be-bound queue is determined to be Die5.
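The RFS side could be enabled roughly as follows (a sketch; the table sizes are common example values, not values from the disclosure):

    # Size the global socket flow table (rps_sock_flow_table).
    echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
    # Give each rx queue its share of flow entries; with the 16 queues
    # of the running example, 32768 / 16 = 2048 per queue.
    for rx in /sys/class/net/eth0/queues/rx-*; do
      echo 2048 > "$rx/rps_flow_cnt"
    done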
Step C2: Bind the first to-be-bound network card queue to the CPU on the first Die.
In this step, after the first Die running the APP corresponding to the first to-be-bound queue is determined in the Die set, the first to-be-bound queue (such as queue0) is bound to the CPU (CPU8) on the first Die (such as Die5); then the receive-queue packet steering RPS of the interrupt number corresponding to the first to-be-bound queue is bound to the CPU on the first Die. For example, if the interrupt number corresponding to the first to-be-bound queue queue0 is IRQ0, the receive-queue packet steering RPS corresponding to interrupt number IRQ0 is bound to CPU8 on Die5.
The embodiments of the present disclosure can thus ensure that the CPU where the APP corresponding to the to-be-bound queue runs and the CPU that performs the subsequent interrupt data processing after the hard interrupt are the same CPU, thereby making full use of the CPU cache and improving the hit rate of cached data.
Based on the same inventive concept as the above embodiments, the present disclosure also provides a network card configuration apparatus. Referring to FIG. 3, a schematic structural diagram of a network card configuration apparatus provided by an embodiment of the present disclosure, the network card configuration apparatus 300 includes:
a first determining module 301, configured to determine a to-be-bound network card queue in a target network card, where the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
a second determining module 302, configured to determine the non-uniform memory access NUMA node corresponding to the target network card, and the die (Die) set corresponding to the NUMA node, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache;
a first binding module 303, configured to bind a first to-be-bound network card queue to a CPU on a first Die in the Die set, where the first Die is any Die in the Die set;
a third determining module 304, configured to determine the interrupt number corresponding to the first to-be-bound network card queue as a first interrupt number;
a second binding module 305, configured to bind the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number to the CPU on the first Die.
In an optional implementation, the apparatus further includes:
a third determining module, used to determine the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the Die set corresponding to the NUMA node;
a grouping module, used to group the to-be-bound network card queues in the target network card based on the total number of interrupt numbers and the number of Dies, to obtain at least one group;
correspondingly, the first binding module 303 includes:
a first binding submodule, used to bind the to-be-bound network card queues included in the first group of the at least one group to CPUs on the first Die in the Die set, where the first group includes the first to-be-bound network card queue.
In an optional implementation, the apparatus further includes:
a fourth determining module, used to determine the CPU on the first Die bound to the first to-be-bound network card queue as the first CPU;
a third binding module, used to bind the RPS and/or XPS of the first interrupt number to the first CPU.
In an optional implementation, each CPU is divided into at least two hyper-threads HT, and the first binding module 303 includes:
a second binding submodule, used to bind the first to-be-bound network card queue to the first HT on the first Die in the Die set;
correspondingly, the second binding module 305 includes:
a third binding submodule, used to bind the RPS and/or XPS of the first interrupt number to the CPU where the first HT is located.
In an optional implementation, the apparatus further includes:
a fifth determining module, used to determine, as the first Die, the Die in the Die set running the application program APP corresponding to the first to-be-bound network card queue.
In an optional implementation, the apparatus further includes:
a detection module, used to detect whether the target network card has the interrupt-request balancing Irqbalance service enabled, where the Irqbalance service is used to distribute interrupt requests based on interrupt load balancing across CPUs;
a shutdown module, used to shut down the Irqbalance service if it is determined that the target network card has the Irqbalance service enabled.
In the network card configuration apparatus provided by the embodiments of the present disclosure, first, the to-be-bound network card queues in the target network card are determined, where each to-be-bound queue corresponds to at least one interrupt number and the interrupt number is used to identify the type of the interrupt request; then, the NUMA node corresponding to the target network card and the Die set corresponding to the NUMA node are determined, where the Die set includes at least one Die, each Die contains multiple central processing unit CPUs, and the multiple CPUs share the same cache; the first to-be-bound queue is bound to a CPU on the first Die in the Die set, where the first Die may be any Die in the Die set; further, the interrupt number corresponding to the first to-be-bound queue is determined as the first interrupt number, and the receive-queue packet steering RPS of the first interrupt number and/or the transmit-queue packet steering XPS of the first interrupt number is bound to the CPU on the first Die. It can be seen that the embodiments of the present disclosure perform directed core binding for network card queues and bind the RPS/XPS of the corresponding interrupt number to the same Die as the queue; since the CPUs on the same Die share a cache, the cache hit rate can be improved, thereby improving overall network card performance.
In addition to the above method and apparatus, an embodiment of the present disclosure also provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium; when the instructions are run on a terminal device, the terminal device is caused to implement the network card configuration method described in the embodiments of the present disclosure.
An embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program/instructions; when the computer program/instructions are executed by a processor, the network card configuration method described in the embodiments of the present disclosure is implemented.
In addition, an embodiment of the present disclosure also provides a network card configuration device 400, as shown in FIG. 4, which may include:
a processor 401, a memory 402, an input apparatus 403, and an output apparatus 404. The number of processors 401 in the network card configuration device may be one or more; one processor is taken as an example in FIG. 4. In some embodiments of the present disclosure, the processor 401, the memory 402, the input apparatus 403, and the output apparatus 404 may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 4.
The memory 402 can be used to store software programs and modules, and the processor 401 executes the various functional applications and data processing of the network card configuration device by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, the application programs required by at least one function, and the like. In addition, the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device or flash memory device, or other volatile solid-state storage devices. The input apparatus 403 can be used to receive input digit or character information and to generate signal inputs related to the user settings and function control of the network card configuration device.
Specifically, in this embodiment, the processor 401 loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to instructions, and runs the application programs stored in the memory 402, thereby implementing the various functions of the above network card configuration device.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are only specific embodiments of the present disclosure, enabling those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not limited to the embodiments described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A network card configuration method, characterized in that the method comprises:
    determining a to-be-bound network card queue in a target network card, wherein the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
    determining a non-uniform memory access (NUMA) node corresponding to the target network card, and a die (Die) set corresponding to the NUMA node, wherein the Die set comprises at least one Die, each Die contains multiple central processing units (CPUs), and the multiple CPUs share the same cache;
    binding a first to-be-bound network card queue to a CPU on a first Die in the Die set, wherein the first Die is any Die in the Die set;
    determining the interrupt number corresponding to the first to-be-bound network card queue as a first interrupt number; and
    binding the receive-queue packet steering (RPS) of the first interrupt number and/or the transmit-queue packet steering (XPS) of the first interrupt number to a CPU on the first Die.
  2. The method according to claim 1, characterized in that before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further comprises:
    determining the total number of interrupt numbers corresponding to the to-be-bound network card queues in the target network card, and the number of Dies in the Die set corresponding to the NUMA node; and
    grouping the to-be-bound network card queues in the target network card based on the total number of interrupt numbers and the number of Dies, to obtain at least one group;
    correspondingly, binding the first to-be-bound network card queue to the CPU on the first Die in the Die set comprises:
    binding the to-be-bound network card queues included in a first group of the at least one group to CPUs on the first Die in the Die set, wherein the first group includes the first to-be-bound network card queue.
  3. The method according to claim 1, characterized in that before binding the RPS of the first interrupt number and/or the XPS of the first interrupt number to the CPU on the first Die, the method further comprises:
    determining the CPU on the first Die bound to the first to-be-bound network card queue as a first CPU; and
    binding the RPS and/or XPS of the first interrupt number to the first CPU.
  4. The method according to claim 1, characterized in that each CPU is divided into at least two hyper-threads (HTs), and binding the first to-be-bound network card queue to the CPU on the first Die in the Die set comprises:
    binding the first to-be-bound network card queue to a first HT on the first Die in the Die set;
    correspondingly, binding the RPS of the first interrupt number and/or the XPS of the first interrupt number to the CPU on the first Die comprises:
    binding the RPS and/or XPS of the first interrupt number to the CPU where the first HT is located.
  5. The method according to claim 1, characterized in that before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further comprises:
    determining, as the first Die, the Die in the Die set running the application program (APP) corresponding to the first to-be-bound network card queue.
  6. The method according to claim 1, characterized in that before binding the first to-be-bound network card queue to the CPU on the first Die in the Die set, the method further comprises:
    detecting whether the target network card has the interrupt-request balancing Irqbalance service enabled, wherein the Irqbalance service is used to distribute interrupt requests based on interrupt load balancing across CPUs; and
    if it is determined that the target network card has the Irqbalance service enabled, shutting down the Irqbalance service.
  7. A network card configuration apparatus, characterized in that the apparatus comprises:
    a first determining module, configured to determine a to-be-bound network card queue in a target network card, wherein the to-be-bound network card queue corresponds to at least one interrupt number, and the interrupt number is used to identify the type of an interrupt request;
    a second determining module, configured to determine a non-uniform memory access (NUMA) node corresponding to the target network card, and a die (Die) set corresponding to the NUMA node, wherein the Die set comprises at least one Die, each Die contains multiple central processing units (CPUs), and the multiple CPUs share the same cache;
    a first binding module, configured to bind a first to-be-bound network card queue to a CPU on a first Die in the Die set, wherein the first Die is any Die in the Die set;
    a third determining module, configured to determine the interrupt number corresponding to the first to-be-bound network card queue as a first interrupt number;
    a second binding module, configured to bind the receive-queue packet steering (RPS) of the first interrupt number and/or the transmit-queue packet steering (XPS) of the first interrupt number to the CPU on the first Die.
  8. A computer-readable storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to implement the method according to any one of claims 1-6.
  9. A network card configuration device, characterized by comprising: a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor implements the method according to any one of claims 1-6 when executing the computer program.
  10. A computer program product, characterized in that the computer program product comprises a computer program/instructions, and when the computer program/instructions are executed by a processor, the method according to any one of claims 1-6 is implemented.
PCT/CN2023/076039 2022-02-16 2023-02-15 Network card configuration method, apparatus, device, and storage medium WO2023155785A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210142679.2 2022-02-16
CN202210142679.2A CN114490085B (zh) 2022-02-16 2022-02-16 Network card configuration method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023155785A1 2023-08-24

Family

ID: 81479638

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/076039 WO2023155785A1 (zh) 2022-02-16 2023-02-15 Network card configuration method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN114490085B (zh)
WO (1) WO2023155785A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490085B (zh) 2022-02-16 2023-09-19 北京火山引擎科技有限公司 Network card configuration method, apparatus, device, and storage medium
CN115473757A (zh) 2022-09-29 2022-12-13 展讯通信(上海)有限公司 Dynamic network card driver management system, method, apparatus, and device for an intelligent terminal
CN117112044B (zh) 2023-10-23 2024-02-06 腾讯科技(深圳)有限公司 Network-card-based instruction processing method, apparatus, device, and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150339168A1 (en) * 2014-05-23 2015-11-26 Osr Open Systems Resources, Inc. Work queue thread balancing
CN109840147A (zh) 2019-01-21 2019-06-04 郑州云海信息技术有限公司 Method and system for implementing multi-queue network card CPU binding
CN111813547A (zh) 2020-06-30 2020-10-23 武汉虹旭信息技术有限责任公司 DPDK-based data packet processing method and apparatus
CN112306693A (zh) 2020-11-18 2021-02-02 支付宝(杭州)信息技术有限公司 Data packet processing method and device
CN114490085A (zh) 2022-02-16 2022-05-13 北京火山引擎科技有限公司 Network card configuration method, apparatus, device, and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3206339B1 (en) * 2014-10-31 2018-12-12 Huawei Technologies Co., Ltd. Network card configuration method and resource management center
CN106101019A (zh) 2016-06-22 2016-11-09 浪潮电子信息产业股份有限公司 Multi-queue network card performance tuning method based on interrupt binding
CN112363833B (zh) 2020-11-10 2023-01-31 海光信息技术股份有限公司 Memory allocation method and apparatus for network data packets, and related device


Also Published As

Publication number Publication date
CN114490085A (zh) 2022-05-13
CN114490085B (zh) 2023-09-19

Similar Documents

Publication Publication Date Title
WO2023155785A1 (zh) Network card configuration method, apparatus, device, and storage medium
US10884799B2 (en) Multi-core processor in storage system executing dynamic thread for increased core availability
US6336177B1 (en) Method, system and computer program product for managing memory in a non-uniform memory access system
US9710408B2 (en) Source core interrupt steering
US10120586B1 (en) Memory transaction with reduced latency
US5991797A (en) Method for directing I/O transactions between an I/O device and a memory
US6711643B2 (en) Method and apparatus for interrupt redirection for arm processors
WO2018035856A1 (zh) Method, device, and system for implementing hardware-accelerated processing
US8478926B1 (en) Co-processing acceleration method, apparatus, and system
US6249830B1 (en) Method and apparatus for distributing interrupts in a scalable symmetric multiprocessor system without changing the bus width or bus protocol
US20180225155A1 (en) Workload optimization system
CN108900626B (zh) Data storage method, apparatus, and system in a cloud environment
EP2618257B1 (en) Scalable sockets
US11449456B2 (en) System and method for scheduling sharable PCIe endpoint devices
WO2009123492A1 (en) Optimizing memory copy routine selection for message passing in a multicore architecture
Wasi-ur-Rahman et al. A comprehensive study of MapReduce over lustre for intermediate data placement and shuffle strategies on HPC clusters
US20190007483A1 (en) Server architecture having dedicated compute resources for processing infrastructure-related workloads
WO2024082985A1 (zh) Offload card with an installed accelerator
US10289306B1 (en) Data storage system with core-affined thread processing of data movement requests
Trivedi et al. RStore: A direct-access DRAM-based data store
CN110447019B (zh) Memory allocation manager and method performed thereby for managing memory allocation
Li et al. Improving spark performance with zero-copy buffer management and RDMA
CN114281516A (zh) NUMA-attribute-based resource allocation method and apparatus
JP2780662B2 (ja) Multiprocessor system
WO2024027395A1 (zh) Data processing method and apparatus

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 23755787; Country of ref document: EP; Kind code of ref document: A1)