CN114124589A - SOC intelligent network card and task scheduling method - Google Patents


Info

Publication number
CN114124589A
CN114124589A (application number CN202111281620.3A)
Authority
CN
China
Prior art keywords
scheduling
task
core unit
fcfs
dwrr
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202111281620.3A
Other languages
Chinese (zh)
Inventor
温强
Current Assignee (may be inaccurate)
Beijing Weilang Technology Co ltd
Original Assignee
Beijing Weilang Technology Co ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Beijing Weilang Technology Co ltd filed Critical Beijing Weilang Technology Co ltd
Priority to CN202111281620.3A priority Critical patent/CN114124589A/en
Publication of CN114124589A publication Critical patent/CN114124589A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00: Data switching networks
    • H04L12/02: Details


Abstract

The invention discloses an SOC intelligent network card and a task scheduling method. The SOC intelligent network card comprises a host communication module and a hybrid scheduling module. The host communication module communicates with a host server to obtain task requests. The hybrid scheduling module comprises an FCFS scheduling core unit, a DWRR scheduling core unit, a monitoring unit and a scheduling core selection unit. The FCFS scheduling core unit receives and executes task requests. The DWRR scheduling core unit receives a task from the FCFS scheduling core unit when the tail delay of that task on the FCFS scheduling core unit exceeds a first tail delay threshold. The monitoring unit detects whether a task's tail delay exceeds the first tail delay threshold or falls below a second tail delay threshold. The scheduling core selection unit assigns tasks whose tail delay exceeds the first tail delay threshold to the DWRR scheduling core unit, and tasks whose tail delay falls below the second tail delay threshold to the FCFS scheduling core unit.

Description

SOC intelligent network card and task scheduling method
Technical Field
The invention relates to the technical field of intelligent network card application, in particular to an SOC intelligent network card and a task scheduling method.
Background
Data center servers (host servers) now typically host a wide variety of applications, especially distributed applications and competing multi-tenant applications. These applications offload different kinds of work and have different computational requirements. More importantly, their execution behavior on the computing modules also differs: the execution time of different offloaded tasks can vary by an order of magnitude, as can the number of cycles the host server spends on them. Disordered resource sharing in the data center, interference among tenants and bursty load make long tail delay severe, which seriously degrades user experience.
In addition, the tail latency of Remote Procedure Calls (RPCs) differs across applications. High tail latency has two main causes: first, the memory and cache hierarchy lies on the critical path, so memory accesses of different programs interfere with each other and compete for resources; second, sub-optimal scheduling.
In the prior art, network functions are offloaded to FPGA-based intelligent network cards, as in ClickNP and Amazon's cloud. These solutions mainly apply traditional domain-specific acceleration to offload some host-server applications to the FPGA for execution. Although such applications exhibit enough parallelism and determinism to benefit from customized logic on the FPGA, applications with complex data structures and algorithms cannot be realized on an FPGA-based intelligent network card, and implementing them is slow and difficult.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art.
In order to solve the above technical problem, the technical solution adopted by the present invention is to provide an SOC intelligent network card, including:
the host communication module is used for communicating with the host server to acquire the task request;
the hybrid scheduling module comprises an FCFS scheduling core unit, a DWRR scheduling core unit, a monitoring unit and a scheduling core selection unit; the FCFS scheduling core unit is used for receiving and executing the task request; the DWRR scheduling core unit is used for receiving a task from the FCFS scheduling core unit and generating a DWRR scheduling queue when the tail delay of the task executed on the FCFS scheduling core unit exceeds a first tail delay threshold; the monitoring unit is configured to detect whether the tail delay of a task executing on the FCFS scheduling core unit exceeds the first tail delay threshold and whether the tail delay of a task executing on the DWRR scheduling core unit falls below a second tail delay threshold; the scheduling core selection unit is configured to move tasks whose tail delay on the FCFS scheduling core unit exceeds the first tail delay threshold to the DWRR scheduling core unit, and tasks whose tail delay on the DWRR scheduling core unit falls below the second tail delay threshold to the FCFS scheduling core unit.
In the above apparatus, the first tail delay threshold is preset on the FCFS scheduling core unit. It is a statistic of the tail delays of tasks executed by an existing intelligent network card under the FCFS scheduling algorithm; modeling those delays as Gaussian, the threshold is taken at μ + 3σ.
In the above apparatus, the second tail delay threshold is preset on the DWRR scheduling core unit. It is the corresponding μ + 3σ statistic of the tail delays of tasks executed by an existing intelligent network card under the DWRR scheduling algorithm.
In the above apparatus, an average request delay threshold is preset on the FCFS scheduling core unit. When the monitoring unit detects that the average request delay of the FCFS scheduling core unit's work exceeds this threshold, the host communication module migrates the task with the highest load ratio on the intelligent network card to the host server; when the average request delay falls below the threshold, some load is pulled from the host server to the intelligent network card through the host communication module.
In the above apparatus, the average request delay threshold refers to an average value of request delays when all the computation cores on the smart network card process different tasks.
In the above apparatus, the DWRR scheduling core unit is preset with a deficit delay threshold, and when the monitoring unit detects that a deficit counter of a task in the DWRR scheduling queue is greater than the deficit delay threshold, the task is preferentially run.
In the above apparatus, the deficit delay threshold is the value of the deficit counter when the task delay reaches (1 - α) × the second tail delay threshold, where α is a hysteresis factor.
The invention also provides a task scheduling method, which is applied to the SOC intelligent network card, wherein the SOC intelligent network card comprises a host communication module and a hybrid scheduling module, and the hybrid scheduling module comprises an FCFS scheduling core unit, a DWRR scheduling core unit, a monitoring unit and a scheduling core selection unit;
the host communication module acquires a task request from a host server;
the host communication module sends the task request to the FCFS dispatching core unit, and a first tail delay threshold value is preset on the FCFS dispatching core unit;
when the monitoring unit detects that the tail delay of the task executed on the FCFS scheduling core unit exceeds the first tail delay threshold, the scheduling core selecting unit allocates the task to the DWRR scheduling core unit for execution, and generates a DWRR scheduling queue on the DWRR scheduling core unit, wherein a second tail delay threshold is preset on the DWRR scheduling core unit;
when the monitoring unit detects that the tail delay of the task executed on the DWRR scheduling core unit is lower than the second tail delay threshold value, the scheduling core selecting unit allocates the task to the FCFS scheduling core unit for execution.
In the method, an average request delay threshold is preset on the FCFS scheduling core unit. When the monitoring unit detects that the average request delay of the FCFS scheduling core unit's work exceeds this threshold, the host communication module migrates the task with the highest load ratio on the intelligent network card to the host server; when the average request delay falls below the threshold, some load is pulled from the host server to the intelligent network card through the host communication module.
In the method, a deficit delay threshold is preset on the DWRR scheduling core unit, and when the monitoring unit detects that a deficit counter of a task in the DWRR scheduling queue is greater than the deficit delay threshold, the task is preferentially run.
According to the technical solution provided by the application, at least the following beneficial effects are obtained. With the SOC intelligent network card of the present application, the monitoring unit in the hybrid scheduling module tracks task execution and the related delay information in real time; based on that information, the scheduling core selection unit assigns each task to the FCFS scheduling core unit or the DWRR scheduling core unit for execution. This exploits the strengths of both units and saves task scheduling time. Especially for tasks with variable execution cost, it maximizes the utilization of the intelligent network card's computing resources and preserves computing efficiency, without increasing tail delay or impairing the card's traffic-forwarding capacity.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a block diagram of a structure of an SOC intelligent network card provided in the embodiment of the present application;
fig. 2 is a flowchart of a task scheduling method according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The terms appearing in the embodiments of the present application are explained below:
the high variance task comprises the following steps: meaning that the response times of the application services are more diverse.
Low variance tasks: the application service has low response time dispersion and small difference.
FCFS scheduling core unit: performing FCFS algorithm (First Come First serve algorithm)
DWRR dispatching core unit: performing DWRR algorithm (differential Weighted Round Robin, differential weight polling algorithm)
A deficit counter: i.e., the down counter, is given an initial value, the count is gradually decremented.
Tail delay: the delay of a small number of responses in the system is higher than the delay of the mean, i.e. a high delay that is rarely seen by the client.
The application provides an SOC intelligent network card and a task scheduling method, aiming to solve the low computing-resource utilization and high tail delay of existing intelligent network cards.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The embodiment of the first aspect of the present application provides an SOC intelligent network card, where a hybrid scheduling module is added to an existing SOC intelligent network card architecture, as shown in fig. 1, the SOC intelligent network card architecture in the present application includes a computing module 10, a board-level storage module 20, a traffic scheduling and control module 30, a host communication module 40, and a hybrid scheduling module 50.
The computing module 10 includes a general-purpose ARM or MIPS multi-core processor, hardware accelerators (e.g., cryptography, pattern-matching engines, neural network accelerators), and dedicated functions (e.g., encryption/decryption, hashing, pattern matching, compression) for packet processing (e.g., deep packet inspection, packet buffer management).
The board-level storage module 20 mainly includes fast self-managed memory, slower L2/DRAM, the processor's L1/L2 caches, the private caches of the various accelerators, and the SSD (solid-state drive) and DRAM (dynamic random access memory) external to the intelligent network card.
The traffic scheduling and control module 30 includes a traffic control module for transmitting data packets between the TX/RX ports and the data packet buffers, and an internal traffic manager for transmitting the data packets to the core of the intelligent network card or a switch of the intelligent network card.
The host communication module 40 includes high bandwidth ports, RDMA, link layer interfaces, PCIe and DMA engines, and the like. The host communication module 40 is used to communicate with a host server to obtain task requests.
Hybrid scheduling module 50 includes FCFS scheduling core unit 51, DWRR scheduling core unit 52, scheduling core selection unit 53, and monitoring unit 54. The FCFS scheduling core unit 51 receives and executes task requests. The DWRR scheduling core unit 52 receives a task from the FCFS scheduling core unit 51 and generates a DWRR scheduling queue when the tail delay of that task on the FCFS scheduling core unit 51 exceeds a first tail delay threshold. Scheduling core selection unit 53 moves tasks whose tail delay on FCFS scheduling core unit 51 exceeds the first tail delay threshold to DWRR scheduling core unit 52, and tasks whose tail delay on DWRR scheduling core unit 52 falls below a second tail delay threshold back to FCFS scheduling core unit 51. Monitoring unit 54 detects whether the tail delay of a task executing on FCFS scheduling core unit 51 exceeds the first tail delay threshold, and whether the tail delay of a task executing on DWRR scheduling core unit 52 falls below the second tail delay threshold. To move tasks between FCFS scheduling core unit 51 and DWRR scheduling core unit 52 efficiently and accurately, monitoring unit 54 collects the following information: (1) the request execution delay distribution of each task; (2) the execution cost and variance of each offloaded task; (3) the utilization of each processor core on the SOC intelligent network card, including the CPU cores and the various hardware accelerators.
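The core-selection policy described above can be sketched in Python; the function name and string labels are illustrative, not the patent's:

```python
def select_core(current_core, task_tail_delay, first_threshold, second_threshold):
    """Hybrid policy sketch: promote high-variance tasks from the FCFS core
    to the DWRR core, and demote low-variance tasks back again."""
    if current_core == "FCFS" and task_tail_delay > first_threshold:
        return "DWRR"  # tail delay exceeded the first threshold: high variance
    if current_core == "DWRR" and task_tail_delay < second_threshold:
        return "FCFS"  # tail delay fell below the second threshold: low variance
    return current_core  # otherwise the task stays where it is

print(select_core("FCFS", 12.0, first_threshold=10.0, second_threshold=5.0))
```

The gap between the two thresholds gives the policy hysteresis, so a task near one boundary does not bounce between the two scheduling cores on every measurement.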
It should be noted that the modules and units of the intelligent network card are interconnected through a high-bandwidth memory bus. These computing resources allow hosts to offload the data center's general-purpose computing burden (including complex algorithms and data structures) without sacrificing performance or program generality.
In some embodiments of the present application, the FCFS scheduling core unit 51 is preset with a first tail delay threshold: a statistic of the tail delays of tasks executed by an existing intelligent network card under the FCFS scheduling algorithm, taken at μ + 3σ of their Gaussian distribution.
In some embodiments of the present application, the DWRR scheduling core unit 52 is preset with a second tail delay threshold: the corresponding μ + 3σ statistic of the tail delays of tasks executed by an existing intelligent network card under the DWRR scheduling algorithm.
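The μ + 3σ rule for both thresholds can be sketched as follows; the sample delay values are illustrative:

```python
import statistics

def tail_delay_threshold(observed_tail_delays):
    """mu + 3*sigma over measured per-task tail delays, per the rule above."""
    mu = statistics.mean(observed_tail_delays)
    sigma = statistics.pstdev(observed_tail_delays)
    return mu + 3 * sigma

# Hypothetical tail-delay measurements gathered under the FCFS scheduling algorithm.
fcfs_samples = [2.0, 2.2, 1.9, 2.1, 2.3]
first_threshold = tail_delay_threshold(fcfs_samples)
print(first_threshold)
```

Under a Gaussian model, μ + 3σ covers about 99.87% of samples, so only genuinely anomalous tail delays trip the threshold.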
In some specific embodiments of the present application, the FCFS scheduling core unit 51 is further preset with an average request delay threshold. When the monitoring unit 54 detects that the average request delay of the FCFS scheduling core unit 51's work exceeds this threshold, the host communication module 40 migrates the task with the highest load ratio on the intelligent network card to the host server; when the average request delay falls below the threshold, some load is pulled from the host server to the intelligent network card through the host communication module 40.
In some embodiments of the present application, the average request delay threshold is the average request delay over all the computing cores on the intelligent network card while they process different tasks.
In some embodiments of the present application, the DWRR scheduling core unit 52 is further preset with a deficit delay threshold, and when the monitoring unit 54 detects that the deficit counter of the task in the DWRR scheduling queue is greater than the deficit delay threshold, the task is preferentially executed.
In some embodiments of the present application, the deficit delay threshold refers to a value of a deficit counter when the task delay reaches (1- α) × the second tail delay threshold, α representing a hysteresis factor.
With the SOC intelligent network card of the present application, the monitoring unit in the hybrid scheduling module tracks task execution and the related delay information in real time; based on that information, the scheduling core selection unit assigns each task to the FCFS scheduling core unit or the DWRR scheduling core unit for execution. This exploits the strengths of both units and saves task scheduling time. Especially for tasks with variable execution cost, it maximizes the utilization of the intelligent network card's computing resources and preserves computing efficiency, without increasing tail delay or impairing the card's traffic-forwarding capacity.
The embodiment of the second aspect of the present application provides a task scheduling method, based on the SOC intelligent network card provided in the embodiment of the first aspect, where the SOC intelligent network card includes a host communication module 40 and a hybrid scheduling module 50, and the hybrid scheduling module 50 includes an FCFS scheduling core unit 51, a DWRR scheduling core unit 52, a scheduling core selection unit 53, and a monitoring unit 54. As shown in fig. 2, the task scheduling method includes the following steps:
step S110: the host communication module obtains a task request from a host server.
In the present application, the task request includes, but is not limited to, an application offload task; the intelligent network card receives the corresponding task request from the host server.
Specifically, the host server first creates a control command for DMA (Direct Memory Access) or another interface, including a command header, a packet-buffer address and other information, and writes it into a command ring. The DMA engine in the host communication module fetches the command from the command ring, reads the data from the host server's memory and writes it into the board-level storage module of the intelligent network card. The data packet is then processed according to its processor type, and the processed data is sent to the transceiver interface through the DMA engine.
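The command-ring handoff described above can be sketched as a producer/consumer queue; the class and field names are illustrative, not the patent's:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Command:
    header: str          # command type
    buffer_address: int  # host packet-buffer address

class CommandRing:
    """Toy model of the host-to-NIC command ring."""
    def __init__(self):
        self.ring = deque()

    def host_write(self, cmd):
        # Host server writes a control command into the command ring.
        self.ring.append(cmd)

    def dma_fetch(self):
        # The NIC's DMA engine takes the next command out of the ring.
        return self.ring.popleft() if self.ring else None

ring = CommandRing()
ring.host_write(Command("offload", 0x1000))
cmd = ring.dma_fetch()
print(cmd.header, hex(cmd.buffer_address))
```

A real command ring is a fixed-size circular buffer in shared memory with head/tail indices; the deque here only models the FIFO ordering of the handoff.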
Step S120: the host communication module sends the task request to an FCFS (FCFS) scheduling core unit, and a first tail delay threshold value is preset on the FCFS scheduling core unit.
In the application, after receiving a task request, the intelligent network card first sends it to the FCFS scheduling core unit in the hybrid scheduling module, which executes tasks in its scheduling core according to the first-come, first-served rule.
It should be noted that the first tail delay threshold is a statistic of the tail delays of tasks executed by the existing intelligent network card under the FCFS scheduling algorithm, taken at μ + 3σ of their Gaussian distribution.
It should be noted that the number of the FCFS scheduling core units set on the smart network card is not limited to one, and all the FCFS scheduling core units share one task running queue to fully utilize the parallelism of task execution.
Step S130: when the monitoring unit detects that the tail delay of the task executed on the FCFS scheduling core unit exceeds a first tail delay threshold value, the scheduling core selecting unit allocates the task to the DWRR scheduling core unit to be executed, and generates a DWRR scheduling queue on the DWRR scheduling core unit, and a second tail delay threshold value is preset on the DWRR scheduling core unit.
In the application, when the monitoring unit detects that the tail delay of a task executed on the FCFS scheduling core unit exceeds the first tail delay threshold, the task is a high-variance task. The scheduling core selection unit evicts it to the DWRR scheduling core unit, which receives the task, places it in a DWRR scheduling queue generated on the DWRR scheduling core unit, and executes the queued tasks according to the deficit weighted round robin scheduling algorithm.
Furthermore, a deficit delay threshold is preset on the DWRR scheduling core unit. After the DWRR scheduling queue is generated, the DWRR scheduling core unit scans the tasks in the queue in a polling manner; when a task's deficit counter exceeds the deficit delay threshold, that task is executed preferentially.
It should be noted that the second tail delay threshold is a statistic of the tail delays of tasks executed by the existing intelligent network card under the DWRR scheduling algorithm, taken at μ + 3σ of their Gaussian distribution.
The deficit delay threshold is the value of the deficit counter when the task delay reaches (1 - α) × the second tail delay threshold, where α is a hysteresis factor.
It should be noted that the number of the DWRR scheduling core units arranged on the intelligent network card is not limited to one, and all the DWRR scheduling core units share one task running queue to fully utilize the parallelism of task execution.
Step S140: when the monitoring unit detects that the tail delay of the task executed on the DWRR scheduling core unit is lower than a second tail delay threshold value, the scheduling core selecting unit allocates the task to the FCFS scheduling core unit for execution.
In this application, when the monitoring unit detects that the tail delay of a task executed on the DWRR scheduling core unit falls below the second tail delay threshold, the task is a low-variance task; the scheduling core selection unit pops it from the DWRR scheduling queue and sends it back to the FCFS scheduling core unit for execution.
Furthermore, an average request delay threshold is preset on the FCFS scheduling core unit. When the monitoring unit detects that the average request delay of the FCFS scheduling core unit's work exceeds this threshold, requests are queuing on the intelligent network card and some task requests are not reaching the FCFS scheduling core unit; the task request with the highest load ratio on the intelligent network card can then be migrated to the host server for execution through the host communication module. When the average request delay falls below the threshold, the FCFS scheduling core unit is not saturated, and the intelligent network card can pull some task requests from the host server through the host communication module and execute them locally. These steps balance load between the host server and the intelligent network card; to minimize the synchronization cost between the two, a dedicated scheduling core can be set aside in the FCFS scheduling core unit to run this scheduling task.
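The host/NIC load-balancing rule above can be sketched as follows; the return values and the `load_ratio` field are illustrative:

```python
def balance_load(avg_request_delay, avg_delay_threshold, nic_tasks):
    """Sketch of the migration rule: push the heaviest task back to the host
    when the NIC is queuing, pull work in when the NIC has slack."""
    if avg_request_delay > avg_delay_threshold and nic_tasks:
        # Queuing on the NIC: migrate the task with the highest load ratio.
        heaviest = max(nic_tasks, key=lambda t: t["load_ratio"])
        return ("migrate_to_host", heaviest["name"])
    if avg_request_delay < avg_delay_threshold:
        return "pull_from_host"  # FCFS core not saturated: take on more work
    return "steady"

decision = balance_load(8.0, 5.0,
                        [{"name": "x", "load_ratio": 0.7},
                         {"name": "y", "load_ratio": 0.2}])
print(decision)
```

Evicting only the single heaviest task per decision keeps migrations cheap and avoids oscillating the whole workload between host and NIC.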
It should be noted that the average request delay threshold is the average request delay over all the computing cores on the intelligent network card while they process different tasks.
With the task scheduling method based on the SOC intelligent network card, after the card obtains a task request from the host server it sends the request to the FCFS scheduling core unit, which efficiently schedules tasks with low service-time dispersion on a first-come, first-served basis. When a request is detected to belong to a task with widely varying service time, the card evicts it from the FCFS scheduling core unit and assigns it to the DWRR scheduling core unit, which executes the tasks in the DWRR scheduling queue efficiently according to the deficit weighted round robin scheduling algorithm. This hybrid scheduling mechanism combining the FCFS and DWRR algorithms provides efficient computing support for an intelligent network card shared by multiple tenants and applications: it schedules diverse tasks so that applications are offloaded dynamically and adaptively, maximizes the card's resource utilization, reduces tail response delay, and eases the interaction between the host server and the intelligent network card.
While the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. An SOC intelligent network card, comprising:
the host communication module is used for communicating with the host server to acquire the task request;
the hybrid scheduling module comprises an FCFS scheduling core unit, a DWRR scheduling core unit, a monitoring unit and a scheduling core selection unit; the FCFS scheduling core unit is used for receiving and executing the task request; the DWRR scheduling core unit is used for receiving a task from the FCFS scheduling core unit and generating a DWRR scheduling queue when the tail delay of the task executed on the FCFS scheduling core unit exceeds a first tail delay threshold; the monitoring unit is configured to detect whether the tail delay of a task executing on the FCFS scheduling core unit exceeds the first tail delay threshold and whether the tail delay of a task executing on the DWRR scheduling core unit falls below a second tail delay threshold; the scheduling core selection unit is configured to move tasks whose tail delay on the FCFS scheduling core unit exceeds the first tail delay threshold to the DWRR scheduling core unit, and tasks whose tail delay on the DWRR scheduling core unit falls below the second tail delay threshold to the FCFS scheduling core unit.
2. The SOC intelligent network card according to claim 1, wherein an average request delay threshold is preset on the FCFS scheduling core unit, and when the monitoring unit detects that the average request delay of the FCFS scheduling core unit exceeds the average request delay threshold, the host communication module migrates the task with the highest load ratio on the intelligent network card to the host server; when the monitoring unit detects that the average request delay of the FCFS scheduling core unit falls below the average request delay threshold, some load is pulled from the host server to the intelligent network card through the host communication module.
3. The SOC intelligent network card according to claim 1, wherein a deficit delay threshold is preset on the DWRR scheduling core unit, and when the monitoring unit detects that the deficit counter of a task in the DWRR scheduling queue is greater than the deficit delay threshold, that task is run preferentially.
4. A task scheduling method applied to the SOC intelligent network card, the SOC intelligent network card comprising a host communication module and a hybrid scheduling module, the hybrid scheduling module comprising an FCFS scheduling core unit, a DWRR scheduling core unit, a monitoring unit and a scheduling core selection unit, the task scheduling method comprising the following steps:
the host communication module acquires a task request from a host server;
the host communication module sends the task request to the FCFS scheduling core unit, wherein a first tail delay threshold is preset on the FCFS scheduling core unit;
when the monitoring unit detects that the tail delay of a task executed on the FCFS scheduling core unit exceeds the first tail delay threshold, the scheduling core selection unit allocates the task to the DWRR scheduling core unit for execution and generates a DWRR scheduling queue on the DWRR scheduling core unit, wherein a second tail delay threshold is preset on the DWRR scheduling core unit;
when the monitoring unit detects that the tail delay of a task executed on the DWRR scheduling core unit falls below the second tail delay threshold, the scheduling core selection unit allocates the task back to the FCFS scheduling core unit for execution.
5. The task scheduling method according to claim 4, wherein an average request delay threshold is preset on the FCFS scheduling core unit; when the monitoring unit detects that the average request delay of the FCFS scheduling core unit is greater than the average request delay threshold, the host communication module migrates the load with the highest load ratio on the intelligent network card to the host server; when the monitoring unit detects that the average request delay of the FCFS scheduling core unit is less than the average request delay threshold, part of the load is pulled from the host server onto the intelligent network card through the host communication module.
6. The task scheduling method according to claim 4, wherein a deficit delay threshold is preset on the DWRR scheduling core unit, and when the monitoring unit detects that the deficit counter of a task in the DWRR scheduling queue is greater than the deficit delay threshold, that task is run preferentially.
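The deficit-counter priority rule of claims 3 and 6 can be illustrated with a minimal DWRR queue. This is a sketch under stated assumptions: the class name, the quantum/cost units, and the simplification of serving a whole task once its accumulated credit covers its cost are all illustrative, not the patent's implementation.

```python
from collections import deque

class DwrrQueue:
    """Minimal deficit weighted round robin queue with a priority rule:
    a task whose deficit counter exceeds the deficit threshold runs first."""

    def __init__(self, quantum, deficit_threshold):
        self.quantum = quantum                    # credit granted per round
        self.deficit_threshold = deficit_threshold
        self.tasks = deque()                      # [name, cost, deficit_counter]

    def add(self, name, cost):
        self.tasks.append([name, cost, 0])

    def pick(self):
        """Return the next task name to run, or None if the queue is empty."""
        if not self.tasks:
            return None
        # Priority rule from the claims: a task whose deficit counter has
        # grown past the threshold is run preferentially, ahead of the
        # normal round-robin order.
        for task in self.tasks:
            if task[2] > self.deficit_threshold:
                self.tasks.remove(task)
                return task[0]
        # Otherwise perform a normal DWRR round: grant each task a quantum
        # of credit and run it once its credit covers its cost.
        while True:
            task = self.tasks[0]
            task[2] += self.quantum
            if task[2] >= task[1]:
                self.tasks.popleft()
                return task[0]
            self.tasks.rotate(-1)   # not enough credit yet; try the next task
```

The deficit threshold acts as a starvation guard: a task that keeps accumulating credit without being served eventually crosses the threshold and jumps the round-robin order.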
CN202111281620.3A 2021-11-01 2021-11-01 SOC intelligent network card and task scheduling method Pending CN114124589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111281620.3A CN114124589A (en) 2021-11-01 2021-11-01 SOC intelligent network card and task scheduling method

Publications (1)

Publication Number Publication Date
CN114124589A true CN114124589A (en) 2022-03-01

Family

ID=80380261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111281620.3A Pending CN114124589A (en) 2021-11-01 2021-11-01 SOC intelligent network card and task scheduling method

Country Status (1)

Country Link
CN (1) CN114124589A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180246766A1 (en) * 2016-09-02 2018-08-30 Telefonaktiebolaget Lm Ericsson (Publ) Systems and Methods of Managing Computational Resources
CN113079107A (en) * 2021-04-02 2021-07-06 无锡职业技术学院 Dynamic priority congestion control method for programmable switching network
CN113157447A (en) * 2021-04-13 2021-07-23 中南大学 RPC load balancing method based on intelligent network card

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ming Liu, "Offloading Distributed Applications onto SmartNICs using iPipe", Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM '19), pages 1-8 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20220301