CN114064237A - Multi-tenant management system and method based on intelligent network card - Google Patents

Multi-tenant management system and method based on intelligent network card

Info

Publication number
CN114064237A
Authority
CN
China
Prior art keywords
program
unloaded
unit
network card
intelligent network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111281994.5A
Other languages
Chinese (zh)
Inventor
温强 (Wen Qiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Weilang Technology Co ltd
Original Assignee
Beijing Weilang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Weilang Technology Co ltd filed Critical Beijing Weilang Technology Co ltd
Priority to CN202111281994.5A priority Critical patent/CN114064237A/en
Publication of CN114064237A publication Critical patent/CN114064237A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44594Unloading
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a multi-tenant management system and management method based on an intelligent network card. The management system comprises tenants, the intelligent network card and a host. The intelligent network card comprises a compile-time component, a runtime component, an enhanced dRMT (disaggregated Reconfigurable Match Table) pipeline, a configurable scheduler, an on-chip interconnection network, computing units and a host communication unit; the enhanced dRMT pipeline, the configurable scheduler, the computing units and the host communication unit are all connected to the on-chip interconnection network. The compile-time component pre-configures the shared resources on the intelligent network card; the runtime component dynamically allocates the shared resources required by each tenant; the enhanced dRMT pipeline transmits the data packets of multiple programs to be offloaded to the on-chip interconnection network; the configurable scheduler transmits each data packet of a program to be offloaded through the on-chip interconnection network to the corresponding computing unit for processing; and the host communication unit transmits the data packets processed by the computing units to the host.

Description

Multi-tenant management system and method based on intelligent network card
Technical Field
The invention relates to the technical field of intelligent network card application, in particular to a multi-tenant management system and a multi-tenant management method based on an intelligent network card.
Background
Different applications belonging to different tenants have different processing requirements; that is, they require different network offloads, and different offload types impose different demands. To obtain greater benefit from the intelligent network card, the service provider urgently needs to execute multiple offload tasks on the card simultaneously rather than run only a single program. To let the programs to be offloaded of multiple tenants use the shared resources of the programmable intelligent network card fairly and to avoid blocking, a reasonable management and scheduling mechanism must be designed for the intelligent network card to ensure that the programs to be offloaded of the multiple tenants use the shared resources reasonably and efficiently.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art.
In order to solve the above technical problems, the technical solution adopted by the invention is to provide a multi-tenant management system based on an intelligent network card, comprising tenants, the intelligent network card and a host, characterized in that the intelligent network card comprises a compile-time component, a runtime component, an enhanced dRMT (disaggregated Reconfigurable Match Table) pipeline, a configurable scheduler, an on-chip interconnection network, computing units and a host communication unit, wherein the enhanced dRMT pipeline, the configurable scheduler, the computing units and the host communication unit are all connected to the on-chip interconnection network;
the compile-time component is used for pre-configuring the shared resources on the intelligent network card according to the respective demand information of the tenants' programs to be offloaded;
the runtime component is used for dynamically adjusting the shared resources required by each program to be offloaded according to the running conditions of the programs to be offloaded of the multiple tenants;
the enhanced dRMT pipeline is used for receiving the data packets of the programs to be offloaded from the multiple tenants and transmitting them to the on-chip interconnection network;
the configurable scheduler is used for receiving, through the on-chip interconnection network, the data packets of the programs to be offloaded from the enhanced dRMT pipeline and for transmitting them, according to the shared-resource information preconfigured by the compile-time component for each program to be offloaded and the shared-resource information dynamically adjusted by the runtime component, through the on-chip interconnection network to the corresponding computing units for processing;
the host communication unit is used for receiving the processed data packets of the programs to be offloaded from the computing units through the on-chip interconnection network and transmitting them to the host.
In the above system, the compile-time component includes:
a resource sharing policy unit, used for pre-configuring the shared resources on the intelligent network card according to the respective demand information of the tenants' programs to be offloaded;
a resource usage checking unit, used for checking the preconfigured shared resources used by the multiple tenants;
and a program execution linking unit, used for integrating the tenant-specific parsers and the system-level parser.
In the above system, the runtime component comprises:
a runtime memory allocator, used for dynamically adjusting the memory resources required by each program to be offloaded according to its running condition.
In the above system, the runtime memory allocator comprises:
a register array unit, used for storing the usage state of the memory resources on the intelligent network card;
a resource allocation table unit, used for recording the current memory-resource allocation of each program to be offloaded;
and a memory reallocation unit, used for reallocating the memory resources on the intelligent network card in cooperation with the configurable scheduler.
In the above system, an RDMA engine unit and a DMA engine unit are integrated into the enhanced dRMT pipeline.
In the above system, the configurable scheduler comprises a RAM read-write unit, a logic scheduling array unit and an on-chip cache unit;
the RAM read-write unit is used for writing the data packet of a program to be offloaded into the on-chip cache unit, and for storing the shared-resource information preconfigured by the compile-time component for the program and the shared-resource information dynamically adjusted by the runtime component into the logic scheduling array unit;
the logic scheduling array unit comprises a plurality of scheduling queues and is used for sorting and caching the tasks in the scheduling queues according to the preconfigured and dynamically adjusted shared-resource information, and for transmitting the data packets of the program to be offloaded through the on-chip interconnection network to the corresponding computing unit when that computing unit is idle;
and the on-chip cache unit is used for storing the data packets of the programs to be offloaded.
In the above system, the on-chip interconnection network adopts a shared-memory crossbar internal interconnection network.
In the above system, the on-chip interconnection network includes:
a cache logic unit, configured to receive the data packets of the programs to be offloaded from the enhanced dRMT pipeline and the configurable scheduler, and to send them to the corresponding computing units;
a flow control logic unit, used for managing the flow control of the intelligent network card;
a scheduling logic unit, used for determining, according to a cache scheduling algorithm, how the data packets of the programs to be offloaded are processed, including the information of the cache queues;
and a switching network unit, used for establishing communication connections with the enhanced dRMT pipeline, the configurable scheduler, the computing units and the host communication unit, respectively, so as to dynamically transmit the data packets of the programs to be offloaded.
The invention also provides a multi-tenant management method based on an intelligent network card, applied to the above management system, wherein the management system comprises tenants, the intelligent network card and a host, the intelligent network card comprises a compile-time component, a runtime component, an enhanced dRMT pipeline, a configurable scheduler, an on-chip interconnection network, computing units and a host communication unit, and the enhanced dRMT pipeline, the configurable scheduler, the computing units and the host communication unit are all connected to the on-chip interconnection network; the method is characterized by comprising the following steps:
the compile-time component pre-configures the shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of the multiple tenants;
the runtime component dynamically adjusts the shared resources required by each program to be offloaded according to its running condition;
when a data packet of a program to be offloaded is transmitted from a tenant to the enhanced dRMT pipeline, the pipeline parses the data packet, generates the tenant-programmed packet configuration information through match-action table processing, and transmits the parsed data packet to the configurable scheduler through the on-chip interconnection network;
after receiving the data of the program to be offloaded, the configurable scheduler transmits the data packet through the on-chip interconnection network to the corresponding computing unit for processing, according to the shared-resource information preconfigured by the compile-time component for the program and the shared-resource information dynamically adjusted by the runtime component;
the computing unit transmits the processed data packet to the host communication unit through the on-chip interconnection network;
and after receiving the processed data packet of the program to be offloaded, the host communication unit transmits it to the host.
In the above method, the configurable scheduler comprises a RAM read-write unit, a logic scheduling array unit and an on-chip cache unit;
after the configurable scheduler receives the data of a program to be offloaded, the RAM read-write unit writes the data packet into the on-chip cache unit and stores the shared-resource information preconfigured by the compile-time component for the program and the shared-resource information dynamically adjusted by the runtime component into the logic scheduling array unit;
the logic scheduling array unit comprises a plurality of scheduling queues, each corresponding to one service category and one computing unit; the unit sorts and caches the tasks in the scheduling queues according to the preconfigured and dynamically adjusted shared-resource information, and, when the computing unit corresponding to a scheduling queue is detected to be idle, transmits the data packets of the program to be offloaded to that computing unit through the on-chip interconnection network.
According to the technical solution provided by the application, the method has at least the following beneficial effects: the shared resources on the intelligent network card are preconfigured by the compile-time component, and shared resources of corresponding sizes are pre-allocated to the multiple programs to be offloaded, so that the programs of multiple tenants can use the card's shared resources effectively and fairly; the runtime component dynamically adjusts, in real time, the shared resources required by each program to be offloaded according to its running condition, avoiding situations in which the preconfigured shared resources are insufficient; the data packets of the programs to be offloaded are transmitted through the enhanced dRMT pipeline, ensuring low transmission delay; the configurable scheduler executes scheduling algorithms to manage and schedule the programs of multiple tenants and to maintain load balance among the different computing units, further ensuring low system delay; and the shared-memory on-chip interconnection network provides high-speed communication capability for packet exchange among the enhanced dRMT pipeline, the configurable scheduler, the computing units and the host communication unit, further ensuring low system latency. By adopting the management system and method of this technical solution, performance isolation among multiple tenants is effectively achieved through a combination of software and hardware, delay is reduced, and the resource utilization of the intelligent network card is improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a block diagram of a multi-tenant management system based on an intelligent network card according to an embodiment of the present application;
FIG. 2 is a block diagram of a compile-time component according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a runtime component provided by an embodiment of the present application;
FIG. 4 is a block diagram of an enhanced dRMT pipeline provided by an embodiment of the present application;
FIG. 5 is a block diagram of a configurable scheduler according to an embodiment of the present application;
fig. 6 is a block diagram of a structure of an on-chip interconnection network according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The application provides a multi-tenant management system and management method based on an intelligent network card, aiming to solve the prior-art problem that the programs to be offloaded of multiple tenants cannot use the shared resources on the intelligent network card effectively and fairly.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The embodiment of the first aspect of the present application provides a multi-tenant management system based on an intelligent network card. As shown in fig. 1, the system includes tenants, an intelligent network card and a host. The intelligent network card includes a compile-time component 10, a runtime component 20, an enhanced dRMT pipeline 30, a configurable scheduler 40, an on-chip interconnection network 50, computing units 60 and a host communication unit 70. The compile-time component 10 is implemented in the software stack, while the runtime component 20, the enhanced dRMT pipeline 30, the configurable scheduler 40, the on-chip interconnection network 50, the computing units 60 and the host communication unit 70 are implemented in the hardware stack; the enhanced dRMT pipeline 30, the configurable scheduler 40, the computing units 60 and the host communication unit 70 are all communicatively connected to the on-chip interconnection network 50.
Here, a tenant refers to a user who rents servers and other hardware computing services from a data center.
The compile-time component 10 takes the information of the programs to be offloaded of multiple tenants as input, integrates it, and provides a common data-packet processing function. Specifically, the compile-time component 10 parses the information of the programs to be offloaded to obtain their respective demand information (in this application including, but not limited to, memory bandwidth demand), and then preconfigures the shared resources on the intelligent network card accordingly: it pre-allocates a certain amount of shared resources to the different programs of the different tenants and statically generates the shared-resource allocation information for the card, so that each tenant is guaranteed to execute within its allocated share and tenants do not interfere with one another.
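The static pre-allocation described above can be sketched in a few lines. This is an illustrative model only, not the patented mechanism: the function name, the proportional scale-down policy and the megabyte units are all assumptions.

```python
def preconfigure(total_memory_mb, demands):
    """demands: {program_name: requested_mb}. Returns a static allocation
    table, scaling requests down proportionally if they oversubscribe the
    card's total shared memory (policy is an illustrative assumption)."""
    requested = sum(demands.values())
    scale = min(1.0, total_memory_mb / requested) if requested else 0.0
    return {prog: req * scale for prog, req in demands.items()}

# Two hypothetical tenant programs oversubscribe a 64 MB card, so each
# receives a proportionally reduced static share and runs strictly
# within it, preventing cross-tenant interference.
alloc = preconfigure(64, {"tenant_a_fw": 40, "tenant_b_lb": 40})
```

Because the table is generated statically before execution, each tenant's share is known up front, which is what enables the interference-free execution claimed above.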
The runtime component 20 dynamically adjusts the shared resources required by each program to be offloaded according to the running conditions of the programs of the multiple tenants. The compile-time component 10 and the runtime component 20 combine the advantages of statically scheduled execution and dynamic resource allocation to achieve the maximum performance improvement. It should be noted that the runtime component 20's dynamic adjustment of shared resources continues throughout the entire execution of each program to be offloaded.
The enhanced dRMT pipeline 30 receives the data packets of the programs to be offloaded from multiple tenants and delivers them to the on-chip interconnection network 50.
The configurable scheduler 40 receives the data packets of the programs to be offloaded from the enhanced dRMT pipeline 30 through the on-chip interconnection network 50 and, according to the shared-resource information preconfigured by the compile-time component 10 and the shared-resource information dynamically adjusted by the runtime component 20, transmits them through the on-chip interconnection network 50 to the corresponding computing units 60 for processing. It should be noted that the configurable scheduler 40 executes scheduling algorithms to manage and schedule the multi-tenant programs to be offloaded.
Further, the configurable scheduler 40 may, by executing a scheduling algorithm, buffer the data packets of the programs to be offloaded and the scheduling queues, adjust the order in which the packets of multiple programs are processed, and maintain load balance among the different computing units 60 to ensure low latency. The configurable scheduler 40 supports scheduling algorithms such as FCFS (First Come First Served), DRR (Deficit Round Robin) and hybrid algorithms. In this application, a policy-aware scheduling algorithm is used; that is, the scheduler switches to the appropriate algorithm according to the state observed during operation.
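As one of the algorithms named above, a Deficit Round Robin pass might look like the following sketch. The queue representation (a deque of packet sizes plus a deficit counter) and the reset of an emptied queue's deficit are assumptions of this illustration, not details from the patent.

```python
from collections import deque

def drr_round(queues, quantum):
    """One DRR round. queues: list of dicts with 'deficit' (int) and
    'packets' (deque of packet sizes). Each queue's deficit grows by the
    quantum; packets are sent while they fit within the deficit.
    Returns the (queue_index, packet_size) pairs sent this round."""
    sent = []
    for i, q in enumerate(queues):
        q["deficit"] += quantum
        while q["packets"] and q["packets"][0] <= q["deficit"]:
            size = q["packets"].popleft()
            q["deficit"] -= size
            sent.append((i, size))
        if not q["packets"]:
            q["deficit"] = 0  # an idle queue carries no credit forward
    return sent
```

Because unused deficit carries over for backlogged queues, a queue holding large packets eventually accumulates enough credit to send, which is what gives DRR its fairness across tenants with different packet sizes.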
The on-chip interconnection network 50 is of the shared-memory type and provides high-speed communication capability for packet exchange among the enhanced dRMT pipeline 30, the configurable scheduler 40, the computing units 60 and the host communication unit 70. When the data-processing work on the intelligent network card is too large, the work must be partitioned; each partition's work content can be regarded as a tile, and the shared-memory interconnection network also provides high-performance interconnection for the different computing tiles and for the tiles from the DMA engine and the Ethernet MAC, so that the cache resources on the card are fully used and optimal delay and throughput are achieved.
The computing unit 60 receives a data packet of a program to be offloaded through the on-chip interconnection network 50, reads it, places it in a local buffer for the corresponding computation, and transmits the processed data packet to the host communication unit 70 through the on-chip interconnection network 50.
The host communication unit 70 receives the processed data packet of the program to be offloaded from the computing unit 60 through the on-chip interconnection network 50 and transmits it to the host.
In some embodiments of the present application, as shown in fig. 2, the compile time component 10 includes a resource sharing policy unit 11, a resource usage checking unit 12, and a program execution linking unit 13.
Among these,
the resource sharing policy unit 11 preconfigures the shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of the multiple tenants;
the resource usage checking unit 12 checks the preconfigured shared resources used by the multiple tenants and each tenant's usage of them, ensuring isolation among the tenants so that their execution does not interfere;
the program execution linking unit 13 integrates the tenant-specific parsers and the system-level parser. It should be noted that, after a parser analyzes the commands to be executed by a program to be offloaded, it sends them to the intelligent network card for execution.
In some embodiments of the present application, as shown in fig. 3, the runtime component 20 includes a runtime memory allocator 21, which dynamically adjusts the memory resources required by each program to be offloaded according to its running condition.
It should be noted that the memory on the intelligent network card is limited, usually only tens of MB, while the traffic of a program to be offloaded varies during operation; dynamically adjusting the memory resources each program requires can therefore effectively improve the utilization of the shared resources.
Further, runtime memory allocator 21 includes register array unit 211, resource allocation table unit 212, and memory reallocation unit 213.
Among these,
the register array unit 211 stores the usage state of the memory resources on the intelligent network card;
the resource allocation table unit 212 records the current memory allocation of each program to be offloaded;
the memory reallocation unit 213 reallocates the memory resources on the intelligent network card in cooperation with the configurable scheduler 40. It should be noted that each program to be offloaded of each tenant is allocated a set of memory parameters (including the offset and size of the allocated memory). To dynamically resize memory, the memory reallocation unit 213 changes the offset and size of the memory required by each tenant's programs to be offloaded according to the allocation policy and the updated memory parameters.
In some embodiments of the present application, as shown in fig. 4, an RDMA (Remote Direct Memory Access) engine unit 31 and a DMA (Direct Memory Access) engine unit 32 are integrated on the enhanced dRMT pipeline 30, so that some applications can forward data packets directly from a port or another intelligent network card to the host or to another intelligent network card for the maximum performance benefit. The RDMA engine unit 31 addresses the server-side data-processing delay in network transmission: packets that require no service processing can be directed through the RDMA engine straight to host memory or to another intelligent network card to guarantee low delay. The DMA engine unit 32 allows hardware devices of different speeds to communicate without imposing a large interrupt load on the CPU.
In some embodiments of the present application, as shown in fig. 5, the configurable scheduler 40 includes a RAM read-write unit 41, a logic scheduling array unit 42 and an on-chip cache unit 43 of a certain size.
Among these,
the RAM read-write unit 41 writes the data packet of a program to be offloaded into the on-chip cache unit 43 and stores the shared-resource information preconfigured by the compile-time component 10 for the program and the shared-resource information dynamically adjusted by the runtime component 20 into the logic scheduling array unit 42;
the logic scheduling array unit 42 includes a plurality of scheduling queues; it executes a scheduling algorithm according to the preconfigured and dynamically adjusted shared-resource information, sorts and caches the tasks in the scheduling queues, and, when the computing unit 60 corresponding to a scheduling queue is idle, transmits the data packets of the program to be offloaded to that computing unit through the on-chip interconnection network 50;
the on-chip cache unit 43 stores the data packets of the programs to be offloaded.
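The logic scheduling array's dispatch rule — release a queued packet only when its computing unit is idle — can be sketched as follows. The one-queue-per-unit mapping follows the text; the FIFO order within a queue and all names are illustrative assumptions.

```python
def dispatch(queues, idle_units):
    """queues: {unit_id: list of cached packets}, one scheduling queue
    per computing unit. Releases the head-of-line packet of each queue
    whose unit currently reports idle; everything else stays cached."""
    dispatched = {}
    for unit, packets in queues.items():
        if unit in idle_units and packets:
            dispatched[unit] = packets.pop(0)
    return dispatched

# Hypothetical state: cu0 is idle, cu1 is busy, so only cu0's
# head-of-line packet leaves the on-chip cache this cycle.
q = {"cu0": ["pkt_a", "pkt_b"], "cu1": ["pkt_c"]}
out = dispatch(q, {"cu0"})
```

Gating dispatch on unit idleness is what lets the scheduler keep the computing units load-balanced while the on-chip cache absorbs bursts.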
In some embodiments of the present application, the on-chip interconnection network 50 uses a shared-memory crossbar internal interconnection network. A crossbar is non-blocking, low-delay and high-performance; its connected input and output nodes can be configured, it has no intermediate stage, and every port can operate at line rate. Each program to be offloaded on the intelligent network card can therefore be independently connected to the high-performance shared-memory on-chip interconnection network, ensuring that multiple programs can transmit or receive simultaneously at line rate.
Further, as shown in fig. 6, the on-chip interconnection network 50 includes a cache logic unit 51, a flow control logic unit 52, a scheduling logic unit 53, and a switching network unit 54.
Wherein:
the cache logic unit 51 receives the data packets of the program to be offloaded from the enhanced dRMT pipeline 30 and the configurable scheduler 40 at line rate, and sends them to the corresponding computing unit 60 at line rate;
the flow control logic unit 52 manages the flow control of the intelligent network card;
the scheduling logic unit 53 determines, according to a cache scheduling algorithm, how the data packets of the program to be offloaded are processed, including buffer-queue information; it should be noted that the scheduling algorithm executed by the scheduling logic unit 53 is the same as that of the logic scheduling array unit 42;
the switching network unit 54 establishes communication connections with the enhanced dRMT pipeline 30, the configurable scheduler 40, the computing unit 60, and the host communication unit 70, respectively, to dynamically transmit the data packets of the program to be offloaded.
In some embodiments of the present application, to support a diversity of programs to be offloaded, the intelligent network card connects a plurality of computing units 60 to the on-chip interconnection network 50. These computing units 60 are independent of one another and may be hardware accelerators or CPU processing cores. An offload function may be encapsulated in a computing unit 60; the computing unit 60 reads the data packets to be processed, places them in a local buffer, and then applies the offload function to them.
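The read-buffer-apply behavior of a computing unit can be sketched as follows (the checksum offload is purely illustrative; the patent does not name any particular offload function):

```python
class ComputeUnit:
    """Toy computing unit (unit 60): wraps a pluggable offload function,
    reads packets into a local buffer, and applies the function to each.
    On a real SmartNIC the function would run on a hardware accelerator
    or a CPU processing core."""

    def __init__(self, offload_fn):
        self.offload_fn = offload_fn
        self.local_buffer = []
        self.idle = True

    def receive(self, packet):
        # place the packet to be processed in the local buffer
        self.local_buffer.append(packet)

    def run(self):
        # execute the encapsulated offload function on buffered packets
        self.idle = False
        results = [self.offload_fn(p) for p in self.local_buffer]
        self.local_buffer.clear()
        self.idle = True
        return results

# hypothetical example: a one-byte checksum offload
unit = ComputeUnit(lambda pkt: sum(pkt) & 0xFF)
unit.receive(bytes([1, 2, 3]))
print(unit.run())   # [6]
```

Because the offload function is encapsulated behind a uniform interface, heterogeneous units (accelerators, CPU cores) can all attach to the same interconnect.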
The embodiment of the second aspect of the present application provides a multi-tenant management method based on an intelligent network card, applied to the management system described above. The management system includes tenants, an intelligent network card, and a host; the intelligent network card includes a compile-time component 10, a runtime component 20, an enhanced dRMT pipeline 30, a configurable scheduler 40, an on-chip interconnection network 50, a computing unit 60, and a host communication unit 70, and the enhanced dRMT pipeline 30, the configurable scheduler 40, the computing unit 60, and the host communication unit 70 are all communicatively connected to the on-chip interconnection network 50. The multi-tenant management method includes the following steps:
the compile-time component pre-configures the shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of a plurality of tenants;
the runtime component dynamically adjusts the shared resources required by the program to be offloaded according to its running condition;
when a data packet of a program to be offloaded is transmitted from a tenant to the enhanced dRMT pipeline, the enhanced dRMT pipeline parses the data packet, generates tenant-programmed packet configuration information through match-action table processing, and transmits the parsed data packet to the configurable scheduler through the on-chip interconnection network;
after receiving the data of the program to be offloaded, the configurable scheduler transmits the data packet, through the on-chip interconnection network, to the corresponding computing unit for processing, according to the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the computing unit transmits the processed data packet of the program to be offloaded to the host communication unit through the on-chip interconnection network;
and after receiving the processed data packet of the program to be offloaded, the host communication unit transmits it to the host.
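The per-packet steps of the method above can be sketched as one chain of stages (the stages are passed in as callables; all names and the stub behaviors are hypothetical):

```python
def handle_packet(packet, parse, schedule, compute, to_host):
    """Hypothetical end-to-end walk of the method's steps: the enhanced
    dRMT pipeline parses, the configurable scheduler picks a computing
    unit, the computing unit processes, and the host communication unit
    delivers the result to the host."""
    parsed = parse(packet)             # enhanced dRMT pipeline
    unit = schedule(parsed)            # configurable scheduler
    processed = compute(unit, parsed)  # computing unit
    return to_host(processed)          # host communication unit

out = handle_packet(
    b"abc",
    parse=lambda p: p.upper(),       # stand-in for packet parsing
    schedule=lambda p: 0,            # always pick computing unit 0
    compute=lambda u, p: p + b"!",   # stand-in for an offload function
    to_host=lambda p: ("host", p),   # delivery to the host
)
print(out)   # ('host', b'ABC!')
```

The real data movement between stages goes through the on-chip interconnection network; the function-call chain here only captures the ordering of the steps.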
It should be noted that, in the present application, tenant programming means that a tenant programs against the API functions of the runtime environment.
Further, the configurable scheduler 40 includes a RAM read-write unit 41, a logic scheduling array unit 42, and an on-chip cache unit 43. After the configurable scheduler 40 receives the data packet of the program to be offloaded, the processing steps are as follows:
the RAM read-write unit writes the data packet of the program to be offloaded into the on-chip cache unit, and stores, into the logic scheduling array unit, the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the logic scheduling array unit includes a plurality of scheduling queues, each corresponding to one service type and to one computing unit; the logic scheduling array unit executes a scheduling algorithm according to the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component, sorts and buffers the tasks in the scheduling queues, and, when the computing unit corresponding to a scheduling queue is detected to be idle, transmits the data packets of the program to be offloaded to that computing unit through the on-chip interconnection network.
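A single dispatch round of such per-service-type queues with idle detection might look like this (a sketch under the assumption of one queue and one computing unit per service type; the weight ordering is illustrative, since the patent does not fix a particular scheduling algorithm):

```python
from collections import deque

def dispatch(queues, weights, unit_idle):
    """Sketch of one round of the logic scheduling array: each service
    type has its own queue and its own computing unit; a packet is
    dispatched from a queue only when its unit is idle, and queues with
    larger resource shares are served first."""
    order = sorted(queues, key=lambda s: weights.get(s, 0), reverse=True)
    sent = []
    for service in order:
        if queues[service] and unit_idle.get(service, False):
            sent.append((service, queues[service].popleft()))
    return sent

queues = {"crypto": deque([b"p1"]), "compress": deque([b"p2"])}
weights = {"crypto": 2, "compress": 5}
idle = {"crypto": True, "compress": False}   # compress unit is busy
print(dispatch(queues, weights, idle))       # [('crypto', b'p1')]
```

Packets for a busy unit simply stay buffered in their queue, which is how the scheme maintains load balance without dropping work.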
By adopting the above multi-tenant management system and method based on an intelligent network card: the compile-time component pre-configures the shared resources on the intelligent network card and pre-allocates shared resources of appropriate size to the programs to be offloaded, so that the programs to be offloaded of a plurality of tenants can use the shared resources on the intelligent network card effectively and fairly; the runtime component dynamically adjusts, in real time, the shared resources required by a program to be offloaded according to its running condition, avoiding shortfalls in the pre-configured shared resources; transmitting the data packets of the programs to be offloaded through the enhanced dRMT pipeline ensures low transmission latency; the configurable scheduler executes a scheduling algorithm to manage and schedule the programs to be offloaded of multiple tenants and to maintain load balance among the computing units, further ensuring low system latency; and the shared-memory on-chip interconnection network provides high-speed communication for packet exchange among the enhanced dRMT pipeline, the configurable scheduler, the computing units, and the host communication unit, further ensuring low system latency.
The multi-tenant management system of this technical solution integrates multiple high-performance hardware modules and, combined with the multi-tenant management method, effectively achieves performance isolation among multiple tenants, reduces latency, and improves the resource utilization of the intelligent network card.
While the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A multi-tenant management system based on an intelligent network card, comprising tenants, the intelligent network card, and a host, characterized in that the intelligent network card comprises a compile-time component, a runtime component, an enhanced dRMT (disaggregated Reconfigurable Match Table) pipeline, a configurable scheduler, an on-chip interconnection network, a computing unit, and a host communication unit, wherein the enhanced dRMT pipeline, the configurable scheduler, the computing unit, and the host communication unit are all connected to the on-chip interconnection network;
the compile-time component is used for pre-configuring shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of the tenants;
the runtime component is used for dynamically adjusting the shared resources required by the programs to be offloaded according to the running conditions of the programs to be offloaded of a plurality of tenants;
the enhanced dRMT pipeline is used for receiving data packets of the programs to be offloaded from a plurality of tenants and transmitting them to the on-chip interconnection network;
the configurable scheduler is used for receiving, through the on-chip interconnection network, the data packets of the programs to be offloaded from the enhanced dRMT pipeline, and for transmitting them, through the on-chip interconnection network, to the corresponding computing unit for processing, according to the shared resource information pre-configured by the compile-time component for the programs to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the host communication unit is used for receiving the processed data packets of the programs to be offloaded from the computing unit through the on-chip interconnection network and transmitting them to the host.
2. The intelligent-network-card-based multi-tenant management system of claim 1, wherein the compile-time component comprises:
a resource sharing policy unit, used for pre-configuring the shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of the tenants;
a resource usage checking unit, used for checking the use of the pre-configured shared resources by a plurality of tenants;
and a program execution linking unit, used for integrating the tenant-specific parsers of a plurality of tenants with the system-level parser.
3. The intelligent-network-card-based multi-tenant management system of claim 1, wherein the runtime component comprises:
a runtime memory allocator, used for dynamically adjusting the memory resources required by the programs to be offloaded according to their running conditions.
4. The intelligent-network-card-based multi-tenant management system of claim 3, wherein the runtime memory allocator comprises:
a register array unit, used for storing the usage state of the memory resources on the intelligent network card;
a resource allocation table unit, used for recording the current memory resource allocation of the programs to be offloaded;
and a memory reallocation unit, used for reallocating memory resources on the intelligent network card in cooperation with the configurable scheduler.
5. The intelligent-network-card-based multi-tenant management system of claim 1, wherein
the enhanced dRMT pipeline integrates an RDMA engine unit and a DMA engine unit.
6. The intelligent-network-card-based multi-tenant management system of claim 1, wherein the configurable scheduler comprises a RAM read-write unit, a logic scheduling array unit, and an on-chip cache unit;
the RAM read-write unit is used for writing the data packets of the program to be offloaded into the on-chip cache unit, and for storing, into the logic scheduling array unit, the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the logic scheduling array unit comprises a plurality of scheduling queues and is used for sorting and buffering the tasks in the scheduling queues according to the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component, and for transmitting the data packets of the program to be offloaded to the corresponding computing units through the on-chip interconnection network when the computing units corresponding to the scheduling queues are idle;
and the on-chip cache unit is used for storing the data packets of the program to be offloaded.
7. The intelligent-network-card-based multi-tenant management system of claim 1, wherein
the on-chip interconnection network adopts a shared-memory crossbar internal interconnection network.
8. The intelligent-network-card-based multi-tenant management system of claim 7, wherein the on-chip interconnection network comprises:
a cache logic unit, used for receiving the data packets of the program to be offloaded from the enhanced dRMT pipeline and the configurable scheduler, and for sending them to the corresponding computing unit;
a flow control logic unit, used for managing the flow control of the intelligent network card;
a scheduling logic unit, used for determining, according to a cache scheduling algorithm, how the data packets of the program to be offloaded are processed, including buffer-queue information;
and a switching network unit, used for establishing communication connections with the enhanced dRMT pipeline, the configurable scheduler, the computing unit, and the host communication unit, respectively, to dynamically transmit the data packets of the program to be offloaded.
9. A multi-tenant management method based on an intelligent network card, applied to the management system described above, the management system comprising tenants, the intelligent network card, and a host, the intelligent network card comprising a compile-time component, a runtime component, an enhanced dRMT (disaggregated Reconfigurable Match Table) pipeline, a configurable scheduler, an on-chip interconnection network, a computing unit, and a host communication unit, wherein the enhanced dRMT pipeline, the configurable scheduler, the computing unit, and the host communication unit are all connected to the on-chip interconnection network; the method being characterized by comprising the following steps:
the compile-time component pre-configures shared resources on the intelligent network card according to the respective demand information of the programs to be offloaded of a plurality of tenants;
the runtime component dynamically adjusts the shared resources required by the program to be offloaded according to its running condition;
when a data packet of the program to be offloaded is transmitted from a tenant to the enhanced dRMT pipeline, the enhanced dRMT pipeline parses the data packet, generates tenant-programmed packet configuration information through match-action table processing, and transmits the parsed data packet to the configurable scheduler through the on-chip interconnection network;
after receiving the data of the program to be offloaded, the configurable scheduler transmits the data packet, through the on-chip interconnection network, to the corresponding computing unit for processing, according to the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the computing unit transmits the processed data packet of the program to be offloaded to the host communication unit through the on-chip interconnection network;
and after receiving the processed data packet of the program to be offloaded, the host communication unit transmits it to the host.
10. The intelligent-network-card-based multi-tenant management method of claim 9, wherein the configurable scheduler comprises a RAM read-write unit, a logic scheduling array unit, and an on-chip cache unit;
after the configurable scheduler receives the data of the program to be offloaded, the RAM read-write unit writes the data packet of the program to be offloaded into the on-chip cache unit, and stores, into the logic scheduling array unit, the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component;
the logic scheduling array unit comprises a plurality of scheduling queues, each corresponding to one service type and to one computing unit; the logic scheduling array unit sorts and buffers the tasks in the scheduling queues according to the shared resource information pre-configured by the compile-time component for the program to be offloaded and the shared resource information dynamically adjusted by the runtime component, and, when the computing unit corresponding to a scheduling queue is detected to be idle, transmits the data packets of the program to be offloaded to that computing unit through the on-chip interconnection network.
CN202111281994.5A 2021-11-01 2021-11-01 Multi-tenant management system and method based on intelligent network card Pending CN114064237A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111281994.5A CN114064237A (en) 2021-11-01 2021-11-01 Multi-tenant management system and method based on intelligent network card


Publications (1)

Publication Number Publication Date
CN114064237A true CN114064237A (en) 2022-02-18

Family

ID=80236489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111281994.5A Pending CN114064237A (en) 2021-11-01 2021-11-01 Multi-tenant management system and method based on intelligent network card

Country Status (1)

Country Link
CN (1) CN114064237A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140359113A1 (en) * 2013-05-30 2014-12-04 Sap Ag Application level based resource management in multi-tenant applications
CN109324900A (en) * 2012-09-12 2019-02-12 萨勒斯福斯通讯有限公司 For the message queue in on-demand service environment based on the resource-sharing bidded
CN109739633A (en) * 2019-01-08 2019-05-10 深圳市网心科技有限公司 A kind of shared management of computing method and relevant apparatus
CN113190529A (en) * 2021-04-29 2021-07-30 电子科技大学 Multi-tenant data sharing storage system suitable for MongoDB database


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAO Lei; YANG Yan; ZHONG Hua; YU Jinwei: "A multi-tenant-oriented portal resource management framework", 计算机工程与设计 (Computer Engineering and Design), no. 08, 16 August 2012 (2012-08-16) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination