CN112416572A - Resource pooling method, system, terminal and storage medium for memory cloud platform - Google Patents

Publication number
CN112416572A
CN112416572A
Authority
CN
China
Prior art keywords: memory, direct, service node, acceleration, card
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011148251.6A
Other languages
Chinese (zh)
Other versions
CN112416572B (en)
Inventor
许溢允
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202011148251.6A
Publication of CN112416572A
Application granted
Publication of CN112416572B
Legal status: Active (granted)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00Network arrangements, protocols or services for addressing or naming
    • H04L61/50Address allocation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00Indexing scheme associated with group H04L61/00
    • H04L2101/60Types of network addresses
    • H04L2101/618Details of network addresses
    • H04L2101/622Layer-2 addresses, e.g. medium access control [MAC] addresses

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a resource pooling method, system, terminal and storage medium for a memory cloud platform. The method comprises the following steps: establishing an Ethernet communication link between service nodes and memories, wherein each memory comprises a control board card and accelerator cards; collecting the network addresses of all service nodes and all memories in a cluster and generating an address list, wherein the cluster comprises a plurality of service nodes and a plurality of memories; and distributing the address list to the service nodes of the cluster. With the invention, an application that needs acceleration can transmit data to an accelerator card in either of two modes: through memory or through the MAC. Moreover, the memory resources that a user can be allocated are not limited by the host, so memory resources can be allocated and deployed more flexibly and can dock seamlessly with the existing server cloud ecosystem.

Description

Resource pooling method, system, terminal and storage medium for memory cloud platform
Technical Field
The invention relates to the technical field of cloud platforms, and in particular to a resource pooling method, system, terminal and storage medium for a memory cloud platform.
Background
The memory heterogeneous accelerator card uses the high-speed storage capability of memory to perform in-storage computation on source operands sent by the CPU and returns the computed results to the CPU, thereby providing high-performance storage-compute capability. As memory heterogeneous accelerator cards find wider application in cloud data centers, memory accelerator cards have begun to be deployed at large scale. The existing deployment mode generally adopts machine-card binding: each memory accelerator card is plugged directly into a standard bus interface of a server through a memory slot. When a user applies for a memory instance, the user is generally allocated a set of virtual machine environments, and the user accesses and uses the board card from within the virtual machine.
At present, domestic memory cloud service vendors almost all adopt single-machine single-card or single-machine multi-card binding, i.e., one card or several cards are inserted into one server. Under this machine-card binding mode, the memory is tightly coupled with the CPU, and a user can only access the memory card through the CPU on the host side. The memory board cards that each user can be allocated are limited by the number of bound board cards, there is no direct data communication link between board cards, and if board cards need to communicate, data must be forwarded through the CPU.
The machine-card binding architecture described above tightly couples the server with the memory: adding board cards requires matching servers, and a memory board card cannot be scheduled separately from its CPU. Moreover, because there is no link for direct communication between board cards, the flexible deployment requirements of services cannot be met, and an effective distributed acceleration architecture cannot be formed. Machine-card binding essentially keeps memory board cards isolated from one another and does not achieve resource pooling of memories. If board cards need to communicate, the memory cards must exchange data in a soft-switch manner over the memory bus topology on the host; in this mode a large amount of CPU resources is consumed, partially offsetting the efficiency gains brought by memory acceleration.
Disclosure of Invention
In view of the above deficiencies of the prior art, the present invention provides a resource pooling method, system, terminal and storage medium for a memory cloud platform, so as to solve the above technical problems.
In a first aspect, the present invention provides a method for pooling resources of a memory cloud platform, including:
connecting a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
adding the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
generating a network address for the memory disk cabinet, and storing the network address of the memory disk cabinet in an address list;
and allocating memory resources to each service node according to the traffic of each service node in the cluster, and issuing the corresponding network addresses to each service node according to the resource allocation scheme.
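To make the four steps concrete, the following minimal Python sketch models them; all names (MemoryShelf, ServiceNode, the address format) are illustrative assumptions and are not taken from the patent itself:

    from dataclasses import dataclass, field

    @dataclass
    class MemoryShelf:
        """A 'memory disk cabinet': one control board card plus several accelerator cards."""
        mac: str                  # network address generated for the shelf (step 3)
        num_cards: int

    @dataclass
    class ServiceNode:
        """A server with a CPU and a direct-connection memory on its memory bus."""
        name: str
        traffic: float            # traffic volume used for allocation (step 4)
        direct_cards: int         # accelerator cards in the direct-connection memory
        optional_addrs: list = field(default_factory=list)

    def register_shelf(shelf: MemoryShelf, address_list: list) -> None:
        # Step 3: record the shelf's network address in the cluster address list.
        address_list.append(shelf.mac)

    def distribute_addresses(address_list: list, nodes: list) -> None:
        # Step 4, simplified: issue the addresses to every service node; a real
        # allocator would filter the list according to each node's allocation.
        for node in nodes:
            node.optional_addrs = list(address_list)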
Further, the method further comprises:
the direct-connection control board card is communicatively connected with remote memories through an Ethernet interface, wherein the remote memories comprise memory disk cabinets and the direct-connection memories of other service nodes;
after the direct-connection control board card receives acceleration data sent by the CPU, judging whether a local direct-connection accelerator card is in an idle state:
if so, the direct-connection control board card sends the acceleration data to the direct-connection accelerator card;
if not, the direct-connection control board card encapsulates the acceleration data into a data packet and sends the data packet to a remote memory.
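The idle-check dispatch above can be sketched as a single routing function; the callback names (send_to_card, forward_remote) and the is_idle flag are assumptions for illustration, not a real driver API:

    from collections import namedtuple

    Card = namedtuple("Card", ["card_id", "is_idle"])

    def dispatch(accel_data, local_cards, send_to_card, forward_remote):
        """Route acceleration data received from the CPU: prefer an idle local
        direct-connection accelerator card, otherwise forward over Ethernet."""
        idle = next((card for card in local_cards if card.is_idle), None)
        if idle is not None:
            send_to_card(idle, accel_data)   # local path: memory bus to the idle card
            return "local"
        forward_remote(accel_data)           # remote path: encapsulated MAC packet
        return "remote"

    # With no idle local card, the data takes the Ethernet path.
    assert dispatch(b"x", [Card(0, False)], print, lambda d: None) == "remote"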
Further, the allocating memory resources to each service node according to the traffic of each service node in the cluster includes:
allocating a corresponding number of accelerator cards to a service node according to the traffic of the service node in the cluster;
and issuing the network addresses of the memories allocated to the service node as optional addresses to the service node, and recording the optional addresses in a register configuration file of the direct-connection memory of the service node.
Further, the allocating a corresponding number of accelerator cards to the service nodes according to the traffic of the service nodes in the cluster includes:
calculating the average cluster traffic and the average allocation number of accelerator cards;
calculating the ratio of the actual traffic of the target service node to the average traffic;
and calculating the product of the average allocation number and the ratio, and taking the product as the allocation number for the target service node.
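In code, the three calculations reduce to a few lines; rounding the product to a whole card count is an added assumption, since the method does not state how fractional results are handled:

    def allocation_count(node_traffic, total_traffic, total_cards, num_nodes):
        avg_traffic = total_traffic / num_nodes    # average cluster traffic
        avg_cards = total_cards / num_nodes        # average allocation number
        ratio = node_traffic / avg_traffic         # node traffic relative to the average
        return round(avg_cards * ratio)            # product = node's allocation number

    # With an average allocation of 2 cards, a node at twice the average
    # traffic receives 4 cards (the numbers used in the embodiment below).
    assert allocation_count(node_traffic=2.0, total_traffic=4.0,
                            total_cards=8, num_nodes=4) == 4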
Further, the allocating a corresponding number of accelerator cards to the service nodes according to the traffic of the service nodes in the cluster includes:
acquiring the number of direct-connection accelerator cards of the target service node;
judging whether the number of local direct-connection accelerator cards is less than the number to be allocated:
if so, allocating all the direct-connection accelerator cards of the target service node to it, and additionally allocating remote accelerator cards equal in number to the difference between the number to be allocated and the number of direct-connection accelerator cards;
if not, selecting the required number of direct-connection accelerator cards from the direct-connection accelerator cards and allocating them to the target service node, and marking the unselected direct-connection accelerator cards as unallocated accelerator cards.
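The two branches amount to a local/remote split, sketched here under the assumption that only card counts matter at this stage:

    def split_allocation(direct_cards, need):
        """Return (local, remote): how many cards come from the node's own
        direct-connection memory and how many from remote memories."""
        if direct_cards < need:
            return direct_cards, need - direct_cards  # all local cards plus the shortfall
        return need, 0                                # enough local cards; rest stay unallocated

    assert split_allocation(direct_cards=1, need=4) == (1, 3)   # the embodiment's case
    assert split_allocation(direct_cards=3, need=2) == (2, 0)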
Further, the method further comprises:
judging whether the accelerator cards of the memory disk cabinets have all been allocated:
and if so, allocating the unallocated direct-connection accelerator cards of other service nodes.
In a second aspect, the present invention provides a system for pooling resources of a memory cloud platform, comprising:
the disk cabinet establishing unit is configured to connect a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
the memory interconnection unit is configured to add the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
the address generating unit is configured to generate a network address for the memory disk cabinet and store the network address of the memory disk cabinet in the address list;
and the resource allocation unit is configured to allocate memory resources to each service node according to the traffic of each service node in the cluster, and to issue the corresponding network addresses to each service node according to the resource allocation scheme.
Further, the system further comprises:
the node setting unit is configured to communicatively connect the direct-connection control board card with remote memories through an Ethernet interface, wherein the remote memories comprise memory disk cabinets and the direct-connection memories of other service nodes;
the state judgment unit is configured to judge whether a local direct-connection accelerator card is in an idle state after the direct-connection control board card receives acceleration data sent by the CPU;
the direct-sending unit is configured to send the acceleration data to the direct-connection accelerator card if the direct-connection accelerator card is in an idle state;
and the data forwarding unit is configured to encapsulate the acceleration data into a data packet and send the data packet to a remote memory if no direct-connection accelerator card is in an idle state.
In a third aspect, a terminal is provided, including:
a processor and a memory, wherein
the memory is used for storing a computer program, and
the processor is used for calling and running the computer program from the memory, so that the terminal executes the method described above.
In a fourth aspect, a computer storage medium is provided having stored therein instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.
The beneficial effects of the invention are as follows.
according to the resource pooling method, system, terminal and storage medium for the Memory cloud platform, provided by the invention, the accelerator cards in the Memory cloud platform are interconnected through the MAC, so that on one hand, a machine-card binding form is reserved, and on the other hand, a BOX OF Memory mode is introduced. Various types OF Memory accelerator cards (including Intel chips and Memory manufacturer chips) in the BOX OF Memory carry out data interaction through MAC interfaces, so that the close coupling OF the Memory and the CPU is decoupled. And the accelerator card in the BOX OF Memory is also interconnected with other accelerator cards through MAC. Under the above structure, the application needing acceleration can transmit data to the acceleration card by two modes, namely memory or MAC. And the memory resources which can be divided by the user are not limited by the host, so that the memory resources can be more flexibly distributed and deployed and can be seamlessly butted with the existing server cloud ecological environment.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.
FIG. 2 is a schematic block diagram of a system of one embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present invention.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention. The execution subject in fig. 1 may be a resource pooling system of a memory cloud platform.
As shown in fig. 1, the method includes:
step 110, connecting a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
step 120, adding the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
step 130, generating a network address for the memory disk cabinet, and storing the network address of the memory disk cabinet in the address list;
step 140, allocating memory resources to each service node according to the traffic of each service node in the cluster, and issuing the corresponding network addresses to each service node according to the resource allocation scheme.
To facilitate understanding of the present invention, the resource pooling method for the memory cloud platform is further described below by combining its principle with the resource pooling process of a memory cloud platform in an embodiment.
Specifically, the resource pooling method of the memory cloud platform comprises the following steps:
s1, constructing a connection structure of the memories in the cluster.
Each service node comprises a CPU and a direct-connection memory; the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus. The direct-connection control board card is communicatively connected with remote memories through an Ethernet interface (MAC interface); a remote memory is any memory in the cluster other than the local direct-connection memory.
The cloud platform cluster comprises a plurality of service nodes; each service node is provided with a direct-connection memory, and the direct-connection memories of different service nodes are communicatively connected through MAC links. The cluster further comprises a plurality of memory disk cabinets; each memory disk cabinet comprises a plurality of accelerator cards and a control board card. After the accelerator cards are inserted into the disk cabinet, they are connected to the control board card by wire, and the control board card is communicatively connected with the other memories in the cluster through a MAC interface.
S2, allocating the memory resources.
The cluster management node collects the network addresses of all memories in the cluster (including direct-connection and non-direct-connection memories) and stores all the network addresses in an address list.
The management node acquires the traffic of each service node from the task allocation list and allocates a corresponding number of accelerator cards to each service node according to its traffic. The specific allocation method is as follows. First, calculate the average cluster traffic (the total traffic divided by the number of service nodes) and the average allocation number of accelerator cards (the total number of memories divided by the number of service nodes; two in this embodiment). Then calculate the ratio of the target service node's traffic to the average. If the traffic of the target service node is twice the average, four memories are allocated to it. Suppose the number of direct-connection memories of the target service node is 1; then 3 more memories must be allocated, so the network address of a memory disk cabinet containing 3 available accelerator cards is preferentially sent to the target service node. If the memory disk cabinets are fully allocated, the idle direct-connection memories of other service nodes are allocated to the target service node instead, i.e., the network addresses of those service nodes are sent to it. The target service node records the received network addresses in the register of its direct-connection memory.
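The source-selection preference just described (first a disk cabinet that can cover the need, then idle direct-connection memories of other service nodes) might be sketched as follows; the (address, free-card-count) pair representation is an assumption:

    def pick_remote_sources(remote_need, shelves, idle_direct):
        """shelves / idle_direct: lists of (mac_address, free_card_count) pairs."""
        for mac, free in shelves:
            if free >= remote_need:
                return [mac]                # prefer one shelf that covers the whole need
        chosen = []
        for mac, free in idle_direct:       # shelves exhausted: borrow idle direct
            take = min(free, remote_need)   # cards from other service nodes
            chosen += [mac] * take
            remote_need -= take
            if remote_need == 0:
                break
        return chosen

    # The embodiment's case: 3 remote cards needed, one shelf with 3 free cards.
    assert pick_remote_sources(3, [("shelf-a", 3)], []) == ["shelf-a"]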
The service node's CPU sends acceleration data to the direct-connection memory through the memory bus. The control board card of the direct-connection memory judges whether any direct-connection accelerator card is in an idle state. If so, it sends the acceleration data to that accelerator card through the memory bus. If not, it encapsulates the acceleration data into a data packet via the Ethernet switch; the packet header carries the destination MAC address, the source MAC address, the payload length, and other fields. The control board card then takes a target address from the network addresses stored in the register and sends the data packet to the target address over the MAC link. The remote memory at the target address decodes the data packet and performs the accelerated operation, and the operation result is packed into a MAC data packet and returned to the source service node.
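The packet handling on both ends can be illustrated with a small framing sketch; the exact header layout (6-byte destination MAC, 6-byte source MAC, 2-byte length) is an assumption consistent with, but not specified by, the description above:

    import struct

    HEADER = struct.Struct("!6s6sH")   # destination MAC, source MAC, payload length

    def pack_frame(dst_mac, src_mac, accel_data):
        """Sender side: encapsulate acceleration data for the MAC link."""
        return HEADER.pack(dst_mac, src_mac, len(accel_data)) + accel_data

    def unpack_frame(frame):
        """Receiver side: decode the packet before running the accelerated operation."""
        dst_mac, src_mac, length = HEADER.unpack_from(frame)
        payload = frame[HEADER.size:HEADER.size + length]
        return dst_mac, src_mac, payload

    # Round trip; the remote memory would compute on `payload` and return the
    # result to `src_mac` in another MAC frame.
    frame = pack_frame(b"\x02" * 6, b"\x04" * 6, b"accel-data")
    assert unpack_frame(frame)[2] == b"accel-data"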
As shown in FIG. 2, the system 200 includes:
a disk cabinet establishing unit 210, configured to connect a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
a memory interconnection unit 220, configured to add the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
an address generating unit 230, configured to generate a network address for the memory disk cabinet and store the network address of the memory disk cabinet in the address list;
and a resource allocation unit 240, configured to allocate memory resources to each service node according to the traffic of each service node in the cluster, and to issue the corresponding network addresses to each service node according to the resource allocation scheme.
Optionally, as an embodiment of the present invention, the system further includes:
the node setting unit is configured to communicatively connect the direct-connection control board card with remote memories through an Ethernet interface, wherein the remote memories comprise memory disk cabinets and the direct-connection memories of other service nodes;
the state judgment unit is configured to judge whether a local direct-connection accelerator card is in an idle state after the direct-connection control board card receives acceleration data sent by the CPU;
the direct-sending unit is configured to send the acceleration data to the direct-connection accelerator card if the direct-connection accelerator card is in an idle state;
and the data forwarding unit is configured to encapsulate the acceleration data into a data packet and send the data packet to a remote memory if no direct-connection accelerator card is in an idle state.
Fig. 3 is a schematic structural diagram of a terminal 300 according to an embodiment of the present invention, where the terminal 300 may be used to execute the method for pooling resources of the memory cloud platform according to the embodiment of the present invention.
The terminal 300 may include a processor 310, a memory 320, and a communication unit 330. These components communicate via one or more buses. Those skilled in the art will appreciate that the server architecture shown in the figure is not limiting: it may be a bus architecture or a star architecture, and it may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 320 may be used for storing instructions to be executed by the processor 310, and may be implemented by any type of volatile or non-volatile storage terminal or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk. When executed by the processor 310, the executable instructions in the memory 320 enable the terminal 300 to perform some or all of the steps in the method embodiments described above.
The processor 310 is the control center of the storage terminal; it connects the various parts of the entire electronic terminal using various interfaces and lines, and performs the various functions of the electronic terminal and/or processes data by running or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory. The processor may be composed of integrated circuits (ICs), for example a single packaged IC or multiple packaged ICs with the same or different functions. For example, the processor 310 may include only a central processing unit (CPU). In the embodiment of the present invention, the CPU may have a single computing core or include multiple computing cores.
The communication unit 330 is configured to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to them.
The present invention also provides a computer storage medium, which may store a program; when executed, the program may perform some or all of the steps in the embodiments provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Therefore, with the accelerator cards in the memory cloud platform interconnected through MAC, the machine-card binding form is retained on one hand, and a BOX OF Memory mode is introduced on the other. The various types of memory accelerator cards in the BOX OF Memory (including Intel chips and chips from memory manufacturers) exchange data through MAC interfaces, decoupling the tight coupling between the memory and the CPU, and the accelerator cards in the BOX OF Memory are also interconnected with other accelerator cards through MAC. Under this structure, an application that needs acceleration can transmit data to an accelerator card either through memory or through MAC. Moreover, the memory resources that a user can be allocated are not limited by the host, so memory resources can be allocated and deployed more flexibly and dock seamlessly with the existing server cloud ecosystem.
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention may be embodied in the form of a software product stored in a storage medium capable of storing program code, such as a USB disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and include instructions for enabling a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, or the like) to perform all or part of the steps of the methods in the embodiments of the present invention.
The same and similar parts in the various embodiments in this specification may be referred to each other. Especially, for the terminal embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the description in the method embodiment.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described system embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, systems or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
Although the present invention has been described in detail with reference to the drawings and the preferred embodiments, the present invention is not limited thereto. Various equivalent modifications or substitutions that a person skilled in the art can easily conceive within the technical scope of the present invention may be made to the embodiments without departing from the spirit and scope of the invention, and such modifications or substitutions fall within that scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for pooling resources of a memory cloud platform, comprising:
connecting a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
adding the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
generating a network address for the memory disk cabinet, and storing the network address of the memory disk cabinet in an address list;
and allocating memory resources to each service node according to the traffic of each service node in the cluster, and issuing the corresponding network addresses to each service node according to the resource allocation scheme.
2. The method of claim 1, further comprising:
the direct-connection control board card is communicatively connected with remote memories through an Ethernet interface, wherein the remote memories comprise memory disk cabinets and the direct-connection memories of other service nodes;
after the direct-connection control board card receives acceleration data sent by the CPU, judging whether a local direct-connection accelerator card is in an idle state:
if so, the direct-connection control board card sends the acceleration data to the direct-connection accelerator card;
if not, the direct-connection control board card encapsulates the acceleration data into a data packet and sends the data packet to a remote memory.
3. The method of claim 2, wherein allocating memory resources for each service node based on traffic of each service node within the cluster comprises:
allocating a corresponding number of accelerator cards to a service node according to the traffic of the service node in the cluster;
and issuing the network addresses of the memories allocated to the service node as optional addresses to the service node, and recording the optional addresses in a register configuration file of the direct-connection memory of the service node.
4. The method of claim 3, wherein the allocating a corresponding number of accelerator cards to service nodes in the cluster according to their traffic comprises:
calculating the average cluster traffic and the average allocation number of accelerator cards;
calculating the ratio of the actual traffic of the target service node to the average traffic;
and calculating the product of the average allocation number and the ratio, and taking the product as the allocation number for the target service node.
5. The method of claim 4, wherein the allocating a corresponding number of accelerator cards to service nodes in a cluster according to their traffic comprises:
acquiring the number of direct-connection accelerator cards of the target service node;
judging whether the number of local direct-connection accelerator cards is less than the number to be allocated:
if so, allocating all the direct-connection accelerator cards of the target service node to it, and additionally allocating remote accelerator cards equal in number to the difference between the number to be allocated and the number of direct-connection accelerator cards;
if not, selecting the required number of direct-connection accelerator cards from the direct-connection accelerator cards and allocating them to the target service node, and marking the unselected direct-connection accelerator cards as unallocated accelerator cards.
6. The method of claim 5, further comprising:
judging whether the accelerator cards of the memory disk cabinets have all been allocated:
and if so, allocating the unallocated direct-connection accelerator cards of other service nodes.
7. A system for pooling resources of a memory cloud platform, comprising:
the disk cabinet establishing unit is configured to connect a plurality of accelerator cards to a control board card, wherein the control board card and the accelerator cards form a memory disk cabinet;
the memory interconnection unit is configured to add the memory disk cabinet to the cluster by interconnecting the control board card with service nodes in the cluster over Ethernet, wherein each service node comprises a CPU and a direct-connection memory, the direct-connection memory comprises a direct-connection accelerator card and a direct-connection control board card, and the CPU is connected with the direct-connection memory through a memory bus;
the address generating unit is configured to generate a network address for the memory disk cabinet and store the network address of the memory disk cabinet in the address list;
and the resource allocation unit is configured to allocate memory resources to each service node according to the traffic of each service node in the cluster, and to issue the corresponding network addresses to each service node according to the resource allocation scheme.
8. The system of claim 7, further comprising:
the node setting unit is configured to communicatively connect the direct-connection control board card with remote memories through an Ethernet interface, wherein the remote memories comprise memory disk cabinets and the direct-connection memories of other service nodes;
the state judgment unit is configured to judge whether a local direct-connection accelerator card is in an idle state after the direct-connection control board card receives acceleration data sent by the CPU;
the direct-sending unit is configured to send the acceleration data to the direct-connection accelerator card if the direct-connection accelerator card is in an idle state;
and the data forwarding unit is configured to encapsulate the acceleration data into a data packet and send the data packet to a remote memory if no direct-connection accelerator card is in an idle state.
9. A terminal, comprising:
a processor;
a memory for storing instructions for execution by the processor;
wherein the processor is configured to perform the method of any one of claims 1-6.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202011148251.6A, filed 2020-10-23 (priority date 2020-10-23): Resource pooling method, system, terminal and storage medium for memory cloud platform. Status: Active; granted as CN112416572B.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011148251.6A CN112416572B (en) 2020-10-23 2020-10-23 Resource pooling method, system, terminal and storage medium for memory cloud platform

Publications (2)

Publication Number Publication Date
CN112416572A 2021-02-26
CN112416572B 2022-12-23

Family

ID=74841884

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011148251.6A Active CN112416572B (en) 2020-10-23 2020-10-23 Resource pooling method, system, terminal and storage medium for memory cloud platform

Country Status (1)

Country Link
CN (1) CN112416572B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490047A (en) * 2022-01-12 2022-05-13 北京科技大学 Heterogeneous data transmission method for nuclear fuel fission gas cluster dynamic simulation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110618956A (en) * 2019-08-01 2019-12-27 苏州浪潮智能科技有限公司 BMC cloud platform resource pooling method and system
CN111752705A (en) * 2020-05-28 2020-10-09 苏州浪潮智能科技有限公司 MCU cloud platform resource pooling system

Also Published As

Publication number Publication date
CN112416572B (en) 2022-12-23

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant