CN114661465A - Resource management method, device, storage medium and electronic equipment - Google Patents

Resource management method, device, storage medium and electronic equipment

Info

Publication number
CN114661465A
CN114661465A
Authority
CN
China
Prior art keywords
gpu
target
node
resource
service node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210266699.0A
Other languages
Chinese (zh)
Inventor
沈标标
陈友旭
邹懋
陈飞
王鲲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vita Technology Beijing Co ltd
Original Assignee
Vita Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vita Technology Beijing Co ltd filed Critical Vita Technology Beijing Co ltd
Priority to CN202210266699.0A priority Critical patent/CN114661465A/en
Publication of CN114661465A publication Critical patent/CN114661465A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources

Abstract

The disclosure relates to a resource management method, a resource management apparatus, a storage medium and an electronic device. A service node on which GPU resources are deployed can receive a resource request message sent by a client node, the resource request message requesting use of the GPU resources on the service node, where the client node and the service node are both deployed on a target host. A target GPU resource is determined from the GPU resources according to the resource request message; upon receiving a business operation instruction sent by the client node, the business operation instruction is executed according to the target GPU resource, and the instruction execution result is sent to the client node. The service node is pre-established by creating a target virtual machine on the target host and taking the target virtual machine as the service node, the system environment of the target virtual machine being the same as that of the client node, and by passing the GPU resources on the target host through to the service node.

Description

Resource management method, device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of resource management, and in particular, to a method, an apparatus, a storage medium, and an electronic device for resource management.
Background
With the continuous evolution of computer architecture, the GPU (Graphics Processing Unit) has, owing to its efficient many-core computing capability, gradually been applied to general-purpose computing, for example in numerical analysis, three-dimensional modeling and gaming scenarios. In the related art, to relieve the limitation of a terminal's local GPU resources, GPU resources deployed on a remote server may be used; such a server is configured with hardware devices such as hard disks and GPUs. However, the server generally runs a Linux system, and its service environment is built on the Linux platform. To provide a Windows access environment, a plurality of Windows virtual machines are usually constructed as client nodes on a single Linux server (host) to provide Windows services to users.
As described above, GPU resources are needed in scenarios such as gaming and modeling, but the GPU hardware is deployed on the Linux server (i.e., the host), and a Windows virtual machine cannot directly use the GPU resources on the Linux host. In the existing GPU pass-through scheme, a Windows virtual machine builds its GPU environment from the GPU resources passed through to it, but a GPU card passed through from the host cannot be used by any other Windows virtual machine, which reduces GPU resource utilization.
Disclosure of Invention
The purpose of the present disclosure is to provide a method, an apparatus, a storage medium and an electronic device for resource management.
In a first aspect, the present disclosure provides a resource management method, applied to a service node, where a GPU resource is deployed on the service node, where the method includes:
receiving a resource request message sent by a client node, wherein the resource request message is used for requesting to use the GPU resource on the service node, and the client node and the service node are both deployed on a target host;
determining a target GPU resource from the GPU resources according to the resource request message;
under the condition of receiving a business operation instruction sent by the client node, executing the business operation instruction according to the target GPU resource;
sending an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node;
and passing the GPU resources on the target host machine through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node; the determining a target GPU resource from the GPU resources according to the resource request message comprises: and performing dynamic resource allocation on the GPU resources according to the target GPU computing resource request message and the target GPU storage resource request message to obtain the target GPU resources.
Optionally, there may be one or more service nodes, and in the case of multiple service nodes, different service nodes deploy GPU resources corresponding to different service scenarios.
In a second aspect, the present disclosure provides a resource management method, which is applied to a management node, where the management node is configured to manage GPU resources deployed on a service node; the method comprises the following steps:
receiving a resource request message sent by a client node, wherein the resource request message is used for requesting to use the GPU resource on the service node;
determining a target GPU according to the resource request message, wherein the target GPU is a GPU allocated by the management node to provide GPU resources for the current service of the client node;
sending the identification information of the target GPU to the client node so that the client node determines a target service node according to the identification information and sends a business operation instruction to the target service node so that the target service node executes the business operation instruction based on the target GPU and returns an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on a target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine;
and passing the GPU resources on the target host machine through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required for the current service of the client node; the determining a target GPU according to the resource request message comprises: performing dynamic resource allocation on the GPU resources managed by the management node according to the target GPU computing resource request message and the target GPU storage resource request message to obtain target GPU resources; and determining the target GPU according to the target GPU resources.
Optionally, before receiving the resource request message sent by the client node, the method further includes: and receiving a resource reporting message sent by the service node, wherein the resource reporting message comprises GPU resource information deployed on the service node.
In a third aspect, the present disclosure provides a resource management apparatus, applied to a service node, where a GPU resource is deployed in the service node, the apparatus including:
a first receiving module, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node, and both the client node and the service node are deployed on a target host;
the first determining module is used for determining target GPU resources from the GPU resources according to the resource request message;
the instruction execution module is used for executing the business operation instruction according to the target GPU resource under the condition of receiving the business operation instruction sent by the client node;
a first sending module, configured to send an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node; and passing the GPU resources on the target host machine through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node; the first determining module is configured to perform dynamic resource allocation on the GPU resource according to the target GPU computing resource request message and the target GPU storage resource request message, to obtain the target GPU resource.
Optionally, there may be one or more service nodes, and in the case of multiple service nodes, different service nodes deploy GPU resources corresponding to different service scenarios.
In a fourth aspect, the present disclosure provides a resource management apparatus, which is applied to a management node, where the management node is configured to manage GPU resources deployed on a service node; the device comprises:
a second receiving module, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node;
a second determining module, configured to determine, according to the resource request message, a target GPU, where the target GPU is a GPU that is allocated by the management node and provides GPU resources for a current service of the client node;
the second sending module is used for sending the identification information of the target GPU to the client node so that the client node can determine a target service node according to the identification information and send a business operation instruction to the target service node so that the target service node can execute the business operation instruction based on the target GPU and return an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on a target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine; and passing the GPU resources on the target host machine through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node; the second determining module is configured to perform dynamic resource allocation on the GPU resources managed by the management node according to the target GPU computing resource request message and the target GPU storage resource request message, so as to obtain target GPU resources; and determining the target GPU according to the target GPU resources.
Optionally, the apparatus further comprises:
and a third receiving module, configured to receive a resource report message sent by the service node, where the resource report message includes GPU resource information deployed on the service node.
In a fifth aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect of the present disclosure.
In a sixth aspect, the present disclosure provides an electronic device comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of the first aspect of the disclosure.
In a seventh aspect, the present disclosure provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the second aspect of the present disclosure.
In an eighth aspect, the present disclosure provides an electronic device comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of the second aspect of the disclosure.
According to the technical scheme, a service node on which GPU resources are deployed receives a resource request message sent by a client node, the resource request message requesting use of the GPU resources on the service node, where the client node and the service node are both deployed on a target host. A target GPU resource is determined from the GPU resources according to the resource request message; upon receiving a business operation instruction sent by the client node, the business operation instruction is executed according to the target GPU resource, and the instruction execution result is sent to the client node. The service node is pre-established by creating a target virtual machine on the target host, taking the target virtual machine as the service node, the system environment of the target virtual machine being the same as that of the client node, and passing the GPU resources on the target host through to the service node. This guarantees platform consistency between the client node and the service node, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations. Further, on this premise, the client node can dynamically apply to the service node for GPU resources, and the service node can perform unified allocation and management of all GPU resources on the target host and support dynamic splitting of computing power, thereby improving GPU resource utilization.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a schematic diagram of a prior art GPU pass-through scheme architecture;
FIG. 2 is an architectural diagram of a prior art GPU pooling technique;
FIG. 3 is a flow diagram illustrating a method of resource management in accordance with an exemplary embodiment;
FIG. 4 is a diagram illustrating a remote GPU resource sharing architecture in accordance with an illustrative embodiment;
FIG. 5 is a flow diagram illustrating a method of resource management in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating a first resource management apparatus according to an exemplary embodiment;
FIG. 7 is a block diagram illustrating a second resource management apparatus according to an exemplary embodiment;
FIG. 8 is a block diagram illustrating a third resource management apparatus according to an exemplary embodiment;
FIG. 9 is a block diagram illustrating the structure of an electronic device in accordance with an exemplary embodiment;
fig. 10 is a block diagram illustrating a structure of yet another electronic device according to an example embodiment.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
It should be noted that all actions of acquiring signals, information or data in the present disclosure are performed under the premise of complying with the corresponding data protection regulation policy of the country of the location and obtaining the authorization given by the owner of the corresponding device.
GPU resources are needed in scenarios such as gaming and modeling, but the GPU hardware is deployed on a Linux host, and a Windows virtual machine cannot directly use the GPU resources on the Linux host. As shown in fig. 1, the existing GPU pass-through scheme can pass one or more GPU cards on the host through into a Windows virtual machine, which then builds its GPU environment from those resources. However, a GPU card passed through from the host cannot be used by any other Windows virtual machine; for example, virtual machine A cannot use idle GPU resources assigned to virtual machine B, which wastes GPU resources and reduces GPU resource utilization.
To improve resource utilization, as shown in fig. 2, the related art integrates the GPU resources on a host through GPU pooling, with GPU pool nodes managed uniformly through a virtual GPU pool. A client node dynamically applies to a GPU pool node for GPU resources, so dynamic application of computing power can be supported; however, GPU pooling based on API (Application Programming Interface) hijacking requires the client node and the GPU pool node to have the same system environment.
To solve the above problems, the present disclosure provides a resource management method, apparatus, storage medium and electronic device. A target virtual machine whose system environment is identical to that of the client node may be created in advance on the target host as a service node, and the GPU resources on the target host are then passed through to the service node. This guarantees platform consistency between the client node and the service node, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations.
Further, on the premise of this platform consistency, the service node may receive a resource request message sent by the client node requesting use of the GPU resources on the service node, and determine a target GPU resource according to that message. Upon receiving a business operation instruction sent by the client node, the service node may execute the instruction according to the target GPU resource and send the instruction execution result back to the client node. The client node can thus dynamically apply to the service node for GPU resources, and the service node can perform unified allocation and management of all GPU resources on the target host and support dynamic splitting of computing power, thereby improving GPU resource utilization.
Specific embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 3 is a flowchart illustrating a resource management method according to an exemplary embodiment, applied to a service node on which GPU resources are deployed. As shown in fig. 3, the method includes the following steps:
in step S301, a resource request message sent by a client node is received, where the resource request message is used to request to use the GPU resource on the service node, and both the client node and the service node are deployed on a target host.
The target host is usually a server, and the client node and the service node are deployed on the target host at the same time. The client node may be a client physical machine, or a client virtual machine constructed on the target host; the service node is a target virtual machine constructed on the target host and has the same system environment as the client node. The system environment of the target host differs from that of the service node (and the client node); typically, the system environment of the target host is Linux while the system environments of the client node and the service node are Windows.
It should be noted that multiple service nodes, and likewise multiple client nodes, may be deployed on the target host at the same time.
The resource request message may include a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node. The target GPU computing resource generally refers to a number of GPU cores, and the target GPU storage resource generally refers to a GPU video memory size; it will be appreciated that different services may require different numbers of GPU cores and different video memory sizes. The service node may manage all GPU resources on the target host, namely a plurality of GPUs together with the available core count and available video memory size of each GPU.
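As a minimal sketch, the two fields of the resource request message described above could be modeled as follows (the class and field names are illustrative assumptions, not defined by the disclosure):

```python
from dataclasses import dataclass

@dataclass
class ResourceRequest:
    # Target GPU computing resource: number of GPU cores required.
    gpu_cores: int
    # Target GPU storage resource: video memory size required, in GB.
    gpu_vram_gb: int

# A request matching the worked example below: 16 cores, 32 GB of video memory.
req = ResourceRequest(gpu_cores=16, gpu_vram_gb=32)
print(req)
```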
In step S302, a target GPU resource is determined from the GPU resources according to the resource request message.
In this step, the GPU resources may be dynamically resource-allocated according to the target GPU computing resource request message and the target GPU storage resource request message, so as to obtain the target GPU resources.
In a possible implementation manner, the target GPU core number and the target video memory size required by the current service of the client node may be obtained from the resource request message, and then GPU resource allocation is performed from the available core number and the available video memory size respectively corresponding to the multiple managed GPUs according to the target GPU core number and the target video memory size, so as to obtain the target GPU resource.
In an example, suppose the service node learns from the resource request message that the current service of the client node requires 16 GPU cores and 32 GB of video memory, and that the service node manages three GPUs, GPU1, GPU2 and GPU3: GPU1 has 4 available cores and 8 GB of available video memory; GPU2 has 8 available cores and 16 GB of available video memory; and GPU3 has 8 available cores and 16 GB of available video memory. The service node can then allocate the resources of GPU2 and GPU3 to the client node to satisfy the GPU computing and storage resources required by its current service. GPU2 and GPU3 are the target GPUs, and their available GPU resources are the target GPU resources. This is merely an example, and the disclosure is not limited thereto.
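The allocation step in this example can be sketched as a simple greedy combination of available GPUs; this is an illustrative heuristic under assumed data shapes, not the allocation algorithm specified by the disclosure:

```python
def allocate(gpus, need_cores, need_vram_gb):
    """gpus: dict of gpu_id -> (available_cores, available_vram_gb).
    Returns the list of GPU ids whose combined capacity covers the request,
    or None if the request cannot be satisfied."""
    chosen, cores, vram = [], 0, 0
    # Prefer the GPUs with the most free cores so fewer cards are combined.
    for gpu_id, (c, v) in sorted(gpus.items(), key=lambda kv: -kv[1][0]):
        if cores >= need_cores and vram >= need_vram_gb:
            break
        chosen.append(gpu_id)
        cores += c
        vram += v
    if cores >= need_cores and vram >= need_vram_gb:
        return chosen
    return None

# The worked example: 16 cores and 32 GB requested; GPU2 + GPU3 satisfy it.
gpus = {"GPU1": (4, 8), "GPU2": (8, 16), "GPU3": (8, 16)}
print(allocate(gpus, 16, 32))  # ['GPU2', 'GPU3']
```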
In step S303, when receiving the service operation instruction sent by the client node, the service operation instruction is executed according to the target GPU resource, and an instruction execution result is sent to the client node.
The business operation instruction may include, for example, an image rendering instruction (e.g., CreateTexture () in DirectX).
After receiving the service operation instruction, the service node may obtain an instruction execution result (e.g., a rendered texture object) by executing the service operation instruction based on the target GPU resource, and then may send the instruction execution result to the client node, so that the client node performs interface display.
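This execute-and-return step can be sketched as a dispatch on the forwarded instruction name; the dispatch table and the fake `CreateTexture` below are illustrative stand-ins, not APIs defined by the disclosure:

```python
def service_node_execute(instruction, args, gpu_backend):
    # Look up the GPU-backed operation for the forwarded instruction, run it
    # (against the allocated target GPU resource in a real system), and wrap
    # the instruction execution result for return to the client node.
    fn = gpu_backend[instruction]
    return {"instruction": instruction, "result": fn(*args)}

# Stand-in backend: a fake "CreateTexture" returning a texture handle string.
backend = {"CreateTexture": lambda width, height: f"texture:{width}x{height}"}

reply = service_node_execute("CreateTexture", (256, 256), backend)
print(reply["result"])  # texture:256x256
```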
It should be noted that, after the client node sends the resource request message to the service node, the service node may establish a data path with the client node in a memory sharing manner.
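The memory-sharing data path mentioned above can be illustrated with Python's `multiprocessing.shared_memory`: one side creates a named segment and the other attaches to it by name. This is an illustration of the general mechanism only; the disclosure does not specify this particular facility, and the segment name is arbitrary:

```python
from multiprocessing import shared_memory

# Service-node side: create a named shared segment for the data path.
seg = shared_memory.SharedMemory(create=True, size=64, name="gpu_dp_demo")
try:
    payload = b"hello"
    seg.buf[:len(payload)] = payload          # write an execution result
    # Client-node side: attach to the same segment by name and read it back.
    client = shared_memory.SharedMemory(name="gpu_dp_demo")
    data = bytes(client.buf[:len(payload)])
    client.close()
finally:
    seg.close()
    seg.unlink()                              # free the segment

print(data)  # b'hello'
```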
The above steps establish the target virtual machine on the target host as the service node and then perform unified allocation and management of the GPU resources on the target host through the service node. The client node can apply to the service node for GPU resources dynamically according to actual business requirements, and the service node can support dynamic splitting of computing power, allocating GPU resources according to the client node's specific demands, so that GPU resource utilization can be significantly improved.
However, considering that the GPU resources on the host managed by the service node could not be shared with the client node if their system environments were inconsistent, the service node may be pre-established in the following manner:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node; and passing the GPU resources on the target host machine through to the service node.
For the specific implementation of creating the target virtual machine on the host as the service node and passing the GPU resources of the target host through to the service node, reference may be made to descriptions in relevant documents, which is not limited herein.
Therefore, the consistency of the client node and the service node platform (namely, the system environment) can be ensured, so that the business operation instruction sent by the client node can be executed on the remote service node, and a platform foundation is provided for the remote execution of the GPU related operation.
For example, fig. 4 is a schematic diagram of a remote GPU resource sharing architecture according to an exemplary embodiment. As shown in fig. 4, a Linux host is deployed with five GPUs, GPU1 through GPU5, and also with three Windows client virtual machines (VM1, VM2 and VM3) as client nodes, each of which has installed application programs that need GPU resources. To perform unified allocation and management of the GPU resources on the Linux host while ensuring consistency of the system environments of the client nodes and the service node, a Windows virtual machine (Windows Server VM) can be created on the Linux host as the service node, and the GPU resources on the Linux host are then passed through to the Server VM, which thereby manages and allocates all GPU resources on the host. A Windows client virtual machine can send a resource request message to the Server VM and, when it intercepts a business operation instruction issued by an application program, forward that instruction to the Server VM. The Server VM executes the instruction based on the allocated target GPU resource and returns the instruction execution result to the Windows client virtual machine.
By adopting the method, the consistency of the client node and the service node platform can be ensured, so that the business operation instruction sent by the client node can be executed on the remote service node, and a platform basis is provided for the remote execution of the GPU related operation. Furthermore, the client node can dynamically apply for GPU resource usage to the service node, the service node can uniformly distribute and manage all GPU resources on the target host machine, and dynamic splitting of computing power is supported, so that the utilization rate of the GPU resources is improved.
As mentioned above, the service nodes may include one or more service nodes, that is, one or more service nodes may be deployed on the target host, and in the case that the service node includes a plurality of service nodes, different service nodes may deploy GPU resources corresponding to different service scenarios, so as to further improve utilization efficiency of the GPU resources.
For example, take the two scenarios of gaming and modeling. Generally, the GPU resources required by a game-related application program at runtime are larger than those required in a modeling scenario. Suppose five GPUs of approximately the same configuration, GPU1 through GPU5, are deployed on the Linux host as shown in fig. 4. In one possible implementation, the GPU resources of GPU1, GPU2 and GPU3 may be allocated to the gaming scenario, and those of GPU4 and GPU5 to the modeling scenario. Two Windows virtual machines may then be created in advance on the Linux host as service nodes (denoted S-VM1 and S-VM2, respectively), with GPU1, GPU2 and GPU3 passed through to service node S-VM1, and GPU4 and GPU5 passed through to service node S-VM2. Different service nodes thus manage the GPU resources required by different service scenarios, which can further improve GPU resource utilization and allocation efficiency.
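The scenario-based split in this example could be recorded as a simple mapping from service scenario to service node (the structure and lookup helper are hypothetical, offered only to make the partitioning concrete):

```python
# Scenario -> service node and the GPUs passed through to it, per the example.
SCENARIO_NODES = {
    "game":     {"node": "S-VM1", "gpus": ["GPU1", "GPU2", "GPU3"]},
    "modeling": {"node": "S-VM2", "gpus": ["GPU4", "GPU5"]},
}

def service_node_for(scenario):
    """Return the service node managing the GPU resources for a scenario."""
    return SCENARIO_NODES[scenario]["node"]

print(service_node_for("game"))  # S-VM1
```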
Fig. 5 is a flowchart illustrating a method of resource management that may be applied to a management node. The management node is configured to manage the GPU resources deployed on the service nodes; it is generally a preset process and may be deployed on a service node, on a client node, or on the target host. The target host is generally a server, and the client node and the service node may both be deployed on the target host. As shown in fig. 5, the method includes the following steps:
In step S501, a resource request message sent by a client node is received, where the resource request message is used to request use of the GPU resources on the service node.
The client node may be a physical client machine or a virtual client machine constructed on the target host. The service node is a target virtual machine constructed on the target host whose system environment is the same as that of the client node, while the system environment of the target host differs from that of the service node (and the client node). For example, the system environment of the target host is typically Linux, while the system environments of the client node and the service node are Windows.
The resource request message may include a target GPU computing resource request message and a target GPU storage resource request message required by the current business of the client node. The target GPU computing resource typically refers to the number of GPU cores, and the target GPU storage resource typically refers to the video memory size; different businesses may require different numbers of GPU cores and different amounts of video memory. Each service node manages GPU resources on the target host, and those GPU resources comprise multiple GPUs together with the available core count and available video memory size of each GPU. The management node manages the GPU resources deployed on the multiple service nodes. After receiving the resource request message, the management node may determine from it the target GPU computing resources and target GPU storage resources required by the current business, and then select available resources from the GPU resources deployed on the managed service nodes for dynamic resource allocation.
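A minimal sketch of such a resource request message; the field names are illustrative assumptions, not part of the disclosure:

```python
from dataclasses import dataclass

@dataclass
class ResourceRequest:
    """Resource request message sent by a client node (illustrative schema)."""
    client_id: str
    gpu_cores: int       # target GPU computing resource: number of GPU cores
    gpu_memory_gb: int   # target GPU storage resource: video memory size in GB

# The 16-core / 64 GB request used in the worked example.
req = ResourceRequest(client_id="client-1", gpu_cores=16, gpu_memory_gb=64)
```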
In step S502, a target GPU is determined according to the resource request message, the target GPU being a GPU, allocated by the management node, that provides GPU resources for the current business of the client node.
In this step, dynamic resource allocation may be performed on the GPU resources managed by the management node according to the target GPU computing resource request message and the target GPU storage resource request message to obtain target GPU resources, and the target GPU is then determined according to those target GPU resources.
In one possible implementation, the target GPU core count and target video memory size required by the current business of the client node may be obtained from the resource request message; GPU resources are then allocated, according to that core count and video memory size, from the available cores and available video memory of the GPU resources deployed on the managed service nodes, obtaining the target GPU resources.
For example, assume the management node obtains from the resource request message that the current business of the client node requires 16 GPU cores and 64 GB of video memory, and that the management node manages three service nodes a, b, and c, where GPU1 and GPU2 are deployed on service node a, GPU3 is deployed on service node b, and GPU4 and GPU5 are deployed on service node c, with the available resources of each GPU as shown in Table 1. After the management node performs resource allocation according to the available GPU resources in Table 1 and the GPU core count and video memory size required by the current business of the client node, the resources of GPU4 and GPU5 can be allocated to the client node, satisfying the GPU computing and storage resources required by its current business; GPU4 and GPU5 are then the target GPUs. The above example is illustrative only, and the present disclosure is not limited thereto.
GPU    Available cores    Available video memory
GPU1   4                  8 GB
GPU2   8                  16 GB
GPU3   4                  8 GB
GPU4   16                 32 GB
GPU5   16                 32 GB
TABLE 1
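One way to realize the allocation in this example is a greedy scan over the available-GPU table. The sketch below assumes larger GPUs are preferred; the selection policy and function name are not specified by the disclosure, but the figures match Table 1 and reproduce the example's outcome:

```python
def allocate_gpus(gpus, need_cores, need_mem_gb):
    """Greedily pick GPUs until both the requested core count and the
    requested video memory are covered; return the chosen GPU ids, or
    None if the pool cannot satisfy the request.
    `gpus` maps GPU id -> (available cores, available video memory in GB)."""
    chosen, cores, mem = [], 0, 0
    # Prefer larger GPUs first so a big request uses few cards.
    for gpu_id, (c, m) in sorted(gpus.items(), key=lambda kv: -kv[1][0]):
        if cores >= need_cores and mem >= need_mem_gb:
            break
        chosen.append(gpu_id)
        cores += c
        mem += m
    if cores >= need_cores and mem >= need_mem_gb:
        return chosen
    return None

# Available GPU resources from Table 1.
table1 = {
    "GPU1": (4, 8), "GPU2": (8, 16), "GPU3": (4, 8),
    "GPU4": (16, 32), "GPU5": (16, 32),
}
print(allocate_gpus(table1, need_cores=16, need_mem_gb=64))
# → ['GPU4', 'GPU5']
```

For the 16-core / 64 GB request, GPU4 alone covers the cores but not the memory, so GPU5 is added as well, matching the allocation in the example.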
In step S503, the identification information of the target GPU is sent to the client node, so that the client node determines a target service node according to the identification information and sends a business operation instruction to the target service node, and the target service node executes the business operation instruction based on the target GPU and returns an instruction execution result to the client node.
The identification information may include the GPU identification information of the target GPUs (GPU4 and GPU5 in the above example) and the service node identification information of the service node where the target GPUs are deployed (service node c in the above example).
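The identification information might be carried in a small response structure like the following; the field names and node identifier are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TargetGpuIdentification:
    """Identification information the management node returns to the
    client node (illustrative schema)."""
    gpu_ids: List[str]     # GPU identification information of the target GPUs
    service_node_id: str   # service node where the target GPUs are deployed

ident = TargetGpuIdentification(gpu_ids=["GPU4", "GPU5"],
                                service_node_id="service-node-c")
# The client node routes its business operation instruction to
# ident.service_node_id, naming ident.gpu_ids in the instruction.
```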
Through the above steps, the management node uniformly allocates and manages the GPU resources deployed on multiple service nodes. The client node can dynamically apply to the management node for GPU resources according to actual business requirements, and the management node supports dynamic splitting of computing power, allocating GPU resources according to the client node's specific requirements, so the utilization rate of GPU resources can be improved.
However, considering that, if the system environments of the service node and the client node are inconsistent, the GPU resources on the host where the service node is deployed cannot be shared with the client node, the service node may be pre-established in the following manner:
creating a target virtual machine on the target host machine and taking the target virtual machine as the service node, where the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine; and passing the GPU resources on the target host machine directly through to the service node.
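As a hedged illustration of the passthrough step: on a Linux host with KVM, a GPU is typically passed through to a guest via VFIO. The helper below only builds a QEMU command line; the PCI addresses, VM sizing, and the assumption that the devices are already bound to vfio-pci with the IOMMU enabled are all placeholders, not details from the disclosure:

```python
def qemu_passthrough_cmd(vm_name, gpu_pci_addrs, memory_gb=16, cpus=8):
    """Build a QEMU/KVM command line that passes host GPUs through to a
    guest via vfio-pci (illustrative; a real deployment would also bind
    the devices to vfio-pci and configure the IOMMU beforehand)."""
    cmd = [
        "qemu-system-x86_64", "-enable-kvm",
        "-name", vm_name,
        "-m", f"{memory_gb}G",
        "-smp", str(cpus),
    ]
    for addr in gpu_pci_addrs:
        # One vfio-pci device entry per passed-through GPU.
        cmd += ["-device", f"vfio-pci,host={addr}"]
    return cmd

# Hypothetical PCI addresses for the two GPUs of service node S-VM2.
cmd = qemu_passthrough_cmd("S-VM2", ["0000:41:00.0", "0000:42:00.0"])
```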
The specific implementation of creating the target virtual machine on the target host as the service node and passing the GPU resources of the target host through to the service node may refer to descriptions in the related literature and is not limited herein.
In this way, platform consistency (that is, system-environment consistency) between the client node and the service node is ensured, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations.
In addition, to implement unified allocation and management by the management node of the GPU resources deployed on multiple service nodes, before step S501 is executed, the management node needs to receive a resource report message sent by each service node, where the resource report message includes the GPU resource information deployed on that service node.
That is to say, after each service node is created and the GPU resources on the corresponding host are passed through to it, the service node needs to report its deployed GPU resources to the management node, so that the management node can uniformly allocate and manage the GPU resources deployed on the multiple service nodes, improving the utilization rate and allocation efficiency of those resources.
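Such a resource report message might look like the following; the schema is an assumption for illustration, reusing the per-GPU availability fields from Table 1:

```python
def build_resource_report(service_node_id, gpus):
    """Resource report message a service node sends to the management
    node after creation and GPU passthrough (illustrative schema).
    `gpus` maps GPU id -> (available cores, available video memory in GB)."""
    return {
        "service_node_id": service_node_id,
        "gpus": [
            {"gpu_id": g, "available_cores": c, "available_memory_gb": m}
            for g, (c, m) in gpus.items()
        ],
    }

# Service node b from the example reports its single GPU.
report = build_resource_report("service-node-b", {"GPU3": (4, 8)})
```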
By adopting this method, platform consistency between the client node and the service node is ensured, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations. Furthermore, the GPU resources deployed on multiple service nodes can be managed through the management node, the client node can dynamically apply to the management node for GPU resources, and the management node supports dynamic splitting of computing power, thereby improving GPU resource utilization.
Fig. 6 is a block diagram illustrating an apparatus for resource management according to an exemplary embodiment, applied to a service node on which graphics processing unit (GPU) resources are deployed. As shown in fig. 6, the apparatus includes:
a first receiving module 601, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node, and both the client node and the service node are deployed on a target host;
a first determining module 602, configured to determine a target GPU resource from the GPU resources according to the resource request message;
an instruction execution module 603, configured to execute, upon receiving a business operation instruction sent by the client node, the business operation instruction according to the target GPU resources;
a first sending module 604, configured to send an instruction execution result to the client node;
the service node is pre-established in the following mode:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node; and passing the GPU resources on the target host machine directly through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node; the first determining module 602 is configured to perform dynamic resource allocation on the GPU resource according to the target GPU computing resource request message and the target GPU storage resource request message, so as to obtain the target GPU resource.
Optionally, the service node includes one or more service nodes, and in the case that the service node includes multiple service nodes, different service nodes deploy GPU resources corresponding to different service scenarios.
By adopting this apparatus, platform consistency between the client node and the service node is ensured, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations. Furthermore, the client node can dynamically apply to the service node for GPU resources, and the service node can uniformly allocate and manage all GPU resources on the target host and supports dynamic splitting of computing power, thereby improving GPU resource utilization.
FIG. 7 is a block diagram illustrating an apparatus for resource management, according to an example embodiment, applied to a management node for managing GPU resources deployed on a service node; as shown in fig. 7, the apparatus includes:
a second receiving module 701, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node;
a second determining module 702, configured to determine, according to the resource request message, a target GPU, where the target GPU is a GPU that is allocated by the management node and provides GPU resources for the current service of the client node;
a second sending module 703, configured to send the identification information of the target GPU to the client node, so that the client node determines a target service node according to the identification information and sends a business operation instruction to the target service node, and so that the target service node executes the business operation instruction based on the target GPU and returns an instruction execution result to the client node;
the service node is pre-established in the following mode:
creating a target virtual machine on a target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine; and passing the GPU resources on the target host machine directly through to the service node.
Optionally, the resource request message includes a target GPU computing resource request message and a target GPU storage resource request message required by the current service of the client node; the second determining module 702 is configured to perform dynamic resource allocation on the GPU resources managed by the management node according to the target GPU computing resource request message and the target GPU storage resource request message, so as to obtain target GPU resources; and determining the target GPU according to the target GPU resources.
Optionally, fig. 8 is a block diagram of an apparatus for resource management according to the embodiment shown in fig. 7, and as shown in fig. 8, the apparatus further includes:
a third receiving module 704, configured to receive a resource report message sent by the service node, where the resource report message includes GPU resource information deployed on the service node.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
By adopting this apparatus, platform consistency between the client node and the service node is ensured, so that business operation instructions sent by the client node can be executed on the remote service node, providing a platform basis for remote execution of GPU-related operations. Furthermore, the GPU resources deployed on multiple service nodes can be managed through the management node, the client node can dynamically apply to the management node for GPU resources, and the management node supports dynamic splitting of computing power, thereby improving GPU resource utilization.
Fig. 9 is a block diagram illustrating an electronic device 900 in accordance with an example embodiment. As shown in fig. 9, the electronic device 900 may include: a processor 901 and a memory 902. The electronic device 900 may also include one or more of a multimedia component 903, an input/output (I/O) interface 904, and a communications component 905.
The processor 901 is configured to control the overall operation of the electronic device 900 so as to complete all or part of the steps in the method of resource management. The memory 902 is used to store various types of data to support operation of the electronic device 900, such as instructions for any application or method operating on the electronic device 900 and application-related data, such as contact data, transmitted and received messages, pictures, audio, video, and the like. The memory 902 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. The multimedia component 903 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. A received audio signal may further be stored in the memory 902 or transmitted through the communication component 905. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 904 provides an interface between the processor 901 and other interface modules, such as a keyboard, mouse, or buttons. The buttons may be virtual buttons or physical buttons. The communication component 905 is used for wired or wireless communication between the electronic device 900 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of them, which is not limited herein. The corresponding communication component 905 may therefore include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.
In an exemplary embodiment, the electronic Device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above-described method of resource management.
In another exemplary embodiment, a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the method of resource management described above is also provided. For example, the computer readable storage medium may be the above-mentioned memory 902 comprising program instructions executable by the processor 901 of the electronic device 900 to perform the above-mentioned method of resource management.
Fig. 10 is a block diagram illustrating an electronic device 1000 in accordance with an example embodiment. For example, the electronic device 1000 may be provided as a server. Referring to fig. 10, the electronic device 1000 includes a processor 1022, which may be one or more in number, and a memory 1032 for storing computer programs executable by the processor 1022. The computer programs stored in memory 1032 may include one or more modules that each correspond to a set of instructions. Further, the processor 1022 may be configured to execute the computer program to perform the method of resource management described above.
Additionally, the electronic device 1000 may also include a power component 1026 and a communication component 1050; the power component 1026 may be configured to perform power management for the electronic device 1000, and the communication component 1050 may be configured to enable wired or wireless communication for the electronic device 1000. In addition, the electronic device 1000 may also include input/output (I/O) interfaces 1058. The electronic device 1000 may operate based on an operating system stored in the memory 1032, such as Windows Server™, Mac OS X™, Unix™, Linux™, and so on.
In another exemplary embodiment, a computer readable storage medium comprising program instructions which, when executed by a processor, implement the steps of the method of resource management described above is also provided. For example, the non-transitory computer readable storage medium may be the memory 1032 comprising program instructions executable by the processor 1022 of the electronic device 1000 to perform the method of resource management described above.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-mentioned method of resource management when executed by the programmable apparatus.
The preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings, however, the present disclosure is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present disclosure within the technical idea of the present disclosure, and these simple modifications all belong to the protection scope of the present disclosure.
It should be noted that, in the foregoing embodiments, various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various combinations that are possible in the present disclosure are not described again.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure as long as it does not depart from the gist of the present disclosure.

Claims (12)

1. A method for resource management, applied to a service node on which graphics processing unit (GPU) resources are deployed, the method comprising:
receiving a resource request message sent by a client node, wherein the resource request message is used for requesting to use the GPU resource on the service node, and the client node and the service node are both deployed on a target host machine;
determining a target GPU resource from the GPU resources according to the resource request message;
under the condition of receiving a business operation instruction sent by the client node, executing the business operation instruction according to the target GPU resource;
sending an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node;
and passing the GPU resources on the target host machine directly through to the service node.
2. The method of claim 1, wherein the resource request message comprises a target GPU compute resource request message and a target GPU storage resource request message required for current traffic of the client node; the determining a target GPU resource from the GPU resources according to the resource request message comprises:
and performing dynamic resource allocation on the GPU resources according to the target GPU computing resource request message and the target GPU storage resource request message to obtain the target GPU resources.
3. The method according to claim 1 or 2, wherein the service nodes include one or more service nodes, and in the case that the service nodes include a plurality of service nodes, different service nodes deploy GPU resources corresponding to different service scenarios.
4. A method for resource management, applied to a management node, wherein the management node is configured to manage GPU resources deployed on a service node, the method comprising the following steps:
receiving a resource request message sent by a client node, wherein the resource request message is used for requesting to use the GPU resource on the service node;
determining a target GPU according to the resource request message, wherein the target GPU is a GPU which is distributed by the management node and provides GPU resources for the current service of the client node;
sending the identification information of the target GPU to the client node so that the client node determines a target service node according to the identification information and sends a business operation instruction to the target service node so that the target service node executes the business operation instruction based on the target GPU and returns an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on a target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine;
and passing the GPU resources on the target host machine directly through to the service node.
5. The method of claim 4, wherein the resource request message comprises a target GPU compute resource request message and a target GPU storage resource request message required for current traffic of the client node; the determining a target GPU according to the resource request message comprises:
performing dynamic resource allocation on the GPU resources managed by the management node according to the target GPU computing resource request message and the target GPU storage resource request message to obtain target GPU resources;
and determining the target GPU according to the target GPU resources.
6. The method according to claim 4 or 5, wherein prior to receiving the resource request message sent by the client node, the method further comprises:
and receiving a resource reporting message sent by the service node, wherein the resource reporting message comprises GPU resource information deployed on the service node.
7. An apparatus for resource management, applied to a service node on which graphics processing unit (GPU) resources are deployed, the apparatus comprising:
a first receiving module, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node, and both the client node and the service node are deployed on a target host;
the first determining module is used for determining a target GPU resource from the GPU resources according to the resource request message;
the instruction execution module is used for executing the business operation instruction according to the target GPU resource under the condition of receiving the business operation instruction sent by the client node;
a first sending module, configured to send an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on the target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node; and passing the GPU resources on the target host machine directly through to the service node.
8. An apparatus for resource management, applied to a management node, wherein the management node is configured to manage GPU resources deployed on a service node, the apparatus comprising:
a second receiving module, configured to receive a resource request message sent by a client node, where the resource request message is used to request to use the GPU resource on the service node;
a second determining module, configured to determine, according to the resource request message, a target GPU, where the target GPU is a GPU that is allocated by the management node and provides GPU resources for a current service of the client node;
the second sending module is used for sending the identification information of the target GPU to the client node so that the client node determines a target service node according to the identification information and sends a business operation instruction to the target service node so that the target service node executes the business operation instruction based on the target GPU and returns an instruction execution result to the client node;
wherein the service node is pre-established by:
creating a target virtual machine on a target host machine, and taking the target virtual machine as the service node, wherein the system environment of the target virtual machine is the same as that of the client node, and the service node and the client node are both deployed on the target host machine; and passing the GPU resources on the target host machine directly through to the service node.
9. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 3.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of any one of claims 1-3.
11. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 4 to 6.
12. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 4 to 6.
CN202210266699.0A 2022-03-17 2022-03-17 Resource management method, device, storage medium and electronic equipment Pending CN114661465A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210266699.0A CN114661465A (en) 2022-03-17 2022-03-17 Resource management method, device, storage medium and electronic equipment


Publications (1)

Publication Number Publication Date
CN114661465A true CN114661465A (en) 2022-06-24

Family

ID=82029549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210266699.0A Pending CN114661465A (en) 2022-03-17 2022-03-17 Resource management method, device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114661465A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115951974A (en) * 2023-03-10 2023-04-11 浙江宇视科技有限公司 Management method, system, device and medium for GPU virtual machine

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521012A (en) * 2011-11-24 2012-06-27 华中科技大学 Virtual machine-based general processing unit (GPU) cluster management system
WO2015100681A1 (en) * 2013-12-31 2015-07-09 华为技术有限公司 Gpu virtualization implementation method, and related apparatus and system
CN106569877A (en) * 2016-11-14 2017-04-19 中国石油化工股份有限公司 Method for sharing graphic workstation GPU by virtual machines in direct connection way
CN107533503A (en) * 2015-03-05 2018-01-02 威睿公司 The method and apparatus that virtualized environment is selected during deployment
CN109992422A (en) * 2019-04-11 2019-07-09 北京朗镜科技有限责任公司 A kind of method for scheduling task towards GPU resource, device and system
CN110489210A (en) * 2019-07-31 2019-11-22 北京百度网讯科技有限公司 Create method, apparatus, equipment and the computer storage medium of virtual machine
CN110688218A (en) * 2019-09-05 2020-01-14 广东浪潮大数据研究有限公司 Resource scheduling method and device
CN111090531A (en) * 2019-12-11 2020-05-01 杭州海康威视系统技术有限公司 Method for realizing distributed virtualization of graphics processor and distributed system
CN111324457A (en) * 2020-02-15 2020-06-23 苏州浪潮智能科技有限公司 Method, device, equipment and medium for issuing inference service in GPU cluster
CN111475256A (en) * 2020-03-18 2020-07-31 西安万像电子科技有限公司 Resource allocation method, device and system
CN111858045A (en) * 2020-07-13 2020-10-30 苏州浪潮智能科技有限公司 Multitask GPU resource scheduling method, device, equipment and readable medium
CN112148489A (en) * 2020-09-22 2020-12-29 网易(杭州)网络有限公司 Game resource scheduling method, device, equipment and storage medium
CN112463383A (en) * 2020-12-04 2021-03-09 苏州浪潮智能科技有限公司 GPU (graphics processing unit) allocation method, system, storage medium and device
US20210208951A1 (en) * 2020-08-04 2021-07-08 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for sharing gpu, electronic device and readable storage medium
CN113296950A (en) * 2021-05-28 2021-08-24 重庆紫光华山智安科技有限公司 Processing method, processing device, electronic equipment and readable storage medium
CN113342534A (en) * 2021-06-29 2021-09-03 中国电信股份有限公司 Method, device and equipment for allocating graphics processing resources and storage medium
CN114048005A (en) * 2021-11-26 2022-02-15 上海励驰半导体有限公司 GPU virtualization method and device

Similar Documents

Publication Publication Date Title
EP3385835B1 (en) Method and apparatus for configuring accelerator
US10686755B2 (en) Assigning IP addresses and configuration parameters in hyper-converged infrastructure
CN108628660B (en) Virtual machine capacity expansion and reduction method and virtual management equipment
CN112148489A (en) Game resource scheduling method, device, equipment and storage medium
JP7100154B2 (en) Processor core scheduling method, equipment, terminals and storage media
CN108073423A (en) Accelerator loading method, system and accelerator loading device
CN109446062B (en) Method and device for debugging software in cloud computing service
CN110291502A (en) Method, apparatus and acceleration system for scheduling acceleration resources
CN108566412B (en) Data service method and data service device
CN113342534A (en) Method, device and equipment for allocating graphics processing resources and storage medium
CN114116092A (en) Cloud desktop system processing method, cloud desktop system control method and related equipment
CN114820272A (en) Data interaction method and device, storage medium and electronic equipment
CN114661465A (en) Resource management method, device, storage medium and electronic equipment
EP4221153A1 (en) Method, apparatus and system for scheduling computing instance
CN113220432B (en) Multi-cloud interconnection method, device, equipment, storage medium and product
CN114581580A (en) Method and device for rendering image, storage medium and electronic equipment
CN110532060A (en) Hybrid network environment data collection method and system
CN107734050B (en) Load machine distribution method, computing equipment and load machine distribution system
CN109213565B (en) Management method of heterogeneous virtual computing resources, related equipment and storage medium
CN116860391A (en) GPU computing power resource scheduling method, device, equipment and medium
CN111262771B (en) Virtual private cloud communication system, system configuration method and controller
CN116662009A (en) GPU resource allocation method and device, electronic equipment and storage medium
CN116436968A (en) Service grid communication method, system, device and storage medium
CN116244231A (en) Data transmission method, device and system, electronic equipment and storage medium
CN115658332A (en) GPU (graphics processing unit) sharing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 1022, Floor 10, No. 1, Zhongguancun Street, Haidian District, Beijing 100085

Applicant after: Vita Technology (Beijing) Co.,Ltd.

Address before: 819-1, floor 7, No. 8, Haidian North 2nd Street, Haidian District, Beijing 100080

Applicant before: Vita Technology (Beijing) Co.,Ltd.