CN112162856A - GPU virtual resource allocation method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN112162856A CN112162856A CN202011006550.6A CN202011006550A CN112162856A CN 112162856 A CN112162856 A CN 112162856A CN 202011006550 A CN202011006550 A CN 202011006550A CN 112162856 A CN112162856 A CN 112162856A
- Authority
- CN
- China
- Prior art keywords
- image processing
- task
- gpu virtual
- target image
- container
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Abstract
The present application relates to a GPU virtual resource allocation method and apparatus, a computer device, and a storage medium. The method includes: setting a plurality of image processing containers in a cloud server cluster, each image processing container being allocated GPU virtual resources; detecting the image processing containers and determining an image processing container in an idle state as a target image processing container; acquiring a target image processing task from a message queue cluster by using the target image processing container; and processing the target image processing task using the target image processing container. With this method, matching GPU virtual resources can be allocated to each target image processing task.
Description
Technical Field
The present application relates to the field of resource allocation technologies, and in particular, to a method and an apparatus for allocating GPU virtual resources, a computer device, and a storage medium.
Background
Medical image processing algorithms (such as vessel extraction and nodule extraction algorithms) are characterized by long computation times and heavy consumption of Graphics Processing Unit (GPU) resources, so GPU resources need to be utilized rationally.
In the related art, to utilize GPU resources rationally, an image processing task is usually sent to a load balancing server in a server cluster; the load balancing server then obtains the GPU resources available on each physical machine in the cluster and schedules the cluster's GPU resources accordingly.
However, with the development of cloud technology, server clusters have evolved into cloud server clusters, and the GPU resources of physical machines have evolved into GPU virtual resources, so the existing allocation method no longer applies. How to allocate appropriate GPU virtual resources to an image processing task has therefore become an urgent technical problem.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a GPU virtual resource allocation method, apparatus, computer device, and storage medium that can allocate GPU virtual resources to image processing tasks more rationally.
A method for allocating GPU virtual resources includes the following steps:
setting a plurality of image processing containers in a cloud server cluster, each image processing container being allocated GPU virtual resources;
detecting the plurality of image processing containers and determining an image processing container in an idle state as a target image processing container;
acquiring a target image processing task from a message queue cluster by using the target image processing container; and
processing the target image processing task using the target image processing container.
In one embodiment, before acquiring the target image processing task from the message queue cluster by using the target image processing container, the method further includes:
receiving an image processing request sent by a client, the image processing request including at least one image processing task;
determining the GPU virtual resources required by each image processing task; and
storing the image processing tasks into the message queue cluster and identifying the GPU virtual resources each task requires.
In one embodiment, acquiring the target image processing task from the message queue cluster by using the target image processing container includes:
the target image processing container determining, according to the GPU virtual resources required by the image processing tasks in the message queue cluster, whether an image processing task is the target image processing task.
In one embodiment, the image processing tasks include a first image processing task, and the method further includes:
if the GPU virtual resources required by the first image processing task match the GPU virtual resources corresponding to the target image processing container, determining the first image processing task as the target image processing task;
if the GPU virtual resources required by the first image processing task do not match the GPU virtual resources corresponding to the target image processing container, acquiring, by using the target image processing container, the GPU virtual resources required by a second image processing task after the first image processing task; and
if the GPU virtual resources required by the second image processing task match the GPU virtual resources corresponding to the target image processing container, determining the second image processing task as the target image processing task.
In one embodiment, the image processing request further carries the priority of each image processing task, and after the image processing tasks are stored in the message queue cluster, the method further includes:
sorting the image processing tasks in the message queue cluster according to their priorities.
In one embodiment, the method further includes:
if it is detected that a first image processing container among the plurality of image processing containers has crashed, reconstructing the first image processing container from a pre-stored image of the first image processing container; and
reprocessing, by the reconstructed first image processing container, the image processing task that was being processed.
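The reconstruct-and-reprocess flow above can be sketched in Python. The `ImageProcessingContainer` type and its field names are illustrative assumptions for this sketch, not part of the patent:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImageProcessingContainer:
    # Illustrative fields; names are assumptions, not from the patent.
    name: str
    image: str                      # pre-stored container image used for reconstruction
    gpu_virtual_resource_gb: float  # fixed vGPU allocation
    current_task: Optional[str] = None
    crashed: bool = False

def reconstruct_container(old: ImageProcessingContainer) -> ImageProcessingContainer:
    """Recreate a crashed container from its pre-stored image and hand the
    interrupted task back to the reconstructed instance for reprocessing."""
    return ImageProcessingContainer(
        name=old.name,
        image=old.image,
        gpu_virtual_resource_gb=old.gpu_virtual_resource_gb,
        current_task=old.current_task,  # the task being processed is reprocessed
    )

c1 = ImageProcessingContainer("container-1", "vessel-extract:1.0", 4.0, current_task="task-42")
c1.crashed = True
c1_new = reconstruct_container(c1)
```

The reconstructed container inherits the interrupted task, so no work is silently dropped on a crash.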
In one embodiment, the plurality of image processing containers include at least a first image processing container and a second image processing container, and the GPU virtual resources allocated to the first image processing container and the second image processing container are the same or different.
An apparatus for allocating GPU virtual resources includes:
a container setting module configured to set a plurality of image processing containers in a cloud server cluster, each image processing container being allocated GPU virtual resources;
a container detection module configured to detect the plurality of image processing containers and determine an image processing container in an idle state as a target image processing container;
a task acquisition module configured to acquire a target image processing task from a message queue cluster by using the target image processing container; and
a task processing module configured to process the target image processing task using the target image processing container.
In one embodiment, the apparatus further comprises:
a request receiving module configured to receive an image processing request sent by a client, the image processing request including at least one image processing task;
a resource determining module configured to determine the GPU virtual resources required by each image processing task; and
a storage module configured to store the image processing tasks into the message queue cluster and identify the GPU virtual resources each task requires.
In one embodiment, the task acquisition module is specifically configured to have the target image processing container determine, according to the GPU virtual resources required by the image processing tasks in the message queue cluster, whether an image processing task is the target image processing task.
In one embodiment, the image processing tasks include a first image processing task, and the apparatus further comprises:
a first task determination module configured to determine the first image processing task as the target image processing task if the GPU virtual resources required by the first image processing task match those corresponding to the target image processing container;
a resource acquisition module configured to acquire, by using the target image processing container, the GPU virtual resources required by a second image processing task after the first image processing task if the GPU virtual resources required by the first image processing task do not match those corresponding to the target image processing container; and
a second task determination module configured to determine the second image processing task as the target image processing task if the GPU virtual resources required by the second image processing task match those corresponding to the target image processing container.
In one embodiment, the apparatus further comprises:
a sorting module configured to sort the image processing tasks in the message queue cluster according to their priorities.
In one embodiment, the apparatus further comprises:
a container construction module configured to reconstruct a first image processing container from a pre-stored image of the first image processing container if it is detected that the first image processing container among the plurality of image processing containers has crashed; and
a reprocessing module configured to have the reconstructed first image processing container reprocess the image processing task that was being processed.
In one embodiment, the plurality of image processing containers include at least a first image processing container and a second image processing container, and the GPU virtual resources allocated to the first image processing container and the second image processing container are the same or different.
A computer device includes a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the following steps:
setting a plurality of image processing containers in a cloud server cluster, each image processing container being allocated GPU virtual resources;
detecting the plurality of image processing containers and determining an image processing container in an idle state as a target image processing container;
acquiring a target image processing task from a message queue cluster by using the target image processing container; and
processing the target image processing task using the target image processing container.
A computer-readable storage medium stores a computer program which, when executed by a processor, carries out the following steps:
setting a plurality of image processing containers in a cloud server cluster, each image processing container being allocated GPU virtual resources;
detecting the plurality of image processing containers and determining an image processing container in an idle state as a target image processing container;
acquiring a target image processing task from a message queue cluster by using the target image processing container; and
processing the target image processing task using the target image processing container.
According to the GPU virtual resource allocation method and apparatus, computer device, and storage medium described above, a plurality of image processing containers are set in a cloud server cluster, each allocated GPU virtual resources; the image processing containers are detected, and an image processing container in an idle state is determined as the target image processing container; the target image processing container is used to acquire a target image processing task from the message queue cluster; and the target image processing container processes the target image processing task. In the disclosed embodiments, the GPU virtual resources the target image processing container can provide match the GPU virtual resources the target image processing task consumes; in other words, GPU virtual resources are allocated to the target image processing task rationally.
Drawings
FIG. 1 is a diagram of an application environment for a method for allocating GPU virtual resources, according to an embodiment;
FIG. 2 is a flowchart illustrating a method for allocating GPU virtual resources according to one embodiment;
FIG. 3 is a second flowchart illustrating a method for allocating GPU virtual resources according to an embodiment;
FIG. 4 is one of the flow diagrams illustrating the determination of whether an image processing task is a target image processing task in one embodiment;
FIG. 5 is a second flowchart illustrating a process of determining whether an image processing task is a target image processing task according to an embodiment;
FIG. 6 is a flowchart illustrating the processing steps for image processing container crash in one embodiment;
FIG. 7 is a block diagram of an apparatus for allocating GPU virtual resources, according to one embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The GPU virtual resource allocation method provided by the present application can be applied in the application environment shown in FIG. 1. The application environment is a cloud server cluster composed of a cloud server 101, a cloud server 102, a cloud server 103, a cloud server 104, and so on, and the cloud servers can communicate with one another through a network.
In an embodiment, as shown in fig. 2, a method for allocating GPU virtual resources is provided, which is described by taking the method as an example applied to the cloud server cluster in fig. 1, and includes the following steps:
Each image processing container contains a complete operating environment: the image processing program itself, together with all the dependencies, class libraries, other binary files, configuration files, and so on that it requires, is packed uniformly into a single package called an image processing container.
The cloud server cluster consists of a plurality of cloud servers, and one or more image processing containers can be set up in at least one cloud server of the cluster. Each image processing container is allocated corresponding GPU virtual resources, and the size of each container's allocation is fixed; the sizes allocated to different containers may be the same, different, or partially the same. The size of the GPU virtual resources allocated to each image processing container may be preset according to the image processing tasks to be processed. For example, the GPU virtual resources typically required by image processing tasks can be determined by statistical analysis, or set from experience: more containers are provisioned at the commonly required size, and fewer containers at the larger or smaller sizes that rarer tasks need. The GPU virtual resources may be provided by NVIDIA vGPU technology or by independent virtual graphics cards.
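As a minimal illustration of fixed, per-container allocations, the sketch below provisions four hypothetical containers at preset vGPU sizes (the container names and sizes in GB are invented for the example):

```python
# Hypothetical container names and fixed vGPU sizes (in GB).
CONTAINER_CONFIGS = [
    {"name": "container-1", "gpu_gb": 0.5},
    {"name": "container-2", "gpu_gb": 1.0},
    {"name": "container-3", "gpu_gb": 1.0},
    {"name": "container-4", "gpu_gb": 2.0},
]

def setup_containers(configs):
    """Fix each container's GPU virtual resource allocation at creation time."""
    return {c["name"]: c["gpu_gb"] for c in configs}

allocations = setup_containers(CONTAINER_CONFIGS)
```

More containers are provisioned at the common 1G size, with single containers at the rarer 512M and 2G sizes, echoing the provisioning-by-experience idea above.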
In one embodiment, the plurality of image processing containers include at least a first image processing container and a second image processing container, and the GPU virtual resources allocated to the two are the same or different. For example, the first and second image processing containers may both be allocated 1G of GPU virtual resources, or the first may be allocated 1G and the second 2G.
In other embodiments, the plurality of image processing containers further includes a third image processing container, a fourth image processing container, and so on. For example, the first image processing container may be allocated 512M of GPU virtual resources, the second and third 1G each, and the fourth 2G.
As shown in fig. 1, the cloud server cluster includes a cloud server 101, a cloud server 102, and a cloud server 103; an image processing container 1 is arranged in a cloud server 101, and an image processing container 2 and an image processing container 3 are arranged in a cloud server 102; the GPU virtual resource allocated to the image processing container 1 is 4G, the GPU virtual resource allocated to the image processing container 2 is 2G, and the GPU virtual resource allocated to the image processing container 3 is also 2G.
In this step, the cloud server cluster detects whether each image processing container is in an idle state, and if so, determines it as a target image processing container. For example, the cloud server cluster detects the image processing container 1, the image processing container 2, and the image processing container 3 in turn; if the image processing container 1 is idle, it is determined as a target image processing container, and if the image processing container 2 and the image processing container 3 are both in a working state, neither of them is a target image processing container.
In one embodiment, when multiple image processing containers are in an idle state, one of them is selected as the target image processing container according to a preset rule. The preset rules include: sequence number from front to back, allocated GPU virtual resources from small to large, and idle duration from long to short.
For example, suppose the image processing container 1 and the image processing container 2 are both idle, the image processing container 2 has been idle longer than the image processing container 1, and the image processing container 3 is working. Selecting by sequence number from front to back makes the image processing container 1 the target; selecting by allocated GPU virtual resources from small to large makes the image processing container 2 the target; and selecting by idle duration from long to short also makes the image processing container 2 the target.
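The three preset rules can be sketched as a single selection function; the field names (`seq`, `gpu_gb`, `idle_seconds`) are assumptions for the example:

```python
def pick_target(idle_containers, rule="smallest_gpu"):
    """Select one idle container as the target according to a preset rule."""
    if rule == "seq":             # sequence number from front to back
        key = lambda c: c["seq"]
    elif rule == "smallest_gpu":  # allocated vGPU from small to large
        key = lambda c: c["gpu_gb"]
    elif rule == "longest_idle":  # idle duration from long to short
        key = lambda c: -c["idle_seconds"]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return min(idle_containers, key=key)

# Mirrors the example above: containers 1 and 2 are idle, container 2 idle longer.
idle = [
    {"seq": 1, "gpu_gb": 4.0, "idle_seconds": 10},
    {"seq": 2, "gpu_gb": 2.0, "idle_seconds": 60},
]
```

With this data, the `seq` rule picks container 1, while the `smallest_gpu` and `longest_idle` rules both pick container 2, matching the worked example.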
Whether an image processing container is idle may be detected in real time or according to a preset period; the embodiments of the disclosure are not limited in this respect.
In one embodiment, the allocated GPU virtual resources may be ordered from small to large. Ordering from small to large allows each GPU virtual resource to be fully utilized and avoids the waste of a large GPU virtual resource being consumed by a task that needs little.
At least one image processing task is stored in the message queue cluster in advance. The message queue cluster may be implemented with Kafka, Redis, RabbitMQ, or the like. Using a message queue cluster to store image processing tasks ensures high concurrency of image processing and improves image processing efficiency.
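A minimal in-memory stand-in for the message queue cluster can illustrate the interface (a real deployment would use Kafka, Redis, or RabbitMQ; the task fields are illustrative):

```python
from collections import deque

# In-memory stand-in for the message queue cluster.
task_queue = deque()

def enqueue(task_id, required_gpu_gb):
    """Store an image processing task tagged with the vGPU it requires,
    so containers can later check the tag before taking the task."""
    task_queue.append({"task": task_id, "gpu_gb": required_gpu_gb})

enqueue("task-1", 3.9)
enqueue("task-2", 1.8)
```

Tasks carry their required-vGPU tag into the queue, which is what lets an idle container decide whether a queued task matches its own allocation.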
After the target image processing container is determined, it judges whether an image processing task is the target image processing task according to the GPU virtual resources required by the image processing tasks in the message queue cluster. If the GPU virtual resources required by an image processing task match those corresponding to the target image processing container, the target image processing container acquires that task from the message queue cluster.
In the disclosed embodiments, GPU virtual resources are allocated to the image processing containers, and after the target image processing container is determined, the target image processing container itself selects the target image processing task. In this way, even though each container's GPU virtual resources are fixed in size, tasks are still allocated adaptively according to the GPU virtual resources they need to consume.
After the target image processing container acquires the target image processing task, it processes the task.
In this method for allocating GPU virtual resources, a plurality of image processing containers are set in a cloud server cluster, each allocated GPU virtual resources; the containers are detected, and an idle one is determined as the target image processing container, which then acquires a target image processing task from the message queue cluster and processes it. Thus the GPU virtual resources the target image processing container can provide match those the target image processing task consumes; that is, GPU virtual resources can be allocated to the target image processing task rationally.
In one embodiment, as shown in fig. 3, before acquiring the target image processing task from the message queue cluster by using the target image processing container, the method may further include:
in step 205, the cloud server cluster receives an image processing request sent by the client.
When a client needs image processing, it sends an image processing request to the cloud server cluster, and the cloud server cluster receives it. The image processing request includes at least one image processing task.
For example, if a client needs to perform blood vessel extraction processing on two medical images, the client sends an image processing request to the cloud server cluster, the image processing request includes two image processing tasks for performing blood vessel extraction processing, and one image processing task corresponds to one medical image.
In step 206, GPU virtual resources required by the image processing task are determined.
The video memory to be consumed can be estimated accurately from the number of images the task must process, the size of each image, the processing algorithm required, and so on, which determines the GPU virtual resources the image processing task requires. The memory and CPU resources required by the task can be determined at the same time.
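The patent does not give the estimation formula. As a hedged sketch, a simple linear model over image count, image size, and an algorithm-specific working-set factor might look like this (the model and the factor values are assumptions, not the patent's method):

```python
def estimate_gpu_gb(num_images, image_mb, algorithm_factor):
    """Rough vGPU estimate: data volume times an algorithm-specific multiplier.
    The linear model and factors are illustrative assumptions only."""
    return num_images * image_mb * algorithm_factor / 1024.0  # MB -> GB

# e.g. vessel extraction on two 512 MB volumes with an assumed 2x working-set factor
need = estimate_gpu_gb(num_images=2, image_mb=512, algorithm_factor=2.0)
```

A real estimator would be calibrated per algorithm, for instance from the statistical analysis of historical tasks mentioned earlier.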
After the GPU virtual resources required by the image processing tasks are determined, the cloud server cluster stores each image processing task into the message queue cluster and identifies the GPU virtual resources each task requires, so that the target image processing container can determine, from that identifier, whether an image processing task is the target image processing task.
In one embodiment, after step 207, the method may further include:
and step 208, sequencing the image processing tasks according to the priority of the image processing tasks in the message queue cluster.
The image processing request also carries the priority of each image processing task.
After storing the image processing tasks in the message queue cluster, the cloud server cluster sorts them by priority, adjusting the processing order so that high-priority image processing tasks are matched and processed first.
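The priority-ordering step can be sketched with a heap; the convention that a lower number means higher priority, and the tie-breaking by arrival order, are assumptions for the example:

```python
import heapq

def order_by_priority(tasks):
    """Reorder queued tasks so higher-priority tasks are matched first.
    Lower 'priority' number = more urgent (assumed convention)."""
    # The index i preserves FIFO order among tasks of equal priority
    # and keeps heapq from ever comparing the dicts themselves.
    heap = [(t["priority"], i, t) for i, t in enumerate(tasks)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

tasks = [
    {"task": "a", "priority": 2},
    {"task": "b", "priority": 1},
    {"task": "c", "priority": 2},
]
ordered = order_by_priority(tasks)
```

Task "b" jumps ahead of "a" and "c", while "a" and "c" keep their original relative order.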
In this method for allocating GPU virtual resources, the cloud server cluster receives an image processing request sent by a client, determines the GPU virtual resources required by each image processing task, stores the tasks into the message queue cluster with an identifier of the GPU virtual resources each requires, and sorts the tasks in the message queue cluster by priority. Storing image processing tasks in the message queue cluster ensures high concurrency without task loss, improving image processing efficiency while safeguarding the image data.
In an embodiment, acquiring the target image processing task from the message queue cluster by using the target image processing container may include: the target image processing container determining, according to the GPU virtual resources required by the image processing tasks in the message queue cluster, whether an image processing task is the target image processing task. The image processing tasks include a first image processing task and a second image processing task; as shown in fig. 4, the process includes the following steps:
The target image processing container first acquires the identifier of the GPU virtual resources corresponding to the first image processing task and determines from it the GPU virtual resources the first image processing task requires. The target image processing container then judges whether those required GPU virtual resources match its own. Specifically, if the GPU virtual resources required by the first image processing task are less than or equal to the GPU virtual resources corresponding to the target image processing container, they are determined to match; if the GPU virtual resources required by the first image processing task are greater than the GPU virtual resources corresponding to the target image processing container, they are determined not to match.
For example, an image processing task 1 and an image processing task 2 are stored in the message queue cluster, the GPU virtual resource required by the image processing task 1 is 3.9G, and the GPU virtual resource required by the image processing task 2 is 1.8G; and the GPU virtual resource corresponding to the target image processing container is 4G. And taking the image processing task 1 as a first image processing task, and determining the image processing task 1 as a target image processing task if the GPU virtual resource required by the image processing task 1 is matched with the GPU virtual resource corresponding to the target image processing container.
For example, an image processing task 1 and an image processing task 2 are stored in the message queue cluster, the GPU virtual resource required by the image processing task 1 is 4.1G, and the GPU virtual resource required by the image processing task 2 is 3.2G; and the GPU virtual resource corresponding to the target image processing container is 4G. The image processing task 1 is a first image processing task, the image processing task 2 is a second image processing task, the GPU virtual resource required by the image processing task 1 is not matched with the GPU virtual resource corresponding to the target image processing container, the target image processing container determines the GPU virtual resource required by the image processing task 2 according to the identification of the GPU virtual resource, and judges whether the GPU virtual resource required by the second image processing task is matched with the GPU virtual resource corresponding to the target image processing container.
For example, the target image processing container determines that the GPU virtual resource required by image processing task 2 is 3.2G; since the GPU virtual resource corresponding to the target image processing container is 4G, the two match, and image processing task 2 is determined to be the target image processing task.
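The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the less-than-or-equal comparison; the function and parameter names (`matches`, `required_gb`, `container_gb`) are hypothetical and do not appear in the patent.

```python
def matches(required_gb: float, container_gb: float) -> bool:
    """A task matches a container when the GPU virtual resource it
    requires does not exceed the container's allocated GPU virtual
    resource."""
    return required_gb <= container_gb

# Numbers from the examples above: a 4G container is targeted.
print(matches(3.9, 4.0))  # task requiring 3.9G matches -> True
print(matches(4.1, 4.0))  # task requiring 4.1G does not match -> False
```

A task that needs exactly the container's allocation (4G against 4G) also matches, since the rule is "less than or equal".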
In one embodiment, if the GPU virtual resource required by the second image processing task does not match the GPU virtual resource corresponding to the target image processing container, the GPU virtual resource required by a third image processing task after the second image processing task is acquired, and whether it matches the GPU virtual resource corresponding to the target image processing container is judged. This process loops until a match succeeds. If no image processing task in the message queue cluster matches, the target image processing container is replaced, and step 301 is performed again. For example, the target image processing container to which GPU virtual resources are allocated may be replaced following the order of image processing containers arranged from the smallest to the largest GPU virtual resource; meanwhile, the image processing container before replacement is moved to the end of the container queue.
For each image processing task, if the GPU virtual resource required by the image processing task does not match the GPU virtual resource corresponding to the target image processing container, the image processing task is given a matching-failure mark. The matching-failure mark avoids repeated matching of the same task, thereby saving computing resources.
If the GPU virtual resource required by an image processing task does not match the GPU virtual resource corresponding to any image processing container, the image processing task is abandoned and feedback information is sent to the client.
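The scan-until-match behaviour with matching-failure marks can be sketched as follows. This is a simplified, in-process model: the message queue cluster is stood in for by a `deque` of dicts, and the `"failed"` flag and field names are assumptions for illustration, not structures defined by the patent.

```python
from collections import deque

def acquire_target_task(queue: deque, container_gb: float):
    """Scan the queued tasks in order and return the first one whose
    required GPU virtual resource fits the container.  Tasks that do
    not fit receive a matching-failure mark so they are skipped on
    later scans instead of being matched again."""
    for task in list(queue):          # iterate over a copy; we mutate queue
        if task.get("failed"):        # already marked: skip re-matching
            continue
        if task["required_gb"] <= container_gb:
            queue.remove(task)        # task leaves the queue for processing
            return task
        task["failed"] = True         # matching-failure mark
    return None                       # nothing fits this container

tasks = deque([{"required_gb": 4.1}, {"required_gb": 3.2}])
target = acquire_target_task(tasks, 4.0)
print(target["required_gb"])          # 3.2 -- the 4.1G task was skipped
```

When `acquire_target_task` returns `None`, the caller would replace the target container (or, once every container has been tried, abandon the task and notify the client), mirroring the fallback described above.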
In one embodiment, as shown in fig. 5, the method may further include:
And if the GPU virtual resource required by the first image processing task is not matched with the GPU virtual resource corresponding to the target image processing container, returning the first image processing task to the original position in the message queue cluster.
For example, if the GPU virtual resource required by the image processing task 1 does not match the GPU virtual resource corresponding to the target image processing container, the image processing task 1 is returned to the message queue cluster, and the image processing task 1 is not processed.
In the step of acquiring the target image processing task from the message queue cluster with the target image processing container, the target image processing container judges, according to the GPU virtual resource required by each image processing task in the message queue cluster, whether that task is the target image processing task. By this embodiment of the disclosure, GPU virtual resources can be reasonably allocated to the target image processing task, so that the computing power matches the GPU resources consumed, improving both the utilization rate of the GPU virtual resources and the image processing efficiency.
In an embodiment, a processing exception may occur in an image processing container while it processes an image processing task, causing the image processing container to crash. As shown in fig. 6, the processing steps related to the crash of an image processing container may include the following:
Mirror images of the image processing containers are prestored in the cloud server cluster. If the first image processing container crashes while processing the target image processing task, the mirror image of the first image processing container can be acquired from the cloud server cluster, and the first image processing container is reconstructed from that mirror image.
In step 402, the reconstructed first image processing container reprocesses the image processing task being processed.
When the first image processing container crashes, the image processing task it was processing is left unfinished. Therefore, after the message queue cluster detects that the first image processing container has crashed, it determines which image processing task was being processed at the time of the crash and re-queues that task, so that the task is eventually processed to completion. After the image processing task has been processed, the message queue cluster clears it.
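The re-queue-on-crash behaviour can be sketched with a minimal in-process model. The class and method names (`MessageQueueCluster`, `dispatch`, `ack`, `on_container_crash`) are illustrative assumptions; the key idea from the text is that a dispatched task is only cleared once the container acknowledges completion, so a crash mid-processing puts it back at the head of the queue.

```python
from collections import deque

class MessageQueueCluster:
    """Minimal sketch: a task is tracked as in-flight until the
    container finishes it, so a container crash re-queues the task
    instead of losing it."""
    def __init__(self):
        self.pending = deque()
        self.in_flight = {}                    # container id -> task being processed

    def dispatch(self, container_id: str, task: dict):
        self.in_flight[container_id] = task    # remember who holds which task

    def ack(self, container_id: str):
        self.in_flight.pop(container_id, None)  # completion clears the task

    def on_container_crash(self, container_id: str):
        task = self.in_flight.pop(container_id, None)
        if task is not None:
            self.pending.appendleft(task)      # re-queue for the rebuilt container

q = MessageQueueCluster()
q.dispatch("container-1", {"name": "resize-batch"})
q.on_container_crash("container-1")
print(q.pending[0]["name"])                    # resize-batch -- survives the crash
```

If the container instead completes the task and calls `ack`, nothing is re-queued, which matches the "clear after processing" step above.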
In one embodiment, a mirror image of the message queue cluster is preset; if the message queue cluster crashes, it is reconstructed according to this mirror image. It can be understood that the message queue cluster therefore does not lose image processing tasks due to a crash, which improves the safety of image processing.
In one embodiment, when the first image processing container crashes, the image processing tasks in the message queue cluster may also be transferred to a new target image processing container, such as a second image processing container, for processing; the second image processing container repeats the above process to process the image processing tasks in the message queue cluster.
In the above embodiment, if it is detected that a first image processing container of the plurality of image processing containers has crashed, the first image processing container is reconstructed according to its prestored mirror image. Processing of the target image processing task continues in the reconstructed first image processing container or in a new target image processing container (e.g., the second image processing container), and the image processing tasks in the message queue cluster (including the target image processing task being processed) are not lost due to the container crash. By this embodiment of the disclosure, image processing tasks are not lost, improving the safety of image processing.
It should be understood that although the various steps in the flowcharts of fig. 2-6 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2-6 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 7, there is provided an apparatus for allocating GPU virtual resources, including:
a container setting module 501, configured to set a plurality of image processing containers in a cloud server cluster, where each image processing container is allocated with a GPU virtual resource;
a container detection module 502, configured to detect multiple image processing containers, and determine an image processing container in an idle state as a target image processing container;
a task obtaining module 503, configured to obtain a target image processing task from the message queue cluster by using the target image processing container;
a task processing module 504 for processing the target image processing task using the target image processing container.
In one embodiment, the apparatus further comprises:
the request receiving module is used for receiving an image processing request sent by a client; the image processing request comprises at least one image processing task;
the resource determining module is used for determining GPU virtual resources required by the image processing task;
and the storage module is used for storing the image processing task into the message queue cluster and identifying GPU virtual resources required by the image processing task.
In one embodiment, the task obtaining module 503 is specifically configured to determine, by the target image processing container, whether the image processing task is the target image processing task according to the GPU virtual resource required by the image processing task in the message queue cluster.
In one embodiment, the image processing task includes a first image processing task, and the apparatus further includes:
the first task determination module is used for determining the first image processing task as the target image processing task if the GPU virtual resource required by the first image processing task is matched with the GPU virtual resource corresponding to the target image processing container;
the resource acquisition module is used for acquiring GPU virtual resources required by a second image processing task after the first image processing task by using the target image processing container if the GPU virtual resources required by the first image processing task are not matched with the GPU virtual resources corresponding to the target image processing container;
and the second task determination module is used for determining the second image processing task as the target image processing task if the GPU virtual resource required by the second image processing task is matched with the GPU virtual resource corresponding to the target image processing container.
In one embodiment, the apparatus further comprises:
and the sequencing module is used for sequencing the image processing tasks in the message queue cluster according to the priority of the image processing tasks.
In one embodiment, the apparatus further comprises:
the container construction module is used for reconstructing a first image processing container according to a prestored mirror image of the first image processing container if the first image processing container in the plurality of image processing containers is detected to be crashed;
and the re-processing module is used for re-processing the image processing task being processed by the reconstructed first image processing container.
In one embodiment, the plurality of image processing containers at least include a first image processing container and a second image processing container, and GPU virtual resources allocated to the first image processing container and the second image processing container are the same or different.
For specific limitations of the apparatus for allocating GPU virtual resources, reference may be made to the above limitations on the method for allocating GPU virtual resources, which are not repeated here. The modules in the apparatus for allocating GPU virtual resources may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory in the computer device, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the allocation data of the GPU virtual resources. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method for allocating virtual resources of a GPU.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
the method comprises the steps that a plurality of image processing containers are arranged in a cloud server cluster, and GPU virtual resources are distributed to the image processing containers;
detecting a plurality of image processing containers, and determining the image processing containers in an idle state as target image processing containers;
acquiring a target image processing task from the message queue cluster by using a target image processing container;
the target image processing task is processed using the target image processing container.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
receiving an image processing request sent by a client; the image processing request comprises at least one image processing task;
determining GPU virtual resources required by an image processing task;
and storing the image processing task into the message queue cluster, and identifying GPU virtual resources required by the image processing task.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and the target image processing container judges whether the image processing task is the target image processing task according to the GPU virtual resource required by the image processing task in the message queue cluster.
In one embodiment, the image processing task includes a first image processing task, and the processor executes the computer program to further implement the following steps:
if the GPU virtual resource required by the first image processing task is matched with the GPU virtual resource corresponding to the target image processing container, determining the first image processing task as the target image processing task;
if the GPU virtual resource required by the first image processing task is not matched with the GPU virtual resource corresponding to the target image processing container, the target image processing container is utilized to obtain the GPU virtual resource required by the second image processing task after the first image processing task;
and if the GPU virtual resources required by the second image processing task are matched with the GPU virtual resources corresponding to the target image processing container, determining the second image processing task as the target image processing task.
In one embodiment, the image processing request further carries a priority of the image processing task, and the processor executes the computer program to further implement the following steps:
and in the message queue cluster, sequencing the image processing tasks according to the priority of the image processing tasks.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
if the first image processing container in the plurality of image processing containers is detected to be crashed, reconstructing the first image processing container according to a pre-stored mirror image of the first image processing container;
the reconstructed first image processing container reprocesses the image processing task being processed.
In one embodiment, the plurality of image processing containers at least include a first image processing container and a second image processing container, and GPU virtual resources allocated to the first image processing container and the second image processing container are the same or different.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
the method comprises the steps that a plurality of image processing containers are arranged in a cloud server cluster, and GPU virtual resources are distributed to the image processing containers;
detecting a plurality of image processing containers, and determining the image processing containers in an idle state as target image processing containers;
acquiring a target image processing task from the message queue cluster by using a target image processing container;
the target image processing task is processed using the target image processing container.
In one embodiment, the computer program when executed by the processor further performs the steps of:
receiving an image processing request sent by a client; the image processing request comprises at least one image processing task;
determining GPU virtual resources required by an image processing task;
and storing the image processing task into the message queue cluster, and identifying GPU virtual resources required by the image processing task.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and the target image processing container judges whether the image processing task is the target image processing task according to the GPU virtual resource required by the image processing task in the message queue cluster.
In an embodiment, the image processing task comprises a first image processing task, and the computer program when executed by the processor further performs the steps of:
if the GPU virtual resource required by the first image processing task is matched with the GPU virtual resource corresponding to the target image processing container, determining the first image processing task as the target image processing task;
if the GPU virtual resource required by the first image processing task is not matched with the GPU virtual resource corresponding to the target image processing container, the target image processing container is utilized to obtain the GPU virtual resource required by the second image processing task after the first image processing task;
and if the GPU virtual resources required by the second image processing task are matched with the GPU virtual resources corresponding to the target image processing container, determining the second image processing task as the target image processing task.
In an embodiment, the image processing request further carries a priority of the image processing task, and the computer program, when executed by the processor, further implements the following steps:
and in the message queue cluster, sequencing the image processing tasks according to the priority of the image processing tasks.
In one embodiment, the computer program when executed by the processor further performs the steps of:
if the first image processing container in the plurality of image processing containers is detected to be crashed, reconstructing the first image processing container according to a pre-stored mirror image of the first image processing container;
the reconstructed first image processing container reprocesses the image processing task being processed.
In one embodiment, the plurality of image processing containers at least include a first image processing container and a second image processing container, and GPU virtual resources allocated to the first image processing container and the second image processing container are the same or different.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A method for allocating GPU virtual resources, which is characterized by comprising the following steps:
the method comprises the steps that a plurality of image processing containers are arranged in a cloud server cluster, and GPU virtual resources are distributed to the image processing containers;
detecting the image processing containers, and determining the image processing containers in an idle state as target image processing containers;
acquiring a target image processing task from a message queue cluster by using the target image processing container;
processing the target image processing task using the target image processing container.
2. The method of claim 1, wherein prior to said fetching a target image processing task from a message queue cluster with the target image processing container, the method further comprises:
receiving an image processing request sent by a client; the image processing request comprises at least one image processing task;
determining GPU virtual resources required by the image processing task;
and storing the image processing task into the message queue cluster, and identifying GPU virtual resources required by the image processing task.
3. The method of claim 2, wherein said fetching a target image processing task from a message queue cluster using the target image processing container comprises:
and the target image processing container judges whether the image processing task is a target image processing task according to the GPU virtual resource required by the image processing task in the message queue cluster.
4. The method of claim 3, wherein the image processing task comprises a first image processing task, the method further comprising:
if the GPU virtual resource required by the first image processing task is matched with the GPU virtual resource corresponding to the target image processing container, determining the first image processing task as the target image processing task;
if the GPU virtual resource required by the first image processing task is not matched with the GPU virtual resource corresponding to the target image processing container, acquiring the GPU virtual resource required by a second image processing task after the first image processing task by using the target image processing container;
and if the GPU virtual resource required by the second image processing task is matched with the GPU virtual resource corresponding to the target image processing container, determining the second image processing task as the target image processing task.
5. The method of any of claims 2-4, wherein the image processing request further carries a priority of the image processing task, and after the storing the image processing task in the message queue cluster, the method further comprises:
and in the message queue cluster, sequencing the image processing tasks according to the priority of the image processing tasks.
6. The method of claim 1, further comprising:
if detecting that a first image processing container in the plurality of image processing containers crashes, reconstructing the first image processing container according to a pre-stored mirror image of the first image processing container;
the reconstructed first image processing container reprocesses the image processing task being processed.
7. The method of claim 1, wherein the plurality of image processing containers comprises at least a first image processing container and a second image processing container, and wherein the first image processing container and the second image processing container are allocated GPU virtual resources which are the same or different.
8. An apparatus for allocating GPU virtual resources, the apparatus comprising:
the system comprises a container setting module, a cloud server cluster and a GPU virtual resource allocation module, wherein the container setting module is used for setting a plurality of image processing containers in the cloud server cluster, and each image processing container is allocated with a GPU virtual resource;
the container detection module is used for detecting the image processing containers and determining the image processing containers in an idle state as target image processing containers;
the task acquisition module is used for acquiring a target image processing task from the message queue cluster by using the target image processing container;
and the task processing module is used for processing the target image processing task by using the target image processing container.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011006550.6A CN112162856A (en) | 2020-09-23 | 2020-09-23 | GPU virtual resource allocation method and device, computer equipment and storage medium |
US17/448,546 US20220091894A1 (en) | 2020-09-23 | 2021-09-23 | System and method for data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011006550.6A CN112162856A (en) | 2020-09-23 | 2020-09-23 | GPU virtual resource allocation method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112162856A true CN112162856A (en) | 2021-01-01 |
Family
ID=73863448
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011006550.6A Pending CN112162856A (en) | 2020-09-23 | 2020-09-23 | GPU virtual resource allocation method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112162856A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114281545A (en) * | 2021-12-20 | 2022-04-05 | 中南大学 | Resource allocation method based on time delay relation between CPU (Central processing Unit) resources and image model |
CN114979411A (en) * | 2021-05-06 | 2022-08-30 | 中移互联网有限公司 | Distributed image processing method, device, equipment and system |
CN116361038A (en) * | 2023-06-02 | 2023-06-30 | 山东浪潮科学研究院有限公司 | Acceleration computing management method, system, equipment and storage medium |
CN116756444A (en) * | 2023-06-14 | 2023-09-15 | 北京百度网讯科技有限公司 | Image processing method, device, equipment and storage medium |
CN116757915A (en) * | 2023-08-16 | 2023-09-15 | 北京蓝耘科技股份有限公司 | Cluster GPU resource scheduling method |
CN116820783A (en) * | 2023-08-29 | 2023-09-29 | 中航金网(北京)电子商务有限公司 | Image processing method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273211A (en) * | 2017-06-19 | 2017-10-20 | 成都鼎智汇科技有限公司 | Data processing method based on virtual machine under a kind of cloud computing environment |
US20180293700A1 (en) * | 2015-05-29 | 2018-10-11 | Intel Corporation | Container access to graphics processing unit resources |
CN110764901A (en) * | 2019-09-17 | 2020-02-07 | 阿里巴巴集团控股有限公司 | Data processing method based on GPU (graphics processing Unit) resources, electronic equipment and system |
CN111190712A (en) * | 2019-12-25 | 2020-05-22 | 北京推想科技有限公司 | Task scheduling method, device, equipment and medium |
US20200210242A1 (en) * | 2018-12-26 | 2020-07-02 | Lablup Inc. | Method and system for gpu virtualization based on container |
-
2020
- 2020-09-23 CN CN202011006550.6A patent/CN112162856A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180293700A1 (en) * | 2015-05-29 | 2018-10-11 | Intel Corporation | Container access to graphics processing unit resources |
CN107273211A (en) * | 2017-06-19 | 2017-10-20 | 成都鼎智汇科技有限公司 | Data processing method based on virtual machine under a kind of cloud computing environment |
US20200210242A1 (en) * | 2018-12-26 | 2020-07-02 | Lablup Inc. | Method and system for gpu virtualization based on container |
CN110764901A (en) * | 2019-09-17 | 2020-02-07 | 阿里巴巴集团控股有限公司 | Data processing method based on GPU (graphics processing Unit) resources, electronic equipment and system |
CN111190712A (en) * | 2019-12-25 | 2020-05-22 | 北京推想科技有限公司 | Task scheduling method, device, equipment and medium |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114979411A (en) * | 2021-05-06 | 2022-08-30 | 中移互联网有限公司 | Distributed image processing method, device, equipment and system |
CN114979411B (en) * | 2021-05-06 | 2023-07-04 | 中移互联网有限公司 | Distributed image processing method, device, equipment and system |
CN114281545A (en) * | 2021-12-20 | 2022-04-05 | 中南大学 | Resource allocation method based on time delay relation between CPU (Central processing Unit) resources and image model |
CN116361038A (en) * | 2023-06-02 | 2023-06-30 | 山东浪潮科学研究院有限公司 | Acceleration computing management method, system, equipment and storage medium |
CN116361038B (en) * | 2023-06-02 | 2023-09-08 | 山东浪潮科学研究院有限公司 | Acceleration computing management method, system, equipment and storage medium |
CN116756444A (en) * | 2023-06-14 | 2023-09-15 | 北京百度网讯科技有限公司 | Image processing method, device, equipment and storage medium |
CN116757915A (en) * | 2023-08-16 | 2023-09-15 | 北京蓝耘科技股份有限公司 | Cluster GPU resource scheduling method |
CN116757915B (en) * | 2023-08-16 | 2023-11-28 | 北京蓝耘科技股份有限公司 | Cluster GPU resource scheduling method |
CN116820783A (en) * | 2023-08-29 | 2023-09-29 | AVIC Jinwang (Beijing) E-Commerce Co., Ltd. | Image processing method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112162856A (en) | GPU virtual resource allocation method and device, computer equipment and storage medium | |
US11188392B2 (en) | Scheduling system for computational work on heterogeneous hardware | |
CN105373431B (en) | Computer system resource management method and computer resource management system | |
CN112035238A (en) | Task scheduling processing method and device, cluster system and readable storage medium | |
CN112256417B (en) | Data request processing method and device and computer readable storage medium | |
CN111290838B (en) | Application access request processing method and device based on container cluster | |
US11526377B2 (en) | Method for executing task by scheduling device, and computer device and storage medium | |
CN106713388B (en) | Burst service processing method and device | |
CN110362418A (en) | A kind of abnormal data restoration methods, device, server and storage medium | |
CN111782383A (en) | Task allocation method, server, electronic terminal and computer readable storage medium | |
CN114816709A (en) | Task scheduling method, device, server and readable storage medium | |
CN110659131A (en) | Task processing method, electronic device, computer device, and storage medium | |
CN113626173A (en) | Scheduling method, device and storage medium | |
CN112463361A (en) | Method and equipment for distributing elastic resources of distributed computation | |
CN111709723A (en) | RPA business process intelligent processing method, device, computer equipment and storage medium | |
CN111359205A (en) | Operation method and device of cloud game, computer equipment and storage medium | |
CN108833532B (en) | Service processing method, device and system based on Internet of things | |
CN111506388A (en) | Container performance detection method, container management platform and computer storage medium | |
CN112631577B (en) | Model scheduling method, model scheduler and model safety test platform | |
CN114924888A (en) | Resource allocation method, data processing method, device, equipment and storage medium | |
CN112486502A (en) | Distributed task deployment method and device, computer equipment and storage medium | |
CN114564281A (en) | Container scheduling method, device, equipment and storage medium | |
CN113742059A (en) | Task allocation method and device, computer equipment and storage medium | |
CN112100017A (en) | Memory resource monitoring method and device | |
CN116860464B (en) | Load resource allocation method and device, storage medium and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||