CN115599533A - Task processing method, device, equipment and storage medium

Info

Publication number: CN115599533A
Application number: CN202110768245.9A
Authority: CN (China)
Prior art keywords: gpu, task, resource, performance, determining
Legal status: Pending (assumed; not a legal conclusion)
Other languages: Chinese (zh)
Inventor: 查冲
Current Assignee: Tencent Technology Shenzhen Co Ltd
Original Assignee: Tencent Technology Shenzhen Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00: General purpose image data processing
    • G06T 1/20: Processor architectures; Processor configuration, e.g. pipelining


Abstract

The embodiments of the present application provide a task processing method, apparatus, device, and storage medium, applicable to fields such as artificial intelligence, computer technology, and cloud technology. The method comprises the following steps: determining a graphics processing unit (GPU) resource set, wherein the GPU resource set comprises a plurality of GPU resources; determining the data processing performance and the scenario test performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource; putting each GPU resource into a corresponding logical resource pool based on the task processing performance of each GPU resource; and acquiring a task to be processed, determining the expected processing time of the task to be processed, determining a target GPU resource from the logical resource pools based on the expected processing time, and processing the task to be processed based on the target GPU resource. With the embodiments of the present application, GPU resource utilization and task processing efficiency can be improved, and the applicability is high.

Description

Task processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for task processing.
Background
With the continuous development of computer technology, the volume of computing tasks is growing rapidly and the demand for computing resources is increasing correspondingly; as a key resource for application computing, graphics processing unit (GPU) resources are becoming the main choice of users.
At present, users often select GPU resources through a GPU resource platform for task processing. However, owing to differences in card model, memory, network configuration, and the like, the task processing performance of different GPU resources is uneven. When selecting GPU resources, users generally give priority to those with high task processing performance, so GPU resources are unevenly allocated, causing problems such as waste of GPU resources with weaker task processing performance and untimely task processing.
Disclosure of Invention
The embodiments of the present application provide a task processing method, apparatus, device, and storage medium, which can improve GPU resource utilization and task processing efficiency, with high applicability.
In one aspect, an embodiment of the present application provides a task processing method, where the method includes:
determining a graphics processing unit (GPU) resource set, wherein the GPU resource set comprises a plurality of GPU resources;
determining the data processing performance and the scenario test performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource;
based on the task processing performance of each GPU resource, putting each GPU resource into a corresponding logical resource pool;
and acquiring a task to be processed, determining the expected processing time of the task to be processed, determining a target GPU resource from the logical resource pools based on the expected processing time, and processing the task to be processed based on the target GPU resource.
In another aspect, an embodiment of the present application provides a task processing device, where the task processing device includes:
a resource determining module, configured to determine a GPU resource set, where the GPU resource set includes a plurality of GPU resources;
a performance determining module, configured to determine the data processing performance and scenario test performance of each GPU resource, and determine the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource;
a resource processing module, configured to put each GPU resource into a corresponding logical resource pool based on the task processing performance of each GPU resource;
and a task processing module, configured to acquire a task to be processed, determine the expected processing time of the task to be processed, determine a target GPU resource from the logical resource pools based on the expected processing time, and process the task to be processed based on the target GPU resource.
In another aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the processor and the memory are connected to each other;
the memory is used for storing computer programs;
the processor is configured to execute the task processing method provided by the embodiment of the application when the computer program is called.
In another aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the task processing method provided by the embodiment of the present application.
In another aspect, the present application provides a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, so that the electronic device executes the task processing method provided by the embodiments of the present application.
In the embodiments of the present application, the task processing performance of each GPU resource can be obtained by determining its data processing performance and scenario test performance, and each GPU resource can then be placed into the corresponding logical resource pool according to its task processing performance. Further, after a to-be-processed task is obtained, the corresponding target GPU resource can be determined based on the expected processing time of the to-be-processed task, avoiding situations where the to-be-processed task is not processed in time or some GPU resources are used intensively; this improves GPU resource utilization and task processing efficiency, with high applicability.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a scenario of a task processing method provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a task processing method provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of another scenario of a task processing method provided in an embodiment of the present application;
Fig. 4 is another schematic flowchart of a task processing method provided in an embodiment of the present application;
Fig. 5 is yet another schematic flowchart of a task processing method provided in an embodiment of the present application;
Fig. 6 is a schematic flowchart of task processing based on the number of GPUs provided in an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a task processing device provided in an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a scenario of a task processing method provided in an embodiment of the present application. As shown in Fig. 1, after the GPU resource set is determined, the GPU resources in the set can be placed into different logical resource pools. Specifically, the data processing performance and the scenario test performance of each GPU resource may be determined, and the task processing performance of each GPU resource may be determined based on them. If the GPU resource 400 is one GPU resource in the GPU resource set, then for the GPU resource 400, the data processing performance 501 and the scenario test performance 502 can be determined, and the task processing performance 600 of the GPU resource 400 can then be determined based on the data processing performance 501 and the scenario test performance 502.
Further, the logical resource pool corresponding to each GPU resource is determined based on the task processing performance of each GPU resource. Again taking the GPU resource 400 as an example, the corresponding logical resource pool may be determined based on the task processing performance 600 of the GPU resource 400; as shown in Fig. 1, the logical resource pool corresponding to the GPU resource 400 is the logical resource pool 301, and the GPU resource 400 is then placed in the logical resource pool 301.
When a to-be-processed task is acquired, that is, when a to-be-processed task needs to be processed, the target GPU resource corresponding to the to-be-processed task can be determined from the logical resource pools, and the to-be-processed task is processed based on the target GPU resource. As shown in Fig. 1, after the to-be-processed task 100 is obtained, the target GPU resource 200 corresponding to the to-be-processed task 100 may be determined from the logical resource pool 302, and the to-be-processed task 100 is processed based on the target GPU resource 200.
The GPU resource in the embodiments of the present application may be a graphics processor, or a chip or device having the same function as a graphics processor, such as a video card or a GPU card, which may be determined based on the requirements of the actual application scenario and is not limited herein. A graphics processor, also called a display core, visual processor, or display chip, is a microprocessor dedicated to image operations on personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and mobile phones).
The task processing method provided by the embodiments of the present application can be applied to various fields such as artificial intelligence, computer technology, and cloud technology, for example, human-computer interaction, cloud computing, and artificial intelligence cloud services, aiming to improve task (or data) processing efficiency while making reasonable use of GPU resources. The to-be-processed task may be a neural-network-based training task, or a related data processing task such as neural-network-based image processing, speech processing, or natural language processing, which may be determined based on the actual application scenario and is not limited herein.
Artificial intelligence is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. The to-be-processed task in the embodiments of the present application may be a related data processing task such as neural-network-based image processing, speech processing, or natural language processing (NLP).
Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language people use daily, so it has a close relation with the research of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question answering, and the like.
Cloud technology is a hosting technology that unifies a series of resources such as hardware, software, and networks in a wide area network or a local area network to realize the computation, storage, processing, and sharing of data. The determination of the data processing performance, scenario test performance, task processing performance, and the like of each GPU resource in the task processing method of the present application can be realized based on cloud computing in cloud technology.
Cloud computing refers to obtaining required resources on demand in an easily extensible manner through a network; it is a product of the development and fusion of traditional computing and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a task processing method provided in an embodiment of the present application. As shown in Fig. 2, the task processing method provided in the embodiment of the present application may include the following steps:
S21, determining a GPU resource set, wherein the GPU resource set comprises a plurality of GPU resources.
In some possible embodiments, the GPU resource set includes multiple GPU resources, such as GPU resources of different card types, or GPU resources with different memory and network configurations, which may be determined based on the requirements of the actual application scenario and is not limited herein.
The GPU resource may be a graphics processor, or a chip or device having the same function as a graphics processor, such as a video card or a GPU card, which may be determined based on the requirements of the actual application scenario and is not limited herein.
As an example, the GPU resource set comprises GPU cards of different models.
S22, determining the data processing performance and the scenario test performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource.
In some possible embodiments, the data processing performance of each GPU resource may be determined through a variety of neural network models. Specifically, for each GPU resource, the model operation performance of the GPU resource corresponding to each neural network model may be determined; that is, the GPU resource runs the different neural network models to obtain a performance value for each, and the performance value obtained when running each neural network model is determined as the model operation performance of the GPU resource corresponding to that neural network model.
The neural network models are basic models constructed based on neural networks for processing a task or realizing a function, and include, but are not limited to, models constructed based on one or more of a Recurrent Neural Network (RNN), a Long Short-Term Memory (LSTM) network, a Gated Recurrent Unit (GRU), and a Transformer structure, such as ResNet-50 and the Bidirectional Encoder Representations from Transformers (BERT) model, which may be determined based on the requirements of the actual application scenario and are not limited herein.
The model operation performance corresponding to each GPU resource is used to represent the basic performance of the GPU resource when running the corresponding neural network model.
Further, for each GPU resource, after the model operation performance of the GPU resource corresponding to each neural network model is determined, the data processing performance of the GPU resource may be determined based on these model operation performances.
For each GPU resource, the data processing performance is used to represent the comprehensive capability of the GPU resource to run neural network models: the higher the data processing performance, the better the GPU resource runs most neural network models; the lower the data processing performance, the worse the GPU resource runs most neural network models.
For each GPU resource, after the model operation performance of the GPU resource corresponding to each neural network model is determined, the average of these model operation performances may be determined as the data processing performance of the GPU resource. As an example, if the model operation performances of the GPU resource corresponding to the neural network models are M1, M2, ..., Mk, respectively, then the data processing performance of the GPU resource is M = (M1 + M2 + ... + Mk)/k, where k is the number of neural network models.
Optionally, for each GPU resource, after the model operation performance of the GPU resource corresponding to each neural network model is determined, the sum of these model operation performances may be determined as the data processing performance of the GPU resource. As an example, if the model operation performances of the GPU resource corresponding to the neural network models are M1, M2, ..., Mk, respectively, then the data processing performance of the GPU resource is M = M1 + M2 + ... + Mk, where k is the number of neural network models.
Optionally, for each GPU resource, after the model operation performance of the GPU resource corresponding to each neural network model is determined, the model weight of each neural network model may be determined, and the data processing performance of the GPU resource may then be determined based on the model operation performances and the model weights. As an example, if the model operation performances of the GPU resource corresponding to the neural network models are M1, M2, ..., Mk, and the model weights of the neural network models are P1, P2, ..., Pk, respectively, then the data processing performance of the GPU resource is M = M1 × P1 + M2 × P2 + ... + Mk × Pk, where k is the number of neural network models.
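As an illustrative sketch of the above calculations (the function and variable names are assumptions for illustration and are not part of the disclosure), the averaged and weighted determinations of the data processing performance may be expressed as follows:

```python
# Illustrative sketch only: computes the data processing performance M of one
# GPU resource from its per-model operation performances M1..Mk, optionally
# weighted by model weights P1..Pk. All names are assumptions.
from typing import Optional, Sequence

def data_processing_performance(model_perfs: Sequence[float],
                                model_weights: Optional[Sequence[float]] = None) -> float:
    if model_weights is None:
        # M = (M1 + M2 + ... + Mk) / k
        return sum(model_perfs) / len(model_perfs)
    # M = M1 x P1 + M2 x P2 + ... + Mk x Pk
    return sum(m * p for m, p in zip(model_perfs, model_weights))

# Example: three benchmark models with weights 0.5, 0.3, 0.2
M = data_processing_performance([120.0, 95.0, 110.0], [0.5, 0.3, 0.2])  # 110.5
```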
In some possible embodiments, the scenario test performance of each GPU resource may be determined through a variety of scenario test cases. Specifically, for each GPU resource, the data throughput of the GPU resource corresponding to each scenario test case may be determined first; that is, the data throughput of the GPU resource in different scenarios is determined through the different scenario test cases.
Each scenario test case corresponds to an application scenario, which includes, but is not limited to, image processing, natural language processing, speech processing, and the like, and may be determined based on the requirements of the actual application scenario, which is not limited herein.
The scenario test cases may be test tools, software, models, and the like corresponding to different application scenarios, which may be determined based on the requirements of the actual application scenario and are not limited herein.
As an example, the above scenario test cases are neural network models constructed based on neural networks and corresponding to different application scenarios, including, but not limited to, neural network models constructed based on one or more of a Recurrent Neural Network (RNN), a Long Short-Term Memory (LSTM) network, a Gated Recurrent Unit (GRU), and a Transformer structure, such as a neural network model used in a speech recognition scenario or a neural network model used in an image processing scenario.
Further, for each GPU resource, after determining the data throughput of the GPU resource corresponding to each scenario test case, the scenario test performance of the GPU resource may be determined based on the data throughput of the GPU resource corresponding to each scenario test case.
For each GPU resource, the scenario test performance is used to represent the comprehensive capability of the GPU resource to run scenario test cases: the higher the scenario test performance, the more application scenarios the GPU resource is suitable for; the lower the scenario test performance, the fewer application scenarios the GPU resource is suitable for.
For each GPU resource, after the data throughput of the GPU resource corresponding to each scenario test case is determined, the average of these data throughputs may be determined as the scenario test performance of the GPU resource. As an example, if the data throughputs of the GPU resource corresponding to the scenario test cases are N1, N2, ..., Nr, respectively, then the scenario test performance of the GPU resource is N = (N1 + N2 + ... + Nr)/r, where r is the number of scenario test cases.
Optionally, for each GPU resource, after the data throughput of the GPU resource corresponding to each scenario test case is determined, the sum of these data throughputs may be determined as the scenario test performance of the GPU resource. As an example, if the data throughputs of the GPU resource corresponding to the scenario test cases are N1, N2, ..., Nr, respectively, then the scenario test performance of the GPU resource is N = N1 + N2 + ... + Nr, where r is the number of scenario test cases.
Optionally, for each GPU resource, after the data throughput of the GPU resource corresponding to each scenario test case is determined, the scenario weight of each scenario test case may be determined, and the scenario test performance of the GPU resource may then be determined based on the data throughputs and the scenario weights. As an example, if the data throughputs of the GPU resource corresponding to the scenario test cases are N1, N2, ..., Nr, and the scenario weights of the scenario test cases are Q1, Q2, ..., Qr, respectively, then the scenario test performance of the GPU resource is N = N1 × Q1 + N2 × Q2 + ... + Nr × Qr, where r is the number of scenario test cases.
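A minimal sketch of how the data throughput of one scenario test case might be measured, and how the weighted scenario test performance follows from it, is given below; the timing approach and all names are assumptions, not part of the disclosure:

```python
# Illustrative sketch only: times a fixed batch of work to obtain the data
# throughput of a GPU resource on one scenario test case, then combines the
# per-scenario throughputs N1..Nr with scenario weights Q1..Qr.
import time
from typing import Callable, Sequence

def measure_throughput(run_case: Callable[[int], None], num_items: int) -> float:
    """Run num_items items through the scenario test case; return items/second."""
    start = time.perf_counter()
    run_case(num_items)
    elapsed = time.perf_counter() - start
    return num_items / elapsed

def scenario_test_performance(throughputs: Sequence[float],
                              scenario_weights: Sequence[float]) -> float:
    # N = N1 x Q1 + N2 x Q2 + ... + Nr x Qr
    return sum(n * q for n, q in zip(throughputs, scenario_weights))

N = scenario_test_performance([950.0, 1200.0], [0.6, 0.4])  # 1050.0
```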
In some possible embodiments, for each GPU resource, the task processing performance is used to represent the computing power of the GPU resource: the higher the task processing performance, the stronger the computing power of the GPU resource and the higher the efficiency when processing tasks based on the GPU resource; the lower the task processing performance, the weaker the computing power and the lower the efficiency.
Specifically, when determining the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource, for each GPU resource, the sum of the data processing performance and the scenario test performance of the GPU resource may be determined as the task processing performance of the GPU resource.
As an example, for a GPU resource whose data processing performance is M and whose scenario test performance is N, the task processing performance of the GPU resource is F = M + N.
Optionally, for each GPU resource, a weight of data processing performance (hereinafter referred to as a first weight for convenience of description) and a weight of scenario test performance (hereinafter referred to as a second weight for convenience of description) may also be determined, and further, based on the data processing performance and the scenario test performance of the GPU resource, and the first weight of data processing performance and the second weight of scenario test performance, the task processing performance of the GPU resource may be determined.
The first weights corresponding to the data processing performances of different GPU resources are the same, and the second weights corresponding to the scenario test performances of different GPU resources are the same. The specific values of the first weight and the second weight may be determined based on the requirements of the actual application scenario and are not limited herein.
As an example, for each GPU resource, if the data processing performance of the GPU resource is M, the first weight corresponding to the data processing performance is A, the scenario test performance of the GPU resource is N, and the second weight corresponding to the scenario test performance is B, then the task processing performance of the GPU resource is F = A × M + B × N.
Optionally, after the data processing performance and the scenario test performance of each GPU resource are determined, a reference GPU resource may also be determined, and the task processing performance of each GPU resource may then be determined based on the reference data processing performance and the reference scenario test performance of the reference GPU resource.
The reference GPU resource may be the GPU resource of the type present in the largest quantity in the GPU resource set, or a preset GPU resource, which may be determined based on the actual situation and is not limited herein.
The reference data processing performance and the reference scenario test performance of the reference GPU resource may be a preset data processing performance and a preset scenario test performance, or may be determined in the same way as the data processing performance and the scenario test performance of each GPU resource, which is not limited herein.
Further, for each GPU resource, a ratio of the data processing performance of the GPU resource to the reference data processing performance (hereinafter referred to as a first ratio for convenience of description) and a ratio of the scenario test performance of the GPU resource to the reference scenario test performance (hereinafter referred to as a second ratio for convenience of description) may be determined. The task processing performance of the GPU resource is then determined based on the data processing performance and the scenario test performance of the GPU resource, the first ratio and the second ratio corresponding to the GPU resource, and the first weight and the second weight.
For each GPU resource, the first ratio of the data processing performance of the GPU resource to the reference data processing performance and the second ratio of the scenario test performance of the GPU resource to the reference scenario test performance can be determined through a preset model.
As an example, suppose the first weight of the data processing performance is A, the second weight of the scenario test performance is B, the reference data processing performance of the reference GPU resource is G, and the reference scenario test performance of the reference GPU resource is H. For any GPU resource, if the first ratio of the data processing performance of the GPU resource to the reference data processing performance is X, and the second ratio of the scenario test performance of the GPU resource to the reference scenario test performance is Y, then the task processing performance of the GPU resource is F = A × G × X + B × H × Y.
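Both combinations may be sketched as follows; the names are illustrative assumptions, and the normalized form follows the reading of the formula above (numerically it reduces to F = A × M + B × N, so the ratios act as a normalized restatement):

```python
# Illustrative sketch only: combines data processing performance M and
# scenario test performance N into task processing performance F, optionally
# using the reference performances G and H of a reference GPU resource.
from typing import Optional

def task_processing_performance(M: float, N: float, A: float, B: float,
                                G: Optional[float] = None,
                                H: Optional[float] = None) -> float:
    if G is None or H is None:
        return A * M + B * N        # F = A x M + B x N
    X, Y = M / G, N / H             # first ratio X and second ratio Y
    return A * G * X + B * H * Y    # assumed reading of F = A x G x X + B x H x Y
```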
S23, putting each GPU resource into a corresponding logical resource pool based on the task processing performance of each GPU resource.
In some possible embodiments, when the number of logical resource pools is not limited, one logical resource pool may correspond to one task processing performance value; that is, one logical resource pool includes one or more GPU resources with the same task processing performance. The logical resource pools may further be arranged in order of task processing performance from high to low or from low to high.
Optionally, when the number of logical resource pools is limited, the GPU resources may be arranged in order of task processing performance from high to low and divided into different GPU resource subsets according to the distribution density of task processing performance, and the GPU resources in each subset are determined as the GPU resources of one logical resource pool. The logical resource pools may likewise be arranged in order of task processing performance from high to low or from low to high.
Optionally, a performance interval of task processing performance corresponding to each logical resource pool may also be determined. For each logical resource pool, the GPU resources whose task processing performance falls within the performance interval of the logical resource pool may be determined as the GPU resources of that pool. The logical resource pools may also be arranged in order of task processing performance from high to low or from low to high, and the performance interval corresponding to each logical resource pool may be determined based on actual needs, which is not limited herein. Based on this implementation, the GPU resources can be grouped into logical resource pools according to task processing performance ranges; for example, three logical resource pools including high-performance, medium-performance, and low-performance GPU resources can be obtained, as in the sketch below.
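A minimal sketch of such interval-based pooling follows; the interval bounds and names are illustrative assumptions rather than part of the disclosure:

```python
# Illustrative sketch only: groups GPU resources into three logical resource
# pools (high / medium / low) by task processing performance interval.
from collections import defaultdict
from typing import Dict, List

POOL_BOUNDS = [("high", 200.0), ("medium", 100.0), ("low", 0.0)]  # assumed lower bounds

def build_pools(gpu_perfs: Dict[str, float]) -> Dict[str, List[str]]:
    """gpu_perfs maps a GPU resource id to its task processing performance F."""
    pools: Dict[str, List[str]] = defaultdict(list)
    for gpu_id, f in gpu_perfs.items():
        for name, lower in POOL_BOUNDS:  # first interval whose lower bound F reaches
            if f >= lower:
                pools[name].append(gpu_id)
                break
    return pools

pools = build_pools({"gpu-0": 250.0, "gpu-1": 150.0, "gpu-2": 80.0})
# {'high': ['gpu-0'], 'medium': ['gpu-1'], 'low': ['gpu-2']}
```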
S24, acquiring a task to be processed, determining the expected processing time of the task to be processed, determining a target GPU resource from the logical resource pools based on the expected processing time, and processing the task to be processed based on the target GPU resource.
In some possible embodiments, the to-be-processed task may be a neural-network-based training task, such as a task in which a user trains an initial neural network model on a training sample set. As an example, the to-be-processed task may be a training task for training an initial translation model based on a training text set.
Optionally, the to-be-processed task in the embodiments of the present application may also be a related data processing task such as neural-network-based image processing, speech processing, or natural language processing, for example, a processing task for rendering a game screen based on a neural network model, which may be determined based on the actual application scenario and is not limited herein.
In some possible embodiments, after the to-be-processed task is obtained, the expected processing time of the to-be-processed task, i.e., the time expected to be required from the start of processing to its completion, may be determined.
Specifically, a processed task that is the same as or similar to the to-be-processed task may be determined, and the actual processing time of the processed task may be acquired, so that the expected processing time of the to-be-processed task is determined based on the actual processing time of the processed task.
As an example, the actual processing time of a processed task that is the same as or similar to the to-be-processed task may be determined as the expected processing time of the to-be-processed task.
Processed tasks that are the same as or similar to the to-be-processed task include, but are not limited to, processed tasks whose task type is the same as or similar to that of the to-be-processed task, processed tasks whose model structure is the same as or similar to that of the to-be-processed task, and processed tasks whose data volume is the same as or similar to that of the to-be-processed task; one or more of these criteria may apply, as determined based on the actual application scenario and not limited herein.
As an example, a processed task having the same model structure, data volume, and task type as the to-be-processed task may be determined as a processed task that is the same as the to-be-processed task.
Optionally, when determining the expected processing time of the to-be-processed task, the expected processing times of the to-be-processed task corresponding to different GPU resources, that is, the times expected to be required to process the to-be-processed task based on the different GPU resources, may also be determined. In other words, in this case, the expected processing time of the to-be-processed task includes the expected processing times corresponding to the different GPU resources.
The expected processing time of the to-be-processed task corresponding to each GPU resource can be determined based on the actual processing times of the processed tasks handled by that GPU resource.
In some possible embodiments, a to-be-processed task often includes multiple stage tasks in practical applications; therefore, if the expected processing time of the to-be-processed task is determined directly, the accuracy of the expected processing time may be low due to the complexity of the multiple stage tasks. Based on this, when determining the expected processing time of the to-be-processed task, the stage expected processing time corresponding to each stage task of the to-be-processed task can be determined, and the sum of the stage expected processing times is then determined as the expected processing time of the to-be-processed task.
The stage expected processing time of each stage task of the to-be-processed task may be determined in any of the ways described above, which is not limited herein.
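A minimal sketch of this history-based estimation, assuming processed tasks are keyed by task type, model structure, and data volume (the keying scheme and names are assumptions, not part of the disclosure):

```python
# Illustrative sketch only: looks up the recorded actual processing time of a
# processed task with the same signature for each stage task, and sums the
# stage expected processing times.
from typing import Dict, Sequence, Tuple

Signature = Tuple[str, str, int]  # (task type, model structure, data volume)

def expected_time(stages: Sequence[Signature],
                  history: Dict[Signature, float]) -> float:
    """Assumes a matching record exists in history for every stage task."""
    return sum(history[sig] for sig in stages)

history = {("train", "resnet50", 10_000): 3600.0}
t = expected_time([("train", "resnet50", 10_000)], history)  # 3600.0 seconds
```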
In some possible embodiments, the logical resource pools may be logical resource pools in a GPU container, or logical resource pools in other containers, task processing systems, task processing platforms, and the like, which may be determined based on the requirements of the actual application scenario and is not limited herein. Each logical resource pool includes at least one GPU resource. The logical resource pools in the task processing method provided by the embodiments of the present application may be run by a server, which may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing cloud computing services.
As an example, the logical resource pool in the embodiment of the present application may include GPU resources for processing tasks corresponding to the cloud game, and the logical resource pool may be a GPU container run by the cloud server.
Cloud gaming, also called gaming on demand, is an online gaming technology based on cloud computing technology. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games. In a cloud gaming scenario, the game does not run on the player's game terminal but in a cloud server; the cloud server renders the game scene into video and audio streams, which are transmitted to the player's game terminal over the network. The player's game terminal does not need strong graphics and data processing capabilities; it only needs basic streaming media playback capability and the capability to acquire the player's input instructions and send them to the cloud server.
Specifically, the expected processing time of the to-be-processed task may include the expected processing times of the to-be-processed task corresponding to different GPU resources (hereinafter referred to as first expected processing times for convenience of description); that is, a first expected processing time of the to-be-processed task is the expected completion time for processing the to-be-processed task based on the corresponding GPU resource. On this basis, the average of the first expected processing times corresponding to the different GPU resources can be determined, and a logical resource pool including GPU resources whose first expected processing time is less than the average processing time is determined as the target logical resource pool. Then, any GPU resource in the target logical resource pool may be determined as the target GPU resource, or any idle GPU resource in the target logical resource pool may be determined as the target GPU resource, or the idle GPU resource with the highest task processing performance in the target logical resource pool may be determined as the target GPU resource.
Optionally, the number of idle GPU resources in each logical resource pool may be determined, and the logical resource pools in which the number of idle GPU resources is greater than a first threshold are determined as target logical resource pools. From these target logical resource pools, the one whose GPU resources have the smallest average first expected processing time may then be determined, and any GPU resource in that target logical resource pool may be determined as the target GPU resource, or any idle GPU resource in it may be determined as the target GPU resource, or the idle GPU resource with the highest task processing performance in it may be determined as the target GPU resource. In this way, the problem that to-be-processed tasks queue up because the GPU resources with higher task processing performance are not idle can be avoided, and reasonable allocation of GPU resources is achieved.
Optionally, the first expected processing time of the to-be-processed task corresponding to each GPU resource in each logical resource pool may also be determined. For each logical resource pool, if the average of the first expected processing times corresponding to the GPU resources in the pool is less than a second threshold, the pool may be determined as the target logical resource pool. Any GPU resource in the target logical resource pool may then be determined as the target GPU resource, or any idle GPU resource in it may be determined as the target GPU resource, or the idle GPU resource with the highest task processing performance in it may be determined as the target GPU resource. In this way, a target logical resource pool whose GPU resources process the to-be-processed task quickly can be determined, and any idle GPU resource in that pool can be determined as the target GPU resource.
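One of the selection strategies described above (an idle-count threshold followed by picking the idle GPU resource with the highest task processing performance) may be sketched as follows; the data shapes and iteration order are assumptions:

```python
# Illustrative sketch only: scans the logical resource pools for one whose idle
# GPU count exceeds a threshold, then returns the idle GPU resource with the
# highest task processing performance from that pool.
from typing import Dict, List, Optional, Tuple

def select_target_gpu(pools: Dict[str, List[Tuple[str, float, bool]]],
                      idle_threshold: int) -> Optional[str]:
    """Each pool maps to (gpu_id, task_processing_performance, is_idle) entries."""
    for _name, gpus in sorted(pools.items()):  # pool visiting order is an assumption
        idle = [(gid, perf) for gid, perf, is_idle in gpus if is_idle]
        if len(idle) > idle_threshold:
            return max(idle, key=lambda g: g[1])[0]  # highest-performance idle GPU
    return None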
Further, after the target GPU resource is determined, the pending task may be processed based on the target GPU resource.
The task processing method provided in the embodiments of the present application is further described below with reference to Fig. 3. Referring to Fig. 3, Fig. 3 is a schematic diagram of another scenario of a task processing method provided in an embodiment of the present application. After the to-be-processed task is obtained, the expected processing time of the to-be-processed task can be determined, and the target logical resource pool is then determined from the multiple logical resource pools based on the expected processing time. A target GPU resource is further determined from the target logical resource pool, so that the to-be-processed task is processed based on the target GPU resource.
In some possible implementations, if the to-be-processed task includes multiple stage tasks, the stage expected processing time of each stage task of the to-be-processed task may be determined. For each stage task, the target GPU resource corresponding to the stage task is determined from the logical resource pools based on the stage expected processing time corresponding to the stage task, and the stage task is processed based on its corresponding target GPU resource. In this way, the multiple stage tasks of the to-be-processed task can be processed based on multiple target GPU resources, improving task processing efficiency.
For each stage task, a target logical resource pool matching the stage expected processing time is determined based on the stage expected processing time of the stage task; that is, the target logical resource pool is determined from the logical resource pools based on a positive correlation between the stage expected processing time and the task processing performance interval of each logical resource pool.
Alternatively, for each stage task, the stage expected processing time of the stage task may include the expected processing times of the stage task corresponding to different GPU resources, and the target logical resource pool is then determined from the logical resource pools based on these expected processing times; the specific implementation is not described herein again.
Further, for each stage task, any GPU resource in the target logical resource pool corresponding to the stage task may be determined as the target GPU resource corresponding to the stage task, or any idle GPU resource in that pool may be determined as the target GPU resource corresponding to the stage task, or the idle GPU resource with the highest task processing performance in that pool may be determined as the target GPU resource corresponding to the stage task.
Further, after the target GPU resources corresponding to each phase task are determined, the corresponding phase task may be processed based on the target GPU resources corresponding to each phase task.
In the embodiments of the present application, the task processing performance of each GPU resource can be obtained by determining its data processing performance and scenario test performance, and each GPU resource can then be placed into the corresponding logical resource pool according to its task processing performance. Further, after the to-be-processed task is obtained, the corresponding target GPU resource can be determined based on the expected processing time of the to-be-processed task, avoiding situations where the to-be-processed task is not processed in time or some GPU resources are used intensively; this improves GPU resource utilization and task processing efficiency, with high applicability.
Further, after the to-be-processed task is processed based on the target GPU resource, the scenario test cases used to evaluate the GPU resources may also be adjusted; a specific adjustment manner may be as shown in Fig. 4. Fig. 4 is another schematic flowchart of a task processing method provided in an embodiment of the present application. As shown in Fig. 4, the task processing method provided in the embodiment of the present application may include the following steps:
S41, determining the actual data throughput of the task to be processed.
In some possible embodiments, the actual data throughput of the to-be-processed task is the data throughput of the target GPU resource when processing the to-be-processed task.
S42, adjusting the scenario test case corresponding to the task to be processed based on the actual data throughput.
In some possible embodiments, after the actual data throughput of the to-be-processed task is obtained, the scenario test case corresponding to the same scenario as the to-be-processed task may be adjusted based on the actual data throughput, for example, by updating related parameters of the scenario test case, so that when the scenario test performance is determined for a new to-be-processed task, the data throughput determined based on the scenario test case of the same scenario is more accurate.
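The disclosure only states that related parameters of the scenario test case are updated; as one hedged possibility, the observed throughput could be folded in as an exponential moving average, sketched below with assumed names:

```python
# Illustrative sketch only: folds the actual data throughput observed while
# processing a task back into the matching scenario test case. The moving-
# average update rule is an assumption, not stated in the disclosure.
def adjust_scenario_case(case_params: dict, actual_throughput: float,
                         alpha: float = 0.2) -> None:
    prev = case_params.get("expected_throughput", actual_throughput)
    case_params["expected_throughput"] = (1 - alpha) * prev + alpha * actual_throughput

case = {"scenario": "speech_recognition", "expected_throughput": 900.0}
adjust_scenario_case(case, 1100.0)  # expected_throughput becomes 940.0
```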
In the embodiments of the present application, the task processing performance of each GPU resource can be obtained by determining its data processing performance and scenario test performance, and each GPU resource can then be placed into the corresponding logical resource pool according to its task processing performance. Further, after the to-be-processed task is obtained, the corresponding target GPU resource can be determined based on the expected processing time of the to-be-processed task, avoiding situations where the to-be-processed task is not processed in time or some GPU resources are used intensively; this improves GPU resource utilization and task processing efficiency, with high applicability. Meanwhile, the scenario test cases are adjusted based on the actual data throughput obtained each time a to-be-processed task is processed, so that the scenario test performance determined based on the adjusted scenario test cases is more accurate, and the applicability is higher.
Further, after the to-be-processed task is processed based on the target GPU resource, other tasks may also be processed based on the actual processing time of the to-be-processed task; a specific implementation may be as shown in Fig. 5. Fig. 5 is yet another schematic flowchart of a task processing method provided in an embodiment of the present application. As shown in Fig. 5, the task processing method provided in the embodiment of the present application may include the following steps:
S51, determining the actual processing time of the task to be processed.
In some possible embodiments, the actual processing time of the to-be-processed task is the time consumed by the target GPU resource in processing the to-be-processed task. Optionally, the actual processing time of the to-be-processed task may include the actual processing time of each stage task of the to-be-processed task.
S52, storing the actual processing time, so that when another task that is the same as the task to be processed is processed, the expected processing time of the other task is determined based on the actual processing time.
In some possible embodiments, after the actual processing time of the to-be-processed task is obtained, it may be stored, so that when another task that is the same as the to-be-processed task is obtained and processed, the expected processing time of the other task can be determined based on the actual processing time of the to-be-processed task. For example, for a task that is the same as or similar to the to-be-processed task, the actual processing time of the to-be-processed task on the target GPU resource may be determined as the expected processing time of that task on the target GPU resource.
Alternatively, if the actual processing time of the to-be-processed task includes the actual processing time of each stage task, the actual processing time of each stage task may be stored, so that when another task that is the same as any stage task is processed, the actual processing time of that stage task can be determined as the expected processing time of the other task. Alternatively, when processing another task, the stage expected processing time of each stage task of the other task may be determined based on the stored actual processing times of the stage tasks.
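A minimal sketch of this storage step, reusing the history keying assumed in the estimation sketch above (all names are illustrative assumptions):

```python
# Illustrative sketch only: records per-stage actual processing times of a
# finished task so that later identical tasks (or identical stage tasks) can
# reuse them as expected processing times. Keying is an assumption.
from typing import Dict, Sequence, Tuple

Signature = Tuple[str, str, int]  # (task type, model structure, data volume)

def record_actual_times(history: Dict[Signature, float],
                        stages: Sequence[Tuple[Signature, float]]) -> None:
    for sig, actual_seconds in stages:
        history[sig] = actual_seconds  # keep the latest observation

history: Dict[Signature, float] = {}
record_actual_times(history, [(("train", "bert", 50_000), 7200.0)])
```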
In some possible embodiments, in any of the task processing methods of Figs. 2, 4, and 5, after the to-be-processed task is acquired, if the to-be-processed task needs to be processed by multiple GPU resources, it may be determined whether a sufficient number of GPU resources exist in the logical resource pools to process the to-be-processed task; if so, the to-be-processed task may be scheduled for processing, and if not, processing of the to-be-processed task is suspended until a sufficient number of GPU resources are available in the logical resource pools.
Referring to Fig. 6, Fig. 6 is a schematic flowchart of task processing based on the number of GPUs provided in an embodiment of the present application. As shown in Fig. 6, after the to-be-processed task is acquired, it may be determined whether the number of GPU resources in the logical resource pools meets the task requirement, that is, whether there are enough GPU resources to process the to-be-processed task. If the number of GPU resources meets the task requirement, the expected processing time of the to-be-processed task is determined, the target GPU resource corresponding to the to-be-processed task is determined from the logical resource pools based on the expected processing time, and the to-be-processed task is processed based on the target GPU resource. If the number of GPU resources does not meet the task requirement, processing of the to-be-processed task is suspended.
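The capacity check in the Fig. 6 flow may be sketched as follows, with assumed data shapes:

```python
# Illustrative sketch only: checks whether the logical resource pools together
# hold enough idle GPU resources to process a task before scheduling it.
from typing import Dict, List, Tuple

def can_schedule(pools: Dict[str, List[Tuple[str, bool]]], required: int) -> bool:
    """Each pool maps to (gpu_id, is_idle) entries; compare the idle total to the need."""
    idle_total = sum(1 for gpus in pools.values() for _, is_idle in gpus if is_idle)
    return idle_total >= required
```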
In the embodiments of the present application, after the actual processing time and the actual data throughput of the to-be-processed task are obtained, they may be stored in a database, a data warehouse, or a blockchain, or stored through functions such as cluster applications, grid technology, and distributed storage file systems based on big data and cloud storage technology.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated with one another using cryptographic methods, each data block being used to store data. A blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.
In the embodiments of the present application, the task processing performance of each GPU resource can be obtained by determining its data processing performance and scenario test performance, and each GPU resource can then be placed into the corresponding logical resource pool according to its task processing performance. Further, after the to-be-processed task is obtained, the corresponding target GPU resource can be determined based on the expected processing time of the to-be-processed task, avoiding situations where the to-be-processed task is not processed in time or some GPU resources are used intensively; this improves GPU resource utilization and task processing efficiency, with high applicability. Meanwhile, the actual processing time obtained each time a to-be-processed task is processed is stored, so that the expected processing time of a new to-be-processed task can be better determined when it is processed, further improving task processing efficiency.
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of a task processing device provided in an embodiment of the present application. The task processing device provided by the embodiments of the present application comprises:
a resource determining module 71, configured to determine a GPU resource set, where the GPU resource set includes a plurality of GPU resources;
a performance determining module 72, configured to determine the data processing performance and scenario test performance of each GPU resource, and determine the task processing performance of each GPU resource based on the data processing performance and the scenario test performance of each GPU resource;
a resource processing module 73, configured to put each GPU resource into a corresponding logical resource pool based on task processing performance of each GPU resource;
and a task processing module 74, configured to obtain a task to be processed, determine a predicted processing time of the task to be processed, determine a target GPU resource from each of the logical resource pools based on the predicted processing time, and process the task to be processed based on the target GPU resource.
In some possible embodiments, the performance determining module 72 is configured to:
and for each GPU resource, determining the model operation performance of the GPU resource corresponding to each neural network model, and determining the data processing performance of the GPU resource based on the model operation performance corresponding to the GPU resource.
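As a minimal sketch of this step, under the assumption (the text leaves the aggregation rule open) that the data processing performance is the average of the per-model operation measurements:

```python
# Assumption for illustration: a GPU's data processing performance is the
# average of its measured operation performance across benchmark neural
# network models; the aggregation rule is not fixed by the text above.
def data_processing_performance(model_scores):
    """model_scores: mapping of model name -> measured operation performance."""
    return sum(model_scores.values()) / len(model_scores)

scores = {"resnet50": 940.0, "bert-base": 310.0}  # hypothetical measurements
print(data_processing_performance(scores))        # 625.0
```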
In some possible embodiments, the performance determining module 72 is configured to:
and for each GPU resource, determining the data throughput of each scene test case corresponding to the GPU resource, and determining the scene test performance of the GPU resource based on each data throughput corresponding to the GPU resource.
In some possible embodiments, the performance determining module 72 is configured to:
determining a first weight of the data processing performance and a second weight of the scene test performance;
and determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource, the first weight and the second weight.
In some possible embodiments, the performance determining module 72 is configured to:
determining a reference GPU resource, and determining the reference data processing performance and the reference scene test performance of the reference GPU resource;
determining a first ratio of the data processing performance of each GPU resource to the reference data processing performance and a second ratio of the scene test performance of each GPU resource to the reference scene test performance;
and for each GPU resource, determining the task processing performance of the GPU resource based on the data processing performance and the scene test performance of the GPU resource, the first proportion and the second proportion corresponding to the GPU resource, and the first weight and the second weight.
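Read together, these steps admit a simple weighted-sum formulation; the linear combination below is an assumption consistent with, but not mandated by, the text:

```python
# Sketch: task processing performance as a weighted sum of the GPU's
# performance ratios against a reference GPU.
def task_processing_performance(data_perf, scene_perf,
                                ref_data_perf, ref_scene_perf,
                                first_weight, second_weight):
    first_ratio = data_perf / ref_data_perf      # vs. reference data performance
    second_ratio = scene_perf / ref_scene_perf   # vs. reference scene performance
    return first_weight * first_ratio + second_weight * second_ratio

# Example: 20% faster than the reference on data processing, 10% slower
# on the scene tests, with weights 0.6 and 0.4:
print(task_processing_performance(1.2, 0.9, 1.0, 1.0, 0.6, 0.4))  # 1.08
```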
In some possible embodiments, the predicted processing time includes a stage predicted processing time of each stage task of the to-be-processed task;
the task process 74 is configured to:
and for each stage task, determining a target GPU resource corresponding to the stage task from each logic resource pool based on the stage predicted processing time corresponding to the stage task, and processing the stage task based on the target GPU resource corresponding to the stage task.
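A minimal sketch of this per-stage selection; the function name and the injected callables are assumptions for illustration:

```python
# Hypothetical sketch: each stage task independently selects its target GPU
# based on that stage's predicted processing time.
def process_in_stages(stages, pools, select_gpu, run_stage):
    """stages: list of (stage_task, stage_predicted_time) pairs."""
    results = []
    for stage_task, stage_predicted_time in stages:
        gpu = select_gpu(pools, stage_predicted_time)  # per-stage target GPU
        results.append(run_stage(stage_task, gpu))
    return results
```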
In some possible embodiments, the predicted processing time includes a first predicted processing time of the task to be processed corresponding to each GPU resource;

the task processing module 74 is configured to:
determining an average processing time of the first predicted processing times, and determining a logic resource pool comprising a first GPU resource as a target resource pool, wherein the first predicted processing time of the first GPU resource corresponding to the task to be processed is less than the average processing time;
and determining any GPU resource in the target resource pool as a target GPU resource.
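A sketch of this below-average selection rule, with hypothetical names and data:

```python
# Sketch: find a pool containing a GPU whose first predicted processing time
# for this task is below the average, then take any GPU from that pool.
def select_target_gpu(first_predicted_times, pools):
    """first_predicted_times: gpu_id -> first predicted processing time.
    pools: lists of gpu_ids, one list per logic resource pool."""
    avg = sum(first_predicted_times.values()) / len(first_predicted_times)
    fast = {g for g, t in first_predicted_times.items() if t < avg}
    for pool in pools:
        if fast & set(pool):   # pool holds a "first GPU resource": target pool
            return pool[0]     # any GPU in the target pool qualifies
    return None

times = {"gpu0": 2.0, "gpu1": 1.0, "gpu2": 3.0}  # hypothetical predictions
print(select_target_gpu(times, [["gpu0", "gpu1"], ["gpu2"]]))  # gpu0
```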
In some possible embodiments, the performance determining module 72 is further configured to:
determining the actual data throughput of the task to be processed;
and adjusting the scene test case corresponding to the task to be processed based on the actual data throughput.
In some possible embodiments, the performance determining module 72 is further configured to:
determining the actual processing time of the task to be processed;
and storing the actual processing time so as to determine the predicted processing time of other tasks based on the actual processing time when the other tasks identical to the task to be processed are processed.
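A minimal sketch of this feedback loop, assuming an in-memory history keyed by task kind; the storage backend and keying are assumptions (the description above also contemplates databases, data warehouses, and blockchains):

```python
# Sketch: record actual processing times and reuse their mean as the
# predicted processing time for later tasks of the same kind.
from collections import defaultdict

history = defaultdict(list)  # task_kind -> observed actual processing times

def record_actual_time(task_kind, actual_time):
    history[task_kind].append(actual_time)

def predict_time(task_kind, default=1.0):
    times = history[task_kind]
    return sum(times) / len(times) if times else default

record_actual_time("video-transcode", 4.2)
record_actual_time("video-transcode", 3.8)
print(predict_time("video-transcode"))  # 4.0
```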
In some possible embodiments, each of the above logical resource pools is a logical resource pool in a GPU container.
In a specific implementation, the apparatus may execute, through its built-in functional modules, the implementations provided in the steps of fig. 2, fig. 4, and/or fig. 5; reference may be made to the implementations provided in those steps, which are not described herein again.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. As shown in fig. 8, the electronic device 800 in this embodiment may include a processor 801, a network interface 804, and a memory 805, and may further include a user interface 803 and at least one communication bus 802. The communication bus 802 is used to realize connection and communication among these components. The user interface 803 may include a display and a keyboard; optionally, the user interface 803 may further include standard wired and wireless interfaces. The network interface 804 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 805 may be a high-speed RAM memory or a non-volatile memory, e.g., at least one disk memory; optionally, it may also be at least one storage device located remotely from the processor 801. As shown in fig. 8, the memory 805, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the electronic device 800 shown in fig. 8, the network interface 804 may provide network communication functions, the user interface 803 mainly serves as an interface for a user to provide input, and the processor 801 may be used to invoke the device control application stored in the memory 805 to implement:
determining a GPU resource set of a graphics processor, wherein the GPU resource set comprises a plurality of GPU resources;
determining the data processing performance and the scene test performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource;
based on the task processing performance of each GPU resource, putting each GPU resource into a corresponding logic resource pool;
and acquiring a task to be processed, determining the predicted processing time of the task to be processed, determining a target GPU resource from each logic resource pool based on the predicted processing time, and processing the task to be processed based on the target GPU resource.
In some possible embodiments, the processor 801 is configured to:
and for each GPU resource, determining the model operation performance of the GPU resource corresponding to each neural network model, and determining the data processing performance of the GPU resource based on the model operation performance corresponding to the GPU resource.
In some possible embodiments, the processor 801 is configured to:
and for each GPU resource, determining the data throughput of each scene test case corresponding to the GPU resource, and determining the scene test performance of the GPU resource based on each data throughput corresponding to the GPU resource.
In some possible embodiments, the processor 801 is configured to:
determining a first weight of the data processing performance and a second weight of the scene test performance;
and determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource, the first weight and the second weight.
In some possible implementations, the processor 801 is configured to:
determining a reference GPU resource, and determining the reference data processing performance and the reference scene test performance of the reference GPU resource;
determining a first ratio of the data processing performance of each GPU resource to the reference data processing performance and a second ratio of the scene test performance of each GPU resource to the reference scene test performance;
and for each GPU resource, determining the task processing performance of the GPU resource based on the data processing performance and the scene test performance of the GPU resource, the first proportion and the second proportion corresponding to the GPU resource, and the first weight and the second weight.
In some possible embodiments, the predicted processing time includes a stage predicted processing time of each stage task of the to-be-processed task;
the processor 801 is configured to:
and for each stage task, determining a target GPU resource corresponding to the stage task from each logic resource pool based on the stage predicted processing time corresponding to the stage task, and processing the stage task based on the target GPU resource corresponding to the stage task.
In some possible embodiments, the predicted processing time includes a first predicted processing time of the task to be processed corresponding to each GPU resource;

the processor 801 is configured to:
determining an average processing time of the first predicted processing times, and determining a logic resource pool comprising a first GPU resource as a target resource pool, wherein the first predicted processing time of the first GPU resource corresponding to the task to be processed is less than the average processing time;
and determining any GPU resource in the target resource pool as a target GPU resource.
In some possible implementations, the processor 801 is configured to:
determining the actual data throughput of the task to be processed;
and adjusting the scene test case corresponding to the task to be processed based on the actual data throughput.
In some possible implementations, the processor 801 is configured to:
determining the actual processing time of the task to be processed;
and storing the actual processing time so as to determine the predicted processing time of other tasks based on the actual processing time when the other tasks identical to the task to be processed are processed.
In some possible embodiments, each of the above logical resource pools is a logical resource pool in a GPU container.
It should be appreciated that, in some possible implementations, the processor 801 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or any conventional processor. The memory may include a read-only memory and a random access memory, and provides instructions and data to the processor. A portion of the memory may also include a non-volatile random access memory; for example, the memory may also store device type information.
In a specific implementation, the electronic device 800 may execute, through its built-in functional modules, the implementations provided in the steps of fig. 2, fig. 4, and/or fig. 5; reference may be made to the implementations provided in those steps, which are not described herein again.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method provided in the steps of fig. 2, fig. 4, and/or fig. 5; reference may be made to the implementations provided in those steps, which are not described herein again.
The computer-readable storage medium may be the task processing apparatus provided in any of the foregoing embodiments, or an internal storage unit of the electronic device, such as a hard disk or memory of the electronic device. The computer-readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device. The computer-readable storage medium may further include a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), and the like. Further, the computer-readable storage medium may include both an internal storage unit of the electronic device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application further provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes them, causing the electronic device to perform the methods provided in the steps of fig. 2, fig. 4, and/or fig. 5.
The terms "first", "second", "third", "fourth", and the like in the claims and in the description and drawings of the present application are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus. Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments. The term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items and includes such combinations.
Those of ordinary skill in the art will appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the components and steps of the various examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and the related apparatus provided by the embodiments of the present application are described with reference to the flowchart and/or the structural diagram of the method provided by the embodiments of the present application, and each flow and/or block of the flowchart and/or the structural diagram of the method, and the combination of the flow and/or block in the flowchart and/or the block diagram can be specifically implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above disclosure is only for the purpose of illustrating the preferred embodiments of the present application and is not intended to limit the scope of the present application, which is defined by the appended claims.

Claims (13)

1. A method for processing a task, the method comprising:
determining a set of graphics processor GPU resources, the set of GPU resources comprising a plurality of GPU resources;
determining the data processing performance and the scene test performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource;
based on the task processing performance of each GPU resource, putting each GPU resource into a corresponding logic resource pool;
the method comprises the steps of obtaining a task to be processed, determining the expected processing time of the task to be processed, determining target GPU resources from each logic resource pool based on the expected processing time, and processing the task to be processed based on the target GPU resources.
2. The method of claim 1, wherein determining the data processing performance of each of the GPU resources comprises:
and for each GPU resource, determining the model operation performance of the GPU resource corresponding to each neural network model, and determining the data processing performance of the GPU resource based on the model operation performance corresponding to the GPU resource.
3. The method of claim 1, wherein the determining the scene test performance of each GPU resource comprises:
and for each GPU resource, determining the data throughput of the GPU resource corresponding to each scene test case, and determining the scene test performance of the GPU resource based on each data throughput corresponding to the GPU resource.
4. The method according to claim 1, wherein the determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource comprises:
determining a first weight of the data processing performance and a second weight of the scene test performance;
and determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource, the first weight and the second weight.
5. The method of claim 4, wherein the determining the task processing performance of each GPU resource based on the data processing performance and the scene test performance of each GPU resource, the first weight, and the second weight comprises:
determining a reference GPU resource, and determining the reference data processing performance and the reference scene test performance of the reference GPU resource;
determining a first ratio of the data processing performance of each GPU resource to the reference data processing performance and a second ratio of the scene test performance of each GPU resource to the reference scene test performance;
and for each GPU resource, determining the task processing performance of the GPU resource based on the data processing performance and the scene test performance of the GPU resource, the first proportion and the second proportion corresponding to the GPU resource, and the first weight and the second weight.
6. The method of claim 1, wherein the predicted processing time comprises a stage predicted processing time for each stage task of the task to be processed;
the determining a target GPU resource from each logic resource pool based on the predicted processing time and processing the task to be processed based on the target GPU resource comprises:
and for each stage task, determining a target GPU resource corresponding to the stage task from each logic resource pool based on the stage predicted processing time corresponding to the stage task, and processing the stage task based on the target GPU resource corresponding to the stage task.
7. The method of claim 1, wherein the predicted processing time comprises a first predicted processing time for the task to be processed corresponding to each of the GPU resources;
the determining target GPU resources from each logical resource pool based on the predicted processing time comprises:
determining an average processing time of the first predicted processing times, and determining a logic resource pool comprising a first GPU resource as a target resource pool, wherein the first predicted processing time of the first GPU resource corresponding to the task to be processed is less than the average processing time;
and determining any GPU resource in the target resource pool as a target GPU resource.
8. The method of claim 3, wherein after the processing the task to be processed based on the target GPU resource, the method further comprises:
determining the actual data throughput of the task to be processed;
and adjusting the scene test case corresponding to the task to be processed based on the actual data throughput.
9. The method of claim 1, wherein after the processing the task to be processed based on the target GPU resource, the method further comprises:
determining the actual processing time of the task to be processed;
and storing the actual processing time so as to determine the predicted processing time of other tasks based on the actual processing time when the other tasks which are the same as the tasks to be processed are processed.
10. The method of claim 1, wherein each of the logic resource pools is a logical resource pool in a GPU container.
11. A task processing apparatus, characterized in that the apparatus comprises:
a resource determination module to determine a set of graphics processor GPU resources, the set of GPU resources comprising a plurality of GPU resources;
the performance determining module is used for determining the data processing performance and the scene testing performance of each GPU resource, and determining the task processing performance of each GPU resource based on the data processing performance and the scene testing performance of each GPU resource;
the resource processing module is used for putting each GPU resource into a corresponding logic resource pool based on the task processing performance of each GPU resource;
and the task processing module is used for acquiring the task to be processed, determining the predicted processing time of the task to be processed, determining a target GPU resource from each logic resource pool based on the predicted processing time, and processing the task to be processed based on the target GPU resource.
12. An electronic device comprising a processor and a memory, the processor and the memory being interconnected;
the memory is used for storing a computer program;
the processor is configured to perform the method of any of claims 1 to 10 when the computer program is invoked.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1 to 10.
CN202110768245.9A 2021-07-07 2021-07-07 Task processing method, device, equipment and storage medium Pending CN115599533A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110768245.9A CN115599533A (en) 2021-07-07 2021-07-07 Task processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110768245.9A CN115599533A (en) 2021-07-07 2021-07-07 Task processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115599533A true CN115599533A (en) 2023-01-13

Family

ID=84840534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110768245.9A Pending CN115599533A (en) 2021-07-07 2021-07-07 Task processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115599533A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117407155A (en) * 2023-09-22 2024-01-16 国网冀北电力有限公司信息通信分公司 Resource scheme determining method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN113599803A (en) Data processing method and device based on edge calculation and readable storage medium
CN109413480A (en) Picture processing method, device, terminal and storage medium
CN114840352A (en) Input of batch processing machine learning model
CN110234018B (en) Multimedia content description generation method, training method, device, equipment and medium
CN108012156A (en) A kind of method for processing video frequency and control platform
CN111506434B (en) Task processing method and device and computer readable storage medium
CN112860402B (en) Dynamic batch task scheduling method and system for deep learning reasoning service
CN113079216B (en) Cloud application implementation method and device, electronic equipment and readable storage medium
CN112988400A (en) Video memory optimization method and device, electronic equipment and readable storage medium
CN113570030A (en) Data processing method, device, equipment and storage medium
US20220374219A1 (en) Deployment of service
CN109840597B (en) Model prediction method and device, electronic equipment and storage medium
CN117170685A (en) Data processing method, device, equipment and medium
CN115599533A (en) Task processing method, device, equipment and storage medium
CN107624181A (en) Idle and scheduling virtual machine management method and equipment including virtual processor
Zhang et al. A locally distributed mobile computing framework for DNN based android applications
CN118485292A (en) Workflow automatic generation and calculation force distribution method and device based on artificial intelligence
CN117519996B (en) Data processing method, device, equipment and storage medium
KR20160084215A (en) Method for dynamic processing application for cloud streaming service and apparatus for the same
CN115080197A (en) Computing task scheduling method and device, electronic equipment and storage medium
CN116501384A (en) Program instruction execution method, device, equipment and storage medium
US20240249144A1 (en) Using shared and non-shared parameters in an attention module-based recognition model
CN116360953A (en) Task processing method and device and electronic equipment
CN118364909A (en) Model reasoning optimization method and device, electronic equipment and storage medium
CN118350979A (en) Data processing method, system, device, electronic equipment and storage medium

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
REG: Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40080380; country of ref document: HK)