WO2023207623A1 - Task processing method - Google Patents

Task processing method

Info

Publication number
WO2023207623A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual node
target
task
node
status information
Prior art date
Application number
PCT/CN2023/088249
Other languages
English (en)
French (fr)
Inventor
聂大鹏
Original Assignee
Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)
Publication of WO2023207623A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU], to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals

Definitions

  • the embodiments of this specification relate to the field of computer technology, and in particular, to a task processing method.
  • embodiments of this specification provide a task processing method.
  • One or more embodiments of this specification simultaneously relate to a task processing device, a computing device, a computer-readable storage medium, and a computer program to solve technical deficiencies existing in the prior art.
  • a task processing method including:
  • a corresponding target virtual node is determined for the target task based on the current status information of the candidate virtual node, and the target task is executed through the target virtual node.
  • a task processing device including:
  • the receiving module is configured to determine the current status information of the initial virtual node based on the received target task, wherein the current status information is determined based on the physical computing unit corresponding to the initial virtual node;
  • a determining module configured to determine a candidate virtual node corresponding to the task type information from the initial virtual node based on the task type information of the target task
  • the execution module is configured to determine a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and execute the target task through the target virtual node.
  • a computing device including:
  • the memory is used to store computer-executable instructions
  • the processor is used to execute the computer-executable instructions.
  • the steps of the task processing method are implemented.
  • a computer-readable storage medium which stores computer-executable instructions.
  • the steps of the task processing method are implemented.
  • a computer program is provided, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the task processing method.
  • the task processing method provided in this specification includes: determining the current status information of the initial virtual node based on the received target task, wherein the current status information is determined based on the physical computing unit corresponding to the initial virtual node; determining, from the initial virtual node, a candidate virtual node corresponding to the task type information based on the task type information of the target task; and determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and executing the target task through the target virtual node.
  • upon receiving the target task, the method determines the corresponding target virtual node for the target task based on the current status information of the initial virtual node and the task type information of the target task, and executes the target task through the target virtual node, thus meeting the need to accurately determine the target virtual node for the target task.
  • Figure 1 is a schematic diagram of a Serverless platform scheduling framework provided in this specification;
  • Figure 2 is a schematic diagram of a Serverless scheduling algorithm based on request concurrency provided in this specification;
  • Figure 3 is a flow chart of a task processing method provided by an embodiment of this specification.
  • Figure 4 is a flow chart of task scheduling in a task processing method provided by an embodiment of this specification
  • Figure 5 is a schematic diagram of elastic scaling in a task processing method provided by an embodiment of this specification.
  • Figure 6 is a process flow chart of a task processing method provided by an embodiment of this specification.
  • Figure 7 is a schematic structural diagram of a task processing device provided by an embodiment of this specification.
  • Figure 8 is a structural block diagram of a computing device provided by an embodiment of this specification.
  • although the terms first, second, etc. may be used to describe various information in one or more embodiments of this specification, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another.
  • for example, without departing from the scope of one or more embodiments of this specification, the first may also be called the second, and similarly, the second may also be called the first.
  • depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".
  • GPU: Generally refers to the graphics processor. A graphics processing unit (abbreviation: GPU), also known as a display core, visual processor, or display chip, is a microprocessor dedicated to image- and graphics-related computation in personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).
  • VPU (Video Processing Unit): A core engine for video processing platforms, with hardware decoding capability and the ability to reduce CPU (Central Processing Unit) load. In addition, a VPU can reduce server load and network bandwidth consumption. The term is used to distinguish it from the GPU (Graph Process Unit, graphics processing unit); the graphics processing unit in turn comprises three main modules: the video processing unit, the external video module, and the post-processing module.
  • TPU: A processor for neural network training, used mainly for deep learning and AI (artificial intelligence) computation. Like a GPU or CPU, the TPU is programmable, and it has a CISC (Complex Instruction Set) instruction set. The TPU uses low-precision (8-bit) computation to reduce the number of transistors used per operation.
  • GPGPU: General-purpose computing on graphics processing units (GPGPU) uses a graphics processor designed for graphics workloads to carry out general-purpose computing tasks originally handled by the central processing unit; these general-purpose computations often have nothing to do with graphics processing. The powerful parallel processing capability and programmable pipelines of modern graphics processors let stream processors handle non-graphics data. Especially for Single Instruction Multiple Data (SIMD) workloads in which the volume of computation far outweighs the need for data scheduling and transfer, general-purpose graphics processors greatly outperform CPU applications.
  • Hardware encoder: A video encoding unit built into the graphics card.
  • Hardware decoder: A video decoding unit built into the graphics card.
  • SP (Streaming Processor, Stream Processing Units): A stream processor, onto which multimedia graphics data streams are mapped directly for processing; there are programmable and non-programmable kinds.
  • CUDA Core: A type of stream processor.
  • Tensor Core: A specialized execution unit designed to perform tensor or matrix operations.
  • NVLINK: A bus and its communication protocol. NVLink adopts a point-to-point structure and serial transmission; it is used to connect the central processing unit (CPU) and the graphics processing unit (GPU), and can also be used to interconnect multiple graphics processors.
  • GPU instance: A container type that can run GPU tasks.
  • Serverless platform: A serverless computing platform, or microservice platform.
  • RR polling: Round-robin scheduling, in which each request signal receives a response in turn within a polling cycle.
  • on one hand, GPUs are widely used in fields such as audio and video production, graphics and image processing, AI training, AI inference, and scene rendering, achieving speedups of several times, or even tens of thousands of times, over CPUs; on the other hand, with the popularity of cloud computing and the continuous upward movement of the computing interface, more and more customers are migrating from VMs (virtual machines) and containers to Serverless elastic computing platforms, which let customers focus on their own computing tasks and shield them from many details of non-computing concerns such as cluster management, observability, and diagnostics.
  • as the Serverless platform adds support for GPU heterogeneous computing tasks, the demand for elastic scaling based on the workload types of different heterogeneous computing tasks naturally arises. That is, heterogeneous computing tasks on GPU hardware use different hardware compute units built into the GPU depending on the kind of workload. For example: audio and video production uses the GPU's hardware encoding unit/hardware decoding unit, high-precision AI training tasks use the GPU's Tensor Core compute units, and low-precision AI inference tasks use the GPU's CUDA Core compute units.
  • heterogeneous computing tasks on the Serverless platform therefore require the Serverless platform to provide a method whereby its customers can select appropriate GPU indicators according to their own heterogeneous computing workload types, so as to scale out elastically during traffic peaks and scale in elastically during traffic troughs.
  • the Serverless platform usually scales GPU computing instances elastically based on request concurrency. For example: when a function's concurrency is set to 1, 100 concurrency (i.e., 100 concurrent requests) creates 100 GPU instances; when the function's concurrency is set to 10, 100 concurrency creates 10 GPU instances, as sketched below.
  • this concurrency-based elastic scaling method does not truly reflect whether the GPU hardware is fully used. For example: when the function's concurrency is set to 10 and 100 concurrency creates 10 GPU instances, the different internal hardware parts of the GPU in each GPU instance (such as the GPU compute units, the GPU storage units, and the GPU interconnect bandwidth resources) may each still remain at low utilization, resulting in wasted resources and wasted cost.
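  • As a rough sketch of this request-concurrency rule (a minimal illustration; the function and parameter names are ours, not the patent's):

    ```python
    import math

    def instances_needed(concurrent_requests: int, per_instance_concurrency: int) -> int:
        # One GPU instance is created per `per_instance_concurrency` in-flight requests.
        return math.ceil(concurrent_requests / per_instance_concurrency)

    assert instances_needed(100, 1) == 100   # concurrency setting 1 -> 100 instances
    assert instances_needed(100, 10) == 10   # concurrency setting 10 -> 10 instances
    ```

    Note that this count says nothing about how busy the encoder, CUDA Core, or other units inside each GPU actually are, which is exactly the gap the method below addresses.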
  • Figure 1 is a schematic diagram of a Serverless platform scheduling framework provided in this specification.
  • the scheduling framework includes a request access system, a scheduling system, an inventory system, and a GPU instance.
  • when a user finishes writing a Serverless service function, the user initiates a function call request to the Serverless platform's request access system, see Figure 1.
  • Figure 1 describes the processing flow after a function request arrives inside the Serverless platform: the request access system provides HTTP, HTTPS, or other access protocols, and requests are scheduled to the scheduler through the access protocol.
  • the scheduling system is responsible for applying for GPU instances from the inventory system, and for scheduling user function requests to different GPU instances in order to run the corresponding functions.
  • the inventory system is responsible for managing GPU instances; when the scheduling system senses that the current GPU instances cannot serve a function call, it requests a new GPU instance from the inventory system, and the inventory system creates a GPU instance based on the scheduling system's request.
  • the GPU instance is responsible for the specific execution of the function.
  • Figure 2 is a schematic diagram of a Serverless scheduling algorithm based on request concurrency provided in this specification. This scheduling method is based on the Serverless platform scheduling framework provided in Figure 1 above, and specifically includes the following steps:
  • Step 202 Determine whether the function request has a surviving GPU instance.
  • Serverless can provide users with a scheduling portal.
  • specifically, the Serverless platform determines whether the function request has a corresponding surviving GPU instance, so as to decide how to schedule the function request to a GPU instance.
  • if yes, perform step 206; if not, perform step 204.
  • Step 204 Apply for a new GPU instance from the inventory system.
  • specifically, the Serverless platform applies for a new GPU instance from the inventory system. If the application succeeds, a surviving GPU instance is obtained and step 202 is performed; if the application fails, step 210 is performed.
  • Step 206 Traverse all GPU instances and determine whether the request concurrency of any GPU instance is less than the user-configured request concurrency.
  • the Serverless platform traverses all GPU instances and determines whether the request concurrency of any GPU instance is less than the user-configured request concurrency. If yes, perform step 208; if not, perform step 204.
  • Step 208 Schedule the request to a GPU instance that meets the conditions.
  • the Serverless platform schedules the function request to a GPU instance that meets the conditions.
  • Step 210 Scheduling is terminated.
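  • The steps above can be summarized in the following sketch of the concurrency-based scheduling loop (assumed interfaces: `inventory.acquire()` returns a new surviving instance or None, and each instance tracks its in-flight request count):

    ```python
    def schedule(request, instances, inventory, max_concurrency):
        # Steps 202-210: find a surviving instance below the user-configured
        # concurrency, expanding via the inventory system when none qualifies.
        while True:
            for inst in instances:                  # step 206: traverse all GPU instances
                if inst.current_concurrency < max_concurrency:
                    inst.run(request)               # step 208: schedule to this instance
                    return inst
            new_inst = inventory.acquire()          # step 204: apply for a new GPU instance
            if new_inst is None:
                return None                         # step 210: scheduling is terminated
            instances.append(new_inst)
    ```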
  • the Serverless platform's default elastic scaling and scheduling method based on request concurrency does schedule requests, but it does not comprehensively consider the utilization of the internal components of the GPU hardware. When the utilization of those internal components remains at a low level, the elastic expansion triggered by request concurrency exceeding the user setting wastes the user's costs; when the utilization of those internal components remains at a high level, the elastic shrinkage triggered by request concurrency falling below the user setting causes performance losses for the user.
  • that is, the Serverless platform's default concurrency-based elastic scaling and scheduling method does not take into account the utilization of the components inside the GPU hardware, resulting in cost and performance losses.
  • based on this, this specification provides a task processing method that proposes a system structure for elastic scaling of heterogeneous computing tasks on the Serverless platform, thereby solving the performance and cost problems caused by elastic scaling of Serverless heterogeneous computing hardware.
  • in this specification, a task processing method is provided, and this specification also relates to a task processing apparatus, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
  • Figure 3 shows a flow chart of a task processing method provided according to an embodiment of this specification, which specifically includes the following steps.
  • Step 302 Determine the current status information of the initial virtual node based on the received target task, where the current status information is determined based on the physical computing unit corresponding to the initial virtual node.
  • the target task can be understood as a heterogeneous computing task that requires heterogeneous hardware devices for processing; heterogeneous computing tasks include but are not limited to audio and video production tasks, graphics and image processing tasks, AI training tasks, AI inference tasks, scene rendering tasks, and so on.
  • the initial virtual node can be understood as a node that can run heterogeneous computing tasks.
  • the initial virtual node can be a general computing node, a GPU instance, a virtual machine, a container, etc.
  • the physical computing unit can be understood as a physical device that supports the implementation of the initial virtual node; for example, the physical computing unit can be a GPU, GPGPU, VPU, TPU, etc.
  • the current status information can be understood as the utilization rate of each hardware component in the GPU corresponding to the GPU instance.
  • the task processing method provided in this specification can be applied to the Serverless platform or the scheduling system in the Serverless platform.
  • the scheduling system can be understood as a system that schedules target tasks to corresponding GPU instances.
  • specifically, the scheduling system can determine the current status information of all initial virtual nodes based on the received target task, where the current status information of an initial virtual node is determined based on the physical computing unit corresponding to that initial virtual node.
  • that the current status information of the initial virtual node is determined based on the physical computing unit corresponding to the initial virtual node can be understood as follows: the utilization indicators of the components inside the GPU hardware are determined as the current utilization of the GPU instance corresponding to each hardware component, which facilitates subsequent task scheduling based on this utilization.
  • Determining the current status information of the initial virtual node based on the received target task includes:
  • the current running information of the target physical computing subunit is used as the current status information of the initial virtual node.
  • the physical computing subunit can be understood as the various hardware components inside the GPU hardware, including but not limited to hardware encoder, hardware decoder, SP, CUDA Core, Tensor Core, etc.
  • the current running information of the physical computing subunit can be understood as the utilization indicator of each hardware component.
  • the scheduling system determines the current running information of each physical computing subunit in the physical computing unit, and can use the current running information as the current status information of the initial virtual node. For example, the scheduling system can obtain the utilization indicators of each component within the GPU hardware and determine the utilization indicators as the utilization corresponding to the GPU instance.
  • the target physical computing subunit corresponding to the initial virtual node can be understood as the hardware device corresponding to the GPU instance in the GPU.
  • the following takes the application of the task processing method provided in this specification in a Serverless scenario as an example to further explain how the current status information of the initial virtual node is determined based on the physical computing unit, where the physical computing unit is a GPU, the target task is a graphics and image processing task, and the target physical computing subunits are a hardware encoder and a hardware decoder.
  • specifically, after receiving the target task, the scheduling system of the Serverless platform determines the current utilization of each hardware unit in the GPU and determines, from the multiple hardware units, the hardware units corresponding to each GPU instance: the GPU instance that processes graphics and image processing tasks corresponds to the hardware encoder and hardware decoder, and the GPU instance that processes AI inference tasks corresponds to the CUDA Core and Tensor Core.
  • the current utilization of each hardware unit in the GPU is then taken as the utilization of the corresponding GPU instance.
  • in this way, when a target task is received, the current status information of the initial virtual node is determined based on the current running information of the physical computing subunits in the physical computing unit, which facilitates subsequently determining the corresponding target virtual node for the target task based on utilization, as sketched below.
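  • A minimal sketch of this mapping, assuming a metrics dictionary keyed by component name (the task-type table and field names are illustrative, not from the patent):

    ```python
    # Which GPU hardware components each task type exercises (illustrative).
    TASK_COMPONENTS = {
        "audio_video_production": ["hw_encoder", "hw_decoder"],
        "graphics_image":         ["hw_encoder", "hw_decoder"],
        "ai_training":            ["tensor_core"],
        "ai_inference":           ["cuda_core", "tensor_core"],
    }

    def instance_utilization(gpu_metrics: dict, task_type: str) -> float:
        # The utilization of the components a task type uses is treated as the
        # current status of the corresponding GPU instance (max over components).
        return max(gpu_metrics[c] for c in TASK_COMPONENTS[task_type])

    # e.g. instance_utilization({"hw_encoder": 0.8, "hw_decoder": 0.6}, "graphics_image") -> 0.8
    ```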
  • before determining the physical computing unit corresponding to the initial virtual node based on the received target task, the method further includes:
  • the information collection module can be understood as any module that implements the function of collecting current running information of a physical computing unit, such as a GPU monitor.
  • the information collection module can monitor the current running information of each physical computing subunit in the physical computing unit in real time, and send the current running information to the scheduling system; the scheduling system can thereby receive the current running information of the physical computing unit sent by the information collection module.
  • the GPU monitor can obtain the utilization indicators of each hardware component inside the GPU corresponding to each GPU instance, and periodically synchronize the utilization indicators to the scheduling system.
  • the current running information of the physical computing unit sent by the information collection module can be received, so that the current status information of the initial virtual node can be subsequently determined based on the current running information.
  • the current running information of the physical computing unit may be the current running information of each physical computing unit in the physical computing unit.
  • the physical computing unit is a GPU
  • determining the current status information of the initial virtual node based on the received target task includes:
  • a target hardware component corresponding to the initial virtual node is determined from the hardware components, and the current utilization of the target hardware component is used as the current status information of the initial virtual node.
  • the hardware components of the GPU include but are not limited to hardware encoders, hardware decoders, SP, CUDA Core, Tensor Core, etc.
  • the GPU monitor can obtain the utilization indicators of each hardware component inside the GPU corresponding to each GPU instance, and periodically synchronize the utilization indicators to the scheduling system.
  • in practice, the scheduling system of the Serverless platform determines the current utilization of each hardware unit in the GPU, determines the hardware units corresponding to each GPU instance from the multiple hardware units, and takes the current utilization of each such hardware unit as the utilization of the corresponding GPU instance. This facilitates subsequently determining the corresponding GPU instance for the target task based on utilization.
  • Step 304 Based on the task type information of the target task, determine candidate virtual nodes corresponding to the task type information from the initial virtual nodes.
  • the task type information of the target task can be understood as information characterizing the type of the target task, such as characters, numbers, and other information.
  • the task type information of the target task may be information such as characters and numbers that characterize the AI training task type.
  • Candidate virtual nodes can be understood as all virtual nodes among the initial virtual nodes that can handle the target task.
  • following the above example, the candidate virtual node is a GPU instance that can process the graphics and image processing task, where the GPU instance that processes the graphics and image processing task corresponds to the hardware encoder and hardware decoder in the GPU.
  • specifically, the scheduling system can determine the task type information of the target task and, based on the task type information, determine the candidate virtual nodes corresponding to the task type information from the initial virtual nodes, that is, all virtual nodes among the initial virtual nodes that can handle the target task; a filtering sketch follows below.
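  • In code, the candidate-filtering step might look like the following sketch (the `supported_task_types` attribute is an assumption):

    ```python
    def candidate_nodes(initial_nodes, task_type):
        # Keep only the initial virtual nodes whose bound GPU components
        # can serve this task type.
        return [n for n in initial_nodes if task_type in n.supported_task_types]
    ```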
  • Step 306 Determine a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and execute the target task through the target virtual node.
  • the target virtual node can be understood as the GPU instance to which the heterogeneous computing task needs to be scheduled.
  • the scheduling system can add or select a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and execute the target task through the target virtual node.
  • Determining the corresponding target virtual node for the target task based on the current status information of the candidate virtual node includes:
  • a corresponding target virtual node is added to the target task.
  • specifically, if the scheduling system determines, based on the current status information of the candidate virtual nodes, that none of the one or more candidate virtual nodes can process the target task, it adds a corresponding target virtual node for the target task; that is to say, it reapplies for or creates a virtual node for the target task.
  • adding a corresponding target virtual node to the target task based on the current status information of the candidate virtual node includes:
  • if the target calculation ratio is greater than or equal to the first ratio threshold, a corresponding target virtual node is added for the target task.
  • the target calculation ratio can be understood as the utilization of the candidate virtual node; the utilization can be set according to the actual application scenario, for example, any value in the range from 0% to 100%, or any value in the range [0,1], and so on.
  • the first ratio threshold can be set according to the actual application scenario, which is not specifically limited in this specification, such as 70% or 0.7.
  • for example, the target task may be an audio and video production task, the current status information may be the utilization of the hardware encoding unit, and the first ratio threshold may be 70%.
  • the scheduling system determines the utilization rate of the hardware decoding unit corresponding to the GPU instance, and uses the utilization rate of the hardware decoding unit as the utilization rate of the GPU instance corresponding to the hardware decoding unit, where the utilization rate may be 80%.
  • the scheduling system determines that the utilization rate is greater than the first ratio threshold (70%), it determines that the remaining computing power of the GPU instance is too low and may not be able to execute the current audio and video production tasks. Therefore, new GPU instances are created for audio and video production tasks through GPU instance expansion to ensure the normal execution of audio and video production tasks.
  • the scheduling system can achieve the purpose of expanding the GPU instance by applying for a GPU instance from the inventory system, and further ensure the normal execution of heterogeneous computing tasks; the specific implementation method is as follows.
  • Adding a corresponding target virtual node for the target task includes:
  • the virtual node providing module can be understood as a module that can provide virtual nodes for target tasks.
  • the inventory system in Serverless.
  • the virtual node to be determined can be understood as a virtual node provided by the inventory system.
  • inventory system newly created GPU instances.
  • in practice, when the scheduling system determines that the utilization of the existing GPU instances leaves insufficient headroom to run the audio and video production task, it can issue a GPU instance acquisition request to the inventory system to apply for a new GPU instance; based on the scheduling system's application, the inventory system creates a new GPU instance and sends it to the scheduling system.
  • the scheduling system uses the newly applied-for GPU instance as the GPU instance that processes the audio and video production task; subsequently, the scheduling system can schedule the audio and video production task to the new GPU instance for execution, as sketched below.
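  • Putting the threshold test and the expansion path together, a sketch of the decision might read as follows (70% follows the example above; `inventory.acquire()` is an assumed interface):

    ```python
    FIRST_RATIO_THRESHOLD = 0.7  # the first ratio threshold from the example (70%)

    def pick_or_expand(candidates, inventory):
        # Candidates below the threshold still have spare compute; otherwise the
        # scheduling system applies to the inventory system for a new instance.
        usable = [c for c in candidates if c.utilization < FIRST_RATIO_THRESHOLD]
        if usable:
            return min(usable, key=lambda c: c.utilization)  # least-utilized candidate
        return inventory.acquire()  # expansion: newly created GPU instance
    ```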
  • the scheduling system matches GPU instances with better hardware performance for heterogeneous computing tasks from currently existing GPU instances in the following manner.
  • Determining the corresponding target virtual node for the target task based on the current status information of the candidate virtual node includes:
  • a target virtual node corresponding to the target task is selected from the candidate virtual nodes.
  • specifically, if the scheduling system determines, based on the current status information of the candidate virtual nodes, that one or more candidate virtual nodes can handle the target task, it selects a corresponding target virtual node for the target task from among the candidate virtual nodes.
  • selecting the target virtual node corresponding to the target task from the candidate virtual nodes based on the current status information of the candidate virtual node includes:
  • a target virtual node corresponding to the target task is determined.
  • following the above example, the scheduling system determines the utilization of the hardware decoding units corresponding to one or more GPU instances. If it determines that a utilization is less than or equal to the first ratio threshold (70%), it concludes that the current GPU instances include one capable of running the audio and video production task. The scheduling system then determines the minimum utilization among the utilizations that are less than or equal to the first ratio threshold of 70%.
  • the GPU instance corresponding to the minimum utilization is determined as the GPU instance to run the audio and video production task.
  • if multiple GPU instances share the minimum utilization, one or more of them are chosen at random as the GPU instances to run the audio and video production task.
  • determining the GPU instance with better performance further includes: sorting the candidate virtual nodes based on the target calculation ratio to obtain a ranking result of the candidate virtual nodes, wherein there are at least two candidate virtual nodes;
  • a corresponding target virtual node is determined for the target task from the candidate virtual nodes based on the sorting result.
  • following the above example, when the scheduling system determines that utilizations are less than or equal to the first ratio threshold of 70%, it identifies the GPU instances whose utilization is less than or equal to that threshold and sorts them in ascending order of utilization to obtain the ranking result. The higher a GPU instance ranks, the smaller its utilization, that is, the better its performance. Based on this, the scheduling system schedules the audio and video production task to the first GPU instance in the ranking result, or to the first specific number of GPU instances in the ranking result; the specific number can be set according to the actual application scenario, such as the top three or top ten. A sketch of this ranking follows.
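  • A sketch of the ranking-based selection (ascending utilization, so the front of the list holds the better-performing instances; `top_k` mirrors the "top three, top ten" idea and is an illustrative parameter):

    ```python
    def rank_candidates(candidates):
        # Sort by target calculation ratio: least-utilized instances first.
        return sorted(candidates, key=lambda c: c.utilization)

    def dispatch_to_best(task, candidates, top_k=1):
        # Schedule the task to the first instance(s) in the ranking result.
        for inst in rank_candidates(candidates)[:top_k]:
            inst.run(task)
    ```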
  • the physical computing unit is a GPU.
  • the method further includes:
  • the target virtual node is deleted.
  • specifically, the scheduling system can monitor the current utilization of the target hardware component in real time, for example by obtaining the utilization indicator of the hardware components inside each GPU through the GPU monitor component (GPU Monitor), and determine from the utilization indicator whether the target task has finished executing; the virtual node to be deleted is then deleted, thereby saving hardware resources.
  • the function call request can be understood as the above target task.
  • the scheduling system can determine whether there is a surviving GPU instance for the function request, that is, whether there is an instance that can run the function request.
  • the method of determining whether there is an instance capable of executing the function request may be to determine whether the utilization of the GPU instance is less than the first ratio threshold (i.e., the preset maximum utilization of the GPU hardware).
  • the scheduling system applies for a new GPU instance from the inventory system, and after the application is successful, it continues to determine whether there is an instance that can run the function request.
  • if yes, the scheduling system determines that there is a GPU instance that can run the function call request, and uses RR polling to load-balance the function call request across all GPU instances, after which scheduling terminates.
  • when using RR polling to load-balance function call requests across all GPU instances, the above method of ranking GPU instances by utilization can be used to determine the GPU instances with better performance, so as to achieve load balancing.
  • in addition, the Serverless scheduling system uses the RR polling method to load-balance requests across the GPU instances after expansion and contraction, thereby ensuring that the entire GPU cluster is fully used; a minimal dispatcher sketch follows.
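  • A minimal round-robin dispatcher over the surviving instances might look like this (built on Python's `itertools.cycle`; it would be rebuilt whenever the instance set changes after scaling):

    ```python
    import itertools

    def make_rr_dispatcher(instances):
        # Classic RR polling: each surviving GPU instance receives requests in turn.
        ring = itertools.cycle(instances)
        def dispatch(request):
            next(ring).run(request)
        return dispatch
    ```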
  • the task processing method provided in this specification samples the utilization of the components inside the GPU hardware and allows users to set different elastic scaling indicators for heterogeneous computing tasks in different scenarios, thereby eliminating the cost waste caused by excessive expansion and the performance loss caused by premature shrinking.
  • by contrast, a method that does not comprehensively consider the utilization of the internal components of the GPU hardware cannot avoid the cost waste caused by excessive expansion or the performance loss caused by premature shrinking.
  • that is, the task processing method provided in this specification samples the utilization of the components inside the GPU hardware, allows users to set different elastic scaling indicators for heterogeneous computing tasks in different scenarios, and periodically scales GPU instances elastically based on their hardware utilization, which solves the cost waste caused by excessive expansion and the performance loss caused by premature shrinking.
  • the specific implementation method is as follows.
  • the task processing method also includes:
  • the scheduling system can periodically determine the physical computing unit corresponding to the initial virtual node, and determine the current status information of the initial virtual node based on the physical computing unit. For example, the utilization rate of each hardware component in the GPU is monitored in real time, and based on the utilization rate of each hardware component, the utilization rate of the GPU instance corresponding to each hardware component is determined.
  • adding a new virtual node when it is determined, based on the current status information, that the initial virtual node meets the preset node addition condition includes:
  • when the target calculation ratio is greater than the node load threshold, determining that the initial virtual node satisfies the preset node addition condition;
  • and adding a new virtual node based on the virtual node providing module.
  • the node load threshold can be understood as a threshold set in advance for each initial virtual node and indicating that its utilization has reached a load state or is about to reach a load state.
  • the node load threshold can be set according to actual application scenarios. For example, the node load threshold can be set to 80%.
  • in practice, the scheduling system can determine the target calculation ratio of an initial virtual node based on the current status information of that node; when it determines that the target calculation ratio is greater than the node load threshold, it determines that the initial virtual node satisfies the preset node addition condition, and in that case it applies to the virtual node providing module to add a new virtual node, so that the virtual node provided by the virtual node providing module serves as the new virtual node.
  • the task processing method provided in this specification allows users to set different elastic indicators for heterogeneous computing tasks in different scenarios.
  • the elastic indicators include elastic expansion indicators and elastic reduction indicators.
  • the elastic expansion indicator can be understood as a utilization threshold, that is, the node load threshold. Based on this, after the scheduling system schedules the audio and video production task to the GPU instance with better performance, it can monitor that GPU instance's utilization of the hardware decoding unit in real time.
  • when the utilization exceeds the threshold, the scheduling system can proactively apply for a new GPU instance from the inventory system, and the new GPU instance, together with the GPU instance previously determined for the audio and video production task based on utilization, jointly runs the production task, thereby ensuring its normal operation. This also enables users to set different elastic scaling indicators for heterogeneous computing tasks in different scenarios.
  • optionally, deleting an idle virtual node among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual node satisfies the preset node deletion condition includes:
  • when the target calculation ratio is less than the node idle threshold, determining that the initial virtual node satisfies the preset node deletion condition, and deleting the idle virtual node among the initial virtual nodes.
  • the node idle threshold can be understood as a threshold set in advance for each initial virtual node to indicate that its utilization has reached an idle state.
  • the node idle threshold can be set according to actual application scenarios; for example, the node idle threshold can be set to 0% or 5%.
  • the scheduling system can monitor the utilization of the hardware decoding unit in the GPU instance in real time.
  • the utilization rate is less than the elastic scaling indicator (node idle threshold) set by the user, the scheduling system can proactively delete redundant GPU instances from the inventory system, thereby ensuring the normal operation of video production tasks and saving hardware resources.
  • the scheduling system needs to delete the GPU instance after the tasks corresponding to the GPU instance are completed.
  • the specific method is as follows.
  • deleting idle virtual nodes among the initial virtual nodes includes:
  • the idle virtual node is deleted.
  • the task execution status information can be understood as information representing the task execution progress.
  • the scheduling system can monitor the task execution status information of the idle virtual node in real time, and delete the idle virtual node when it is determined that the target task execution is completed based on the task execution status information, thereby saving hardware resources.
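  • The scale-out and scale-in rules above can be combined into one periodic check, sketched below (the threshold values are the illustrative ones from this section; `all_tasks_done()` and the inventory interface are assumed names):

    ```python
    NODE_LOAD_THRESHOLD = 0.80  # illustrative node load threshold (scale out above this)
    NODE_IDLE_THRESHOLD = 0.05  # illustrative node idle threshold (scale in below this)

    def autoscale_tick(nodes, inventory):
        # Periodic elastic-scaling check: add a node when one is overloaded, and
        # delete an idle node only after its tasks have finished executing.
        for node in list(nodes):
            if node.utilization > NODE_LOAD_THRESHOLD:
                nodes.append(inventory.acquire())      # preset node-addition condition met
            elif node.utilization < NODE_IDLE_THRESHOLD and node.all_tasks_done():
                nodes.remove(node)                     # preset node-deletion condition met
                inventory.release(node)
    ```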
  • the scheduling system can periodically perform capacity expansion checks.
  • This capacity expansion detection can be understood as the content of the above embodiment in which the scheduling system performs elastic scaling based on the user setting different elasticity indicators for heterogeneous computing tasks in different scenarios.
  • Figure 5 is a schematic diagram of elastic scaling in a task processing method provided by an embodiment of this specification. The elastic scaling is implemented based on GPU utilization, and the user can configure the GPU elastic scaling indicator for each function.
  • specifically, the scheduling system can periodically perform expansion checks to determine whether the aggregate GPU utilization of a function (that is, of the function call request) is higher than the user configuration, that is, whether the aggregate utilization of all GPUs running the function call request is higher than the GPU elastic scaling indicator the user configured for that function call request.
  • for example, in an audio and video scenario, the GPU elastic scaling indicator can be configured so that the GPU instance scales out when its hardware encoding utilization is greater than 80%; in an AI scenario, the GPU elastic scaling indicator can be configured so that the GPU instance scales in when its CUDA Core hardware utilization is less than or equal to 20%.
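  • Such per-function indicators could be recorded in a structure like the following (purely illustrative; the field names are not the patent's API):

    ```python
    # User-configured GPU elastic scaling indicators, one entry per function.
    elastic_indicators = {
        "transcode_video_fn": {"metric": "hw_encoder_util", "scale_out_above": 0.80},
        "ai_inference_fn":    {"metric": "cuda_core_util",  "scale_in_below":  0.20},
    }
    ```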
  • if yes, that is, if the scheduling system determines that the aggregate utilization is higher than the user configuration, it applies for a new GPU instance from the inventory system and scheduling then terminates; if not, that is, if the scheduling system determines that the aggregate utilization is lower than the user configuration, it returns the applied-for GPU instances to the inventory system and scheduling then terminates.
  • in addition, the current status information can also include multi-dimensional hybrid indicators, so that comprehensive scheduling decisions can subsequently be made based on those multi-dimensional hybrid indicators, better adapting to the scaling requirements of heterogeneous computing tasks in various scenarios.
  • in practice, the task processing method provided in this specification adopts a more aggressive strategy when expanding GPU instances, to safeguard the service performance of user functions, and a lazier strategy when shrinking GPU instances, to keep users' function costs in check.
  • the aggressive coefficient and the lazy coefficient take effect in the "is the aggregate utilization of all GPU instances of this function higher than the user configuration" check in Figure 5.
  • in summary, the task processing method provided in this specification, upon receiving the target task, determines the corresponding target virtual node for the target task based on the current status information of the initial virtual node and the task type information of the target task, and executes the target task through the target virtual node, thereby meeting the need to accurately determine the target virtual node for the target task.
  • Figure 6 shows a process flow chart of a task processing method provided by an embodiment of this specification.
  • Figure 6 provides a system framework for elastic scaling based on GPU utilization, wherein the system framework includes a request access system, a scheduling system, an inventory system, GPU instances, and a GPU Monitor.
  • the GPU monitor component (GPU Monitor) in the system framework is used to obtain the internal hardware component utilization indicators of each GPU instance. It should be noted that the hardware components differ depending on the application scenario of the task processing method provided in this specification: for example, when the task processing method is applied to audio and video production scenarios, the hardware components can be hardware encoding units and hardware decoding units; when the task processing method is applied to AI production scenarios, the hardware components can be the CUDA Core, and so on. Accordingly, the hardware component utilization indicators include but are not limited to hardware encoding utilization in audio and video production scenarios, hardware decoding utilization in audio and video production scenarios, CUDA Core utilization in AI production scenarios, Tensor Core utilization in AI production scenarios, NVLINK bandwidth utilization in AI production scenarios, video memory utilization in all scenarios, and so on.
  • the GPU monitor periodically synchronizes the utilization of these GPU hardware parts to the scheduling system.
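  • A sketch of that periodic synchronization (the reader method and the `sync_utilization` call are assumed names, not a real monitoring API):

    ```python
    import time

    def gpu_monitor_loop(gpu_instances, scheduler, period_s=10.0):
        # GPU Monitor: read per-component utilization from every GPU instance
        # and push it to the scheduling system on a fixed period.
        while True:
            metrics = {g.id: g.read_component_utilization() for g in gpu_instances}
            scheduler.sync_utilization(metrics)
            time.sleep(period_s)
    ```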
  • the request access system schedules requests to the scheduling system through the access protocol.
  • based on GPU hardware utilization and the corresponding scheduling policy, the Serverless scheduling system applies for GPU instances from, or returns GPU instances to, the inventory system, thereby elastically expanding and shrinking the GPU instances.
  • the scheduling strategy can be set according to actual application scenarios, and this specification does not specifically limit this, for example, the scheduling strategy shown in Figure 4.
  • after the Serverless scheduling system elastically expands or shrinks the GPU instances, it schedules the user's function requests to different GPU instances to run the corresponding functions; the GPU instances are responsible for the specific execution of the functions.
  • in summary, the task processing method provided by the embodiments of this specification provides an elastic scaling control method based on GPU indicators of different dimensions in the Serverless scenario, so that different elastic scaling indicators can be set for heterogeneous computing tasks (audio and video production, AI production, graphics and image production) in different scenarios, achieving an elastic scaling strategy that balances performance and cost.
  • Figure 7 shows a schematic structural diagram of a task processing device provided by an embodiment of this specification. As shown in Figure 7, the device includes:
  • the receiving module 702 is configured to determine the current status information of the initial virtual node based on the received target task, wherein the current status information is determined based on the physical computing unit corresponding to the initial virtual node;
  • the determination module 704 is configured to determine, from the initial virtual nodes, a candidate virtual node corresponding to the task type information based on the task type information of the target task;
  • the execution module 706 is configured to determine a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and execute the target task through the target virtual node.
  • execution module 706 is also configured to:
  • a target virtual node corresponding to the target task is selected from the candidate virtual nodes.
  • execution module 706 is also configured to:
  • a corresponding target virtual node is added to the target task.
  • the task processing device further includes a node processing module configured to:
  • the physical computing unit is a GPU
  • the receiving module 702 is also configured to:
  • the target hardware component corresponding to the initial virtual node is determined from the hardware components, and the current utilization of the target hardware component is used as the current status information of the initial virtual node.
  • the task processing device further includes a deletion module configured to:
  • the target virtual node is deleted.
  • execution module 706 is also configured to:
  • if the target calculation ratio is greater than or equal to the first ratio threshold, a corresponding target virtual node is added for the target task.
  • execution module 706 is also configured to:
  • a target virtual node corresponding to the target task is determined.
  • the node processing module is also configured to:
  • when the target calculation ratio is greater than the node load threshold, it is determined that the initial virtual node satisfies the preset node addition condition;
  • a new virtual node is added based on the virtual node providing module.
  • the node processing module is also configured to:
  • when the target calculation ratio is less than the node idle threshold, it is determined that the initial virtual node satisfies the preset node deletion condition;
  • the receiving module 702 is also configured to:
  • the current running information of the target physical computing subunit is used as the current status information of the initial virtual node.
  • the task processing device further includes an information receiving module configured to:
  • upon receiving the target task, the task processing device determines the corresponding target virtual node for the target task based on the current status information of the initial virtual node and the task type information of the target task, and executes the target task through the target virtual node, thereby meeting the need to accurately determine the target virtual node for the target task.
  • the above is a schematic solution of a task processing device in this embodiment. It should be noted that the technical solution of the task processing device and the technical solution of the above task processing method belong to the same concept. For details not described in detail in the technical solution of the task processing device, please refer to the description of the technical solution of the above task processing method.
  • Figure 8 shows a structural block diagram of a computing device 800 provided according to an embodiment of this specification.
  • Components of the computing device 800 include, but are not limited to, memory 810 and processor 820 .
  • the processor 820 is connected to the memory 810 through a bus 830, and the database 850 is used to save data.
  • Computing device 800 also includes an access device 840 that enables computing device 800 to communicate via one or more networks 860 .
  • examples of such networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet.
  • Access device 840 may include one or more of any type of network interface (e.g., a network interface card (NIC)), wired or wireless, such as an IEEE 802.11 wireless local area network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so on.
  • the above-mentioned components of the computing device 800 and other components not shown in FIG. 8 may also be connected to each other, such as through a bus. It should be understood that the structural block diagram of the computing device shown in FIG. 8 is for illustrative purposes only and does not limit the scope of this description. Those skilled in the art can add or replace other components as needed.
  • Computing device 800 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet computer, personal digital assistant, laptop computer, notebook computer, netbook, etc.), a mobile telephone (e.g., smartphone), a wearable computing device (e.g., smart watch, smart glasses, etc.) or other type of mobile device, or a stationary computing device such as a desktop computer or PC.
  • Computing device 800 may also be a mobile or stationary server.
  • the processor 820 is configured to execute the following computer-executable instructions. When the computer-executable instructions are executed by the processor 820, the steps of the above task processing method are implemented.
  • the above is a schematic solution of a computing device in this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the above-mentioned task processing method belong to the same concept. For details that are not described in detail in the technical solution of the computing device, please refer to the description of the technical solution of the above task processing method.
  • An embodiment of the present specification also provides a computer-readable storage medium that stores computer-executable instructions.
  • the computer-executable instructions are executed by a processor, the steps of the above task processing method are implemented.
  • An embodiment of this specification also provides a computer program, wherein when the computer program is executed in a computer, the computer is caused to perform the steps of the above task processing method.
  • the computer instructions include computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.
  • it should be noted that the content contained in the computer-readable medium can be appropriately added or deleted according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium excludes electrical carrier signals and telecommunications signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)
  • Computer And Data Communications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of this specification provide a task processing method, wherein the task processing method includes: determining current status information of an initial virtual node based on a received target task, where the current status information is determined based on a physical computing unit corresponding to the initial virtual node; determining, from the initial virtual node, a candidate virtual node corresponding to task type information of the target task based on the task type information; and determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and executing the target task through the target virtual node, thereby meeting the need to accurately determine a target virtual node for a target task.

Description

Task processing method
This application claims priority to Chinese patent application No. 202210454886.1, entitled "Task processing method" and filed with the China Patent Office on April 24, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of this specification relate to the field of computer technology, and in particular to a task processing method.
Background
With the continuous development of computer technology, the share of computing tasks carried out on computing devices is rising rapidly. For example, in heterogeneous computing scenarios, the share of heterogeneous computing based on heterogeneous hardware (such as the graphics processor, GPU) is growing quickly, and such computing is widely applied in fields such as audio and video production, graphics and image processing, and AI training. As different computing service providers support the execution of heterogeneous computing tasks through computing instances, the need arises to accurately determine computing instances for those tasks based on the workload types of different heterogeneous computing tasks; therefore, a solution is urgently needed that can meet the demand for accurately allocating computing instances to heterogeneous computing tasks and scaling them out elastically.
Summary
In view of this, the embodiments of this specification provide a task processing method. One or more embodiments of this specification simultaneously relate to a task processing apparatus, a computing device, a computer-readable storage medium, and a computer program, so as to solve the technical deficiencies existing in the prior art.
According to a first aspect of the embodiments of this specification, a task processing method is provided, including:
determining current status information of an initial virtual node based on a received target task, where the current status information is determined based on a physical computing unit corresponding to the initial virtual node;
determining, from the initial virtual node, a candidate virtual node corresponding to task type information of the target task based on the task type information; and
determining a corresponding target virtual node for the target task based on current status information of the candidate virtual node, and executing the target task through the target virtual node.
According to a second aspect of the embodiments of this specification, a task processing apparatus is provided, including:
a receiving module configured to determine current status information of an initial virtual node based on a received target task, where the current status information is determined based on a physical computing unit corresponding to the initial virtual node;
a determining module configured to determine, from the initial virtual node, a candidate virtual node corresponding to task type information of the target task based on the task type information; and
an execution module configured to determine a corresponding target virtual node for the target task based on current status information of the candidate virtual node, and execute the target task through the target virtual node.
According to a third aspect of the embodiments of this specification, a computing device is provided, including:
a memory and a processor;
the memory is used to store computer-executable instructions, and the processor is used to execute the computer-executable instructions; when the computer-executable instructions are executed by the processor, the steps of the task processing method are implemented.
According to a fourth aspect of the embodiments of this specification, a computer-readable storage medium is provided, which stores computer-executable instructions that, when executed by a processor, implement the steps of the task processing method.
According to a fifth aspect of the embodiments of this specification, a computer program is provided, wherein when the computer program is executed in a computer, the computer is caused to perform the steps of the task processing method.
The task processing method provided in this specification includes: determining current status information of an initial virtual node based on a received target task, where the current status information is determined based on a physical computing unit corresponding to the initial virtual node; determining, from the initial virtual node, a candidate virtual node corresponding to task type information of the target task based on the task type information; and determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual node, and executing the target task through the target virtual node.
Specifically, when a target task is received, the method determines a corresponding target virtual node for the target task based on the current status information of the initial virtual node and the task type information of the target task, and executes the target task through the target virtual node, thereby meeting the need to accurately determine the target virtual node for the target task.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a Serverless platform scheduling framework provided in this specification;
FIG. 2 is a schematic diagram of a Serverless scheduling algorithm based on request concurrency provided in this specification;
FIG. 3 is a flowchart of a task processing method provided by one embodiment of this specification;
FIG. 4 is a flowchart of task scheduling in a task processing method provided by one embodiment of this specification;
FIG. 5 is a schematic diagram of elastic scaling in a task processing method provided by one embodiment of this specification;
FIG. 6 is a processing flowchart of a task processing method provided by one embodiment of this specification;
FIG. 7 is a schematic structural diagram of a task processing apparatus provided by one embodiment of this specification;
FIG. 8 is a structural block diagram of a computing device provided by one embodiment of this specification.
Detailed Description
Many specific details are set forth in the following description to facilitate a full understanding of this specification. However, this specification can be implemented in many ways different from those described herein, and those skilled in the art can make similar generalizations without departing from the substance of this specification; therefore, this specification is not limited by the specific implementations disclosed below.
The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of this specification. The singular forms "a", "said", and "the" used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of this specification refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various pieces of information, such information should not be limited to these terms. These terms are only used to distinguish pieces of information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, "first" may also be referred to as "second", and similarly, "second" may be referred to as "first". Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
First, the terms involved in one or more embodiments of this specification are explained.
GPU: generally refers to the graphics processing unit. The graphics processing unit (GPU), also known as the display core, visual processor, or display chip, is a microprocessor dedicated to image- and graphics-related computation on personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and smartphones).
VPU: the VPU (Video Processing Unit) is the core engine of a new kind of video processing platform; it provides hardware decoding and reduces the load on the CPU (central processing unit). In addition, a VPU can reduce server load and network bandwidth consumption. The term is used to distinguish it from the GPU (graphics processing unit). A graphics processing unit in turn comprises three main modules: the video processing unit, the external video module, and the post-processing module.
TPU: a processor for neural network training, mainly used for deep learning and AI (artificial intelligence) computation. A TPU is programmable like a GPU or CPU and has a CISC (complex instruction set) instruction set. As a machine learning processor, it supports not just one kind of neural network but many, including convolutional neural networks, LSTM (long short-term memory) networks, and fully connected networks. A TPU uses low-precision (8-bit) computation to reduce the number of transistors used per operation.
GPGPU: general-purpose computing on graphics processing units (GPGPU) uses a graphics processor, which is designed for graphics tasks, to compute general-purpose computing tasks that would otherwise be handled by the central processing unit. Such general-purpose computations often have nothing to do with graphics processing. Thanks to the powerful parallel processing capability and programmable pipelines of modern graphics processors, stream processors can process non-graphics data. In particular, for single-instruction multiple-data (SIMD) workloads in which the amount of data computation far exceeds the need for data scheduling and transfer, a general-purpose graphics processor greatly outperforms a central processing unit.
Hardware encoder: a video encoding unit built into a graphics card.
Hardware decoder: a video decoding unit built into a graphics card.
SP (Streaming Processor, Stream Processing Units): stream processors, onto which multimedia graphics data streams are mapped directly for processing; they come in programmable and non-programmable varieties.
CUDA Core: a kind of stream processor.
Tensor Core: a dedicated execution unit designed specifically for performing tensor or matrix operations.
NVLINK: a bus and its communication protocol. NVLink uses a point-to-point structure with serial transmission; it is used to connect the central processing unit (CPU) with the graphics processing unit (GPU), and can also be used to interconnect multiple graphics processing units.
GPU instance: a type of container that can run GPU tasks.
Serverless platform: a serverless computing platform, or microservice platform.
RR polling: round-robin (RR) scheduling, meaning that in one polling round of responses, every request signal receives a response.
With the continuous development of computer technology, on the one hand, the share of GPU-based heterogeneous computing is rising rapidly. For example, GPUs are widely used in audio/video production, graphics and image processing, AI training, AI inference, rendering, and other fields, achieving speedups several times to tens of thousands of times those of CPUs. On the other hand, with the popularization of cloud computing and the continuous rise of the computing interface, more and more customers are migrating from VMs (virtual machines) and containers to Serverless elastic computing platforms, allowing them to focus on their own computing tasks while details unrelated to the computing task itself, such as cluster management, observability, and diagnostics, are hidden from them.
As Serverless platforms began to support GPU heterogeneous computing tasks, the demand for elastic scaling based on the workload types of different heterogeneous computing tasks naturally arose. That is, a heterogeneous computing task running on GPU hardware uses different hardware compute units built into the GPU depending on the kind of workload. For example, audio/video production uses the GPU's hardware encoding/decoding units, high-precision AI training tasks use the GPU's Tensor Core compute units, and low-precision AI inference tasks use the GPU's CUDA Core compute units. For heterogeneous computing tasks on a Serverless platform, the platform needs to provide a way for its customers to select appropriate GPU metrics according to their own heterogeneous computing workload types, so as to scale out elastically during traffic peaks and scale in elastically during traffic troughs.
On this basis, in one elastic scaling solution provided in this specification, the Serverless platform typically scales GPU computing instances elastically based on request concurrency. For example, when a function's concurrency is set to 1, a concurrency of 100 (i.e., 100 concurrent requests) creates 100 GPU instances; when the function's concurrency is set to 10, a concurrency of 100 creates 10 GPU instances. This elastic scaling method based on request concurrency does not truly reflect whether the GPU hardware is fully used. For example, when the function's concurrency is set to 10, across the 10 GPU instances created for a concurrency of 100, the utilization of the different hardware parts inside each instance's GPU, such as the GPU compute units, GPU memory units, and GPU interconnect bandwidth resources, may still remain at a low level, resulting in wasted resources and wasted cost.
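The arithmetic of this default strategy is easy to state; the sketch below, written against the figures in the example above, is purely illustrative (the function name and parameters are hypothetical):

```python
import math

def instances_needed(concurrent_requests: int, per_instance_concurrency: int) -> int:
    # Hypothetical helper: the platform keeps just enough instances so that
    # no instance exceeds the user-configured per-instance concurrency.
    return math.ceil(concurrent_requests / per_instance_concurrency)

print(instances_needed(100, 1))   # 100 GPU instances
print(instances_needed(100, 10))  # 10 GPU instances
```

Note that this count depends only on request concurrency, never on how busy the GPU hardware actually is, which is exactly the deficiency discussed below.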
Further, referring to FIG. 1, FIG. 1 is a schematic diagram of a Serverless platform scheduling framework provided in this specification. The scheduling framework includes a request access system, a scheduling system, an inventory system, and GPU instances. After a user finishes writing a Serverless service function, the user initiates a function invocation request to the request access system of the Serverless platform. FIG. 1 depicts the processing flow of a function request after it arrives inside the Serverless platform. Specifically, the request access system provides HTTP, HTTPS, or other access protocols, through which requests are dispatched to the scheduling system. The scheduling system is responsible for requesting GPU instances from the inventory system and for dispatching the user's function requests to different GPU instances so that the corresponding functions can run. The inventory system is responsible for managing GPU instances; when the scheduling system senses that the current GPU instances cannot serve a function invocation, it requests a new GPU instance from the inventory system, and the inventory system creates a GPU instance based on the scheduling system's request. The GPU instances are responsible for the actual execution of the functions.
On this basis, referring to FIG. 2, FIG. 2 is a schematic diagram of a Serverless scheduling algorithm based on request concurrency provided in this specification. This scheduling method is based on the Serverless platform scheduling framework provided in FIG. 1 above and specifically includes the following steps:
Step 202: determine whether the function request has a live GPU instance.
Specifically, the Serverless platform can provide users with a scheduling entry. When a user initiates a function request to the Serverless platform through this scheduling entry, the Serverless platform determines whether the function request has a corresponding live GPU instance, so that the function request can be dispatched to that GPU instance.
If yes, proceed to step 206; if no, proceed to step 204.
Step 204: request a new GPU instance from the inventory system.
Specifically, the Serverless platform requests a new GPU instance from the inventory system. If the request succeeds, a live GPU instance is obtained and step 202 is executed; if the request fails, step 210 is executed.
Step 206: traverse all GPU instances and determine whether any GPU instance's request concurrency is lower than the request concurrency configured by the user.
Specifically, after determining that the function request has live GPU instances, the Serverless platform traverses all GPU instances and determines whether any GPU instance's request concurrency is lower than the user-configured request concurrency. If yes, proceed to step 208; if no, proceed to step 204.
Step 208: dispatch the request to a GPU instance that satisfies the condition.
Specifically, the Serverless platform dispatches the function request to a GPU instance that satisfies the condition.
Step 210: scheduling terminates.
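For readers who prefer code, the following is a minimal sketch of the concurrency-based scheduling of steps 202 to 210. It is an illustration only, not the platform's implementation; the class names, the in-memory `InventorySystem`, and the `in_flight` counter are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class GPUInstance:
    instance_id: str
    in_flight: int = 0   # current number of in-flight requests on this instance

class InventorySystem:
    """Hypothetical stand-in for the inventory system that manages instances."""
    def __init__(self) -> None:
        self._next_id = 0

    def create_instance(self) -> Optional[GPUInstance]:
        # A real inventory system could fail here when capacity runs out,
        # which corresponds to the "scheduling terminates" branch (step 210).
        self._next_id += 1
        return GPUInstance(instance_id=f"gpu-{self._next_id}")

def schedule_by_concurrency(instances: List[GPUInstance],
                            inventory: InventorySystem,
                            user_concurrency: int) -> Optional[GPUInstance]:
    # Steps 202/206: look for a live instance below the concurrency limit.
    for inst in instances:
        if inst.in_flight < user_concurrency:
            inst.in_flight += 1          # step 208: dispatch to this instance
            return inst
    # Step 204: no instance qualifies, so request a new one.
    new_inst = inventory.create_instance()
    if new_inst is None:
        return None                      # step 210: scheduling terminates
    new_inst.in_flight = 1
    instances.append(new_inst)
    return new_inst

pool: List[GPUInstance] = []
inv = InventorySystem()
for _ in range(3):
    print(schedule_by_concurrency(pool, inv, user_concurrency=2).instance_id)
# gpu-1, gpu-1, gpu-2: a new instance is created once gpu-1 is "full"
```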
On this basis, although the Serverless platform's default elastic scaling and scheduling method based on request concurrency does achieve request scheduling, it does not take into account the utilization of the individual components inside the GPU hardware. As a result, when the utilization of the GPU's internal components stays at a low level, scale-out triggered by request concurrency exceeding the user's setting wastes cost for the user; and when the utilization of the GPU's internal components stays at a high level, scale-in triggered by request concurrency falling below the user's setting causes a performance loss for the user.
In summary, the Serverless platform's default elastic scaling and scheduling method based on request concurrency does not consider the utilization of the individual components inside the GPU hardware, leading to losses in both cost and performance.
To address the deficiencies in the above solution, a task processing method provided in this specification proposes a system architecture for elastic scaling of heterogeneous computing tasks on a Serverless platform, thereby solving the performance and cost problems brought about by elastic scaling of Serverless heterogeneous computing hardware.
Specifically, this specification provides a task processing method, and this specification also relates to a task processing apparatus, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
FIG. 3 shows a flowchart of a task processing method provided by one embodiment of this specification, which specifically includes the following steps.
Step 302: determine current status information of initial virtual nodes based on a received target task, wherein the current status information is determined based on the physical computing units corresponding to the initial virtual nodes.
Here, the target task can be understood as a heterogeneous computing task that needs to be processed by heterogeneous hardware devices; for example, heterogeneous computing tasks include, but are not limited to, audio/video production tasks, graphics and image processing tasks, AI training tasks, AI inference tasks, scene rendering tasks, and so on.
An initial virtual node can be understood as a node capable of running heterogeneous computing tasks; for example, an initial virtual node can be a general-purpose computing node, a GPU instance, a virtual machine, a container, etc.
A physical computing unit can be understood as the physical device that backs the initial virtual node; for example, the physical computing unit can be a GPU, GPGPU, VPU, TPU, and so on.
In the case where the initial virtual node is a GPU instance, the current status information can be understood as the utilization of each hardware component of the GPU corresponding to that GPU instance. It should be noted that the task processing method provided in this specification can be applied to a Serverless platform or to the scheduling system in a Serverless platform, where the scheduling system can be understood as the system that dispatches target tasks to the corresponding GPU instances.
Specifically, after receiving a target task, the scheduling system can determine the current status information of all initial virtual nodes based on the received target task, where the current status information of an initial virtual node is determined based on the physical computing unit corresponding to that initial virtual node.
In practice, in the case where the initial virtual nodes are GPU instances and the physical computing unit is a GPU, determining the current status information of the initial virtual nodes based on the physical computing units corresponding to the initial virtual nodes can be understood as taking the utilization metrics of the individual hardware components inside the GPU hardware as the current utilization of the GPU instances corresponding to those hardware components, to facilitate subsequent task scheduling based on that utilization. A specific implementation is as follows.
Determining the current status information of the initial virtual nodes based on the received target task includes:
determining, based on the received target task, the physical computing sub-units in the physical computing unit and the current operating information of the physical computing sub-units;
determining, from among the physical computing sub-units, the target physical computing sub-unit corresponding to the initial virtual node; and
taking the current operating information of the target physical computing sub-unit as the current status information of the initial virtual node.
Here, physical computing sub-units can be understood as the individual hardware components inside the GPU hardware, including but not limited to hardware encoders, hardware decoders, SPs, CUDA Cores, Tensor Cores, and so on. The current operating information of a physical computing sub-unit can be understood as the utilization metric of that hardware component. On this basis, the scheduling system determines the current operating information of each physical computing sub-unit in the physical computing unit and can take that current operating information as the current status information of the initial virtual nodes. For example, the scheduling system can obtain the utilization metrics of the individual components inside the GPU hardware and determine those utilization metrics as the utilization of the corresponding GPU instances.
The target physical computing sub-unit corresponding to an initial virtual node can be understood as the hardware device in the GPU that corresponds to that GPU instance.
The following takes the application of the task processing method provided in this specification in a Serverless scenario as an example to further explain determining the current status information of initial virtual nodes based on physical computing units, where the physical computing unit is a GPU, the target task is a graphics and image processing task, and the target physical computing sub-units are the hardware encoder and the hardware decoder.
On this basis, upon receiving the target task, the scheduling system of the Serverless platform determines the current utilization of each hardware unit in the GPU and determines, from among the multiple hardware units, the hardware units corresponding to each GPU instance, where a GPU instance that processes graphics and image processing tasks corresponds to the hardware encoder and hardware decoder, and a GPU instance that processes AI inference tasks corresponds to the CUDA Cores and Tensor Cores. The current utilization of each hardware unit in the GPU is taken as the utilization of the corresponding GPU instance.
In the embodiments of this specification, when a target task is received, the current status information of the initial virtual nodes is determined based on the current operating information of the physical computing sub-units in the physical computing unit, which facilitates subsequently determining a corresponding target virtual node for the target task based on utilization.
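A small sketch may make the mapping from hardware sub-units to instance status concrete. The dictionaries, instance names, and the choice of `max` as the aggregation over a node's sub-units are assumptions made for illustration; the embodiments do not prescribe a particular aggregation.

```python
# Utilization of each hardware sub-unit inside one physical GPU, as it might
# be reported by a monitoring component (values are percentages).
gpu_component_utilization = {
    "hardware_encoder": 35.0,
    "hardware_decoder": 42.0,
    "cuda_core": 12.0,
    "tensor_core": 88.0,
}

# Which sub-units back which kind of instance; this mapping follows the
# examples in the text (video instances use encoder/decoder, inference
# instances use CUDA/Tensor Cores) but is otherwise hypothetical.
instance_components = {
    "video-instance-1": ["hardware_encoder", "hardware_decoder"],
    "inference-instance-1": ["cuda_core", "tensor_core"],
}

def instance_status(instance_id: str) -> float:
    # Current status of an instance = the highest utilization among the
    # target sub-units it corresponds to (one possible aggregation).
    parts = instance_components[instance_id]
    return max(gpu_component_utilization[p] for p in parts)

print(instance_status("video-instance-1"))      # 42.0
print(instance_status("inference-instance-1"))  # 88.0
```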
Further, in one embodiment provided in this specification, before determining the physical computing unit corresponding to the initial virtual nodes based on the received target task, the method further includes:
receiving the current operating information of the physical computing sub-units in the physical computing unit sent by an information collection module, where the information collection module is a module that monitors the current operating information of the physical computing sub-units in the physical computing unit.
Here, the information collection module can be understood as any module that collects the current operating information of the physical computing unit, such as a GPU monitor.
Specifically, the information collection module can monitor the current operating information of each physical computing sub-unit in the physical computing unit in real time and send that current operating information to the scheduling system; the scheduling system can thus receive the current operating information of each physical computing sub-unit in the physical computing unit sent by the information collection module. For example, a GPU monitor can obtain the utilization metrics of the individual hardware components inside the GPUs corresponding to the GPU instances and periodically synchronize those utilization metrics to the scheduling system.
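As a rough sketch of such a collection loop (the `read_utilization` and `publish` callables are hypothetical placeholders; neither the sampling mechanism nor the transport to the scheduling system is specified by the embodiments):

```python
import time
from typing import Callable, Dict

def run_monitor(read_utilization: Callable[[], Dict[str, float]],
                publish: Callable[[Dict[str, float]], None],
                period_s: float = 5.0,
                iterations: int = 3) -> None:
    # Periodically sample sub-unit utilization and push it to the scheduler;
    # bounded by `iterations` here only so the sketch terminates.
    for _ in range(iterations):
        publish(read_utilization())
        time.sleep(period_s)

# Example wiring with stub callables:
samples: list = []
run_monitor(lambda: {"hardware_encoder": 30.0, "cuda_core": 10.0},
            samples.append, period_s=0.01)
print(len(samples))  # 3 samples pushed to the "scheduler"
```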
In other words, in the embodiments of this specification, the current operating information of the physical computing unit sent by the information collection module can be received, making it convenient to subsequently determine the current status information of the initial virtual nodes based on that current operating information. The current operating information of the physical computing unit here can be the current operating information of each physical computing sub-unit in the physical computing unit.
Further, in one embodiment provided in this specification, the physical computing unit is a GPU;
correspondingly, determining the current status information of the initial virtual nodes based on the received target task includes:
determining, based on the received target task, the hardware components of the GPU and the current utilization of the hardware components; and
determining, from among the hardware components, the target hardware component corresponding to the initial virtual node, and taking the current utilization of the target hardware component as the current status information of the initial virtual node.
Here, the hardware components of the GPU include, but are not limited to, hardware encoders, hardware decoders, SPs, CUDA Cores, Tensor Cores, and so on.
Continuing the example above, the GPU monitor can obtain the utilization metrics of the individual hardware components inside the GPUs corresponding to the GPU instances and periodically synchronize those utilization metrics to the scheduling system. Then, upon receiving the target task, the scheduling system of the Serverless platform determines the current utilization of each hardware unit in the GPU, determines the hardware units corresponding to each GPU instance from among the multiple hardware units, and takes the current utilization of each hardware unit in the GPU as the utilization of the corresponding GPU instance, thereby facilitating subsequently determining the corresponding GPU instance for the target task based on utilization.
Step 304: determine, based on the task type information of the target task, candidate virtual nodes corresponding to the task type information from among the initial virtual nodes.
Here, the task type information of the target task can be understood as information characterizing the type of the target task, such as characters or codes. In the case where the target task is an AI training task, the task type information of the target task can be characters, a code, or other information characterizing the AI training task type.
Candidate virtual nodes can be understood as all virtual nodes among the initial virtual nodes that are capable of processing the target task. For example, in the case where the target task is a graphics and image processing task, the candidate virtual nodes are the GPU instances capable of processing graphics and image processing tasks, where a GPU instance that processes graphics and image processing tasks corresponds to the hardware encoder and hardware decoder in the GPU.
Specifically, after determining the current status information of the initial virtual nodes, the scheduling system can determine the task type information of the target task and, based on that task type information, determine from among the initial virtual nodes the candidate virtual nodes corresponding to the task type information, that is, all virtual nodes among the initial virtual nodes capable of processing the target task.
Step 306: determine a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes, and execute the target task through the target virtual node.
Here, the target virtual node can be understood as the GPU instance to which the heterogeneous computing task needs to be dispatched.
Specifically, the scheduling system can add or select a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes, and execute the target task through that target virtual node.
In one embodiment provided in this specification, a corresponding GPU instance can be added for a heterogeneous computing task by scaling out GPU instances, or a GPU instance with better hardware performance can be selected for the heterogeneous computing task from among the currently existing GPU instances, thereby achieving flexible scheduling of heterogeneous computing tasks and saving hardware resources. On this basis, the way to scale out GPU instances for a heterogeneous computing task is as follows.
Determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes includes:
adding a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes.
Specifically, after determining the candidate virtual nodes, if the scheduling system determines, based on the current status information of the candidate virtual nodes, that the one or more candidate virtual nodes cannot process the target task, it adds a new corresponding target virtual node for the target task, that is, it requests or creates a new virtual node for the target task.
Further, in the embodiments provided in this specification, adding a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes includes:
determining a target computing ratio of the candidate virtual nodes based on the current status information of the candidate virtual nodes; and
adding a corresponding target virtual node for the target task when the target computing ratio is greater than or equal to a first ratio threshold.
Here, in the case where the current status information is the utilization of the hardware encoding unit, the target computing ratio can be understood as the utilization of the candidate virtual node; the utilization can be set according to the actual application scenario. For example, the utilization can be any value in the interval from 0% to 100%, or any value in the interval [0, 1], and so on.
The first ratio threshold can be set according to the actual application scenario, and this specification does not specifically limit it; for example, it may be 70%, 0.7, etc.
Continuing the example above, the target task can be an audio/video production task, the current status information is the utilization of the hardware encoding unit, and the first ratio threshold can be 70%. On this basis, the scheduling system determines the utilization of the hardware decoding unit corresponding to a GPU instance and takes the utilization of that hardware decoding unit as the utilization of the GPU instance corresponding to it, where that utilization can be 80%. Then, upon determining that this utilization is greater than the first ratio threshold (70%), the scheduling system determines that the remaining computing capacity of the GPU instance is too low and may not be able to execute the current audio/video production task. Therefore, a new GPU instance is created for the audio/video production task by scaling out GPU instances, ensuring the normal execution of the audio/video production task.
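The threshold check in this example can be sketched as follows. Whether scale-out requires every candidate to be at or above the threshold, or only the candidate under consideration, is left open above; the sketch assumes every candidate must lack headroom, and the 70% figure is just the example value:

```python
from typing import List

FIRST_RATIO_THRESHOLD = 70.0   # the example threshold used above

def needs_new_instance(candidate_utilizations: List[float]) -> bool:
    # Scale out only when no existing candidate has headroom left,
    # i.e. all target computing ratios are at or above the threshold.
    return all(u >= FIRST_RATIO_THRESHOLD for u in candidate_utilizations)

print(needs_new_instance([80.0, 92.5]))   # True:  create a new GPU instance
print(needs_new_instance([80.0, 45.0]))   # False: an existing instance fits
```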
In the embodiments provided in this specification, the scheduling system can achieve the goal of scaling out GPU instances by requesting GPU instances from the inventory system, further ensuring the normal execution of heterogeneous computing tasks; a specific implementation is as follows.
Adding a corresponding target virtual node for the target task includes:
generating a virtual node acquisition request based on the target task, and sending the virtual node acquisition request to a virtual node providing module; and
receiving a to-be-determined virtual node sent by the virtual node providing module based on the virtual node acquisition request, and determining the to-be-determined virtual node as the target virtual node corresponding to the target task.
Here, the virtual node providing module can be understood as a module capable of providing virtual nodes for the target task, such as the inventory system in a Serverless platform. Correspondingly, the to-be-determined virtual node can be understood as a virtual node provided by the inventory system, such as a GPU instance newly created by the inventory system.
Continuing the example above, upon determining that the utilization of the existing GPU instances is insufficient to run the audio/video production task, the scheduling system can send a GPU instance acquisition request to the inventory system to request a new GPU instance; the inventory system creates a new GPU instance based on the scheduling system's request and sends the new GPU instance to the scheduling system. The scheduling system takes the newly requested GPU instance as the GPU instance for processing the audio/video production task, and subsequently the scheduling system can dispatch the audio/video production task to the new GPU instance to run.
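The request/response exchange with the inventory system might look like the following sketch; the class and field names are hypothetical, and a real inventory system would provision actual hardware rather than fabricate an identifier:

```python
from dataclasses import dataclass

@dataclass
class NodeAcquisitionRequest:
    task_id: str
    task_type: str          # e.g. "audio_video_production"

class InventorySystem:
    """Hypothetical inventory service that provisions GPU instances."""
    def handle(self, request: NodeAcquisitionRequest) -> str:
        # Stand-in for real provisioning logic.
        return f"gpu-instance-for-{request.task_id}"

def add_target_node(task_id: str, task_type: str,
                    inventory: InventorySystem) -> str:
    # Generate the acquisition request, send it to the providing module,
    # and confirm the returned node as the task's target virtual node.
    request = NodeAcquisitionRequest(task_id=task_id, task_type=task_type)
    return inventory.handle(request)

print(add_target_node("job-42", "audio_video_production", InventorySystem()))
```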
Further, the way the scheduling system matches a heterogeneous computing task with a GPU instance with better hardware performance from among the currently existing GPU instances is as follows.
Determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes includes:
selecting the target virtual node corresponding to the target task from among the candidate virtual nodes based on the current status information of the candidate virtual nodes.
Specifically, after determining the candidate virtual nodes, if the scheduling system determines, based on the current status information of the candidate virtual nodes, that the one or more candidate virtual nodes can process the target task, it selects one corresponding target virtual node for the target task from among the candidate virtual nodes.
Further, in the embodiments provided in this specification, selecting the target virtual node corresponding to the target task from among the candidate virtual nodes based on the current status information of the candidate virtual nodes includes:
determining target computing ratios of the candidate virtual nodes based on the current status information of the candidate virtual nodes;
determining, when a target computing ratio is less than the first ratio threshold, the minimum target computing ratio from among the target computing ratios; and
determining the target virtual node corresponding to the target task based on the candidate virtual node corresponding to the minimum target computing ratio.
Continuing the example above, the scheduling system determines the utilization of the one or more hardware decoding units corresponding to the GPU instances. Upon determining that some utilization is less than or equal to the first ratio threshold of 70%, it determines that among the current GPU instances there are GPU instances capable of running the audio/video production task. The scheduling system then determines the minimum utilization from among the utilizations that are less than or equal to the first ratio threshold of 70%.
In the case where there is one minimum utilization, the GPU instance corresponding to that minimum utilization is determined as the GPU instance for running the audio/video production task.
In the case where there are multiple minimum utilizations, one or more GPU instances are randomly determined from among the GPU instances corresponding to those minimum utilizations as the GPU instances for running the audio/video production task.
In practice, in the case where the utilization is determined to be less than or equal to the first ratio threshold of 70%, determining a GPU instance with better performance further includes: sorting the candidate virtual nodes based on the target computing ratios to obtain a sorted result of the candidate virtual nodes, where there are at least two candidate virtual nodes; and
determining a corresponding target virtual node for the target task from among the candidate virtual nodes based on the sorted result.
Continuing the example above, upon determining that the utilization is less than or equal to the first ratio threshold of 70%, the scheduling system determines the GPU instances whose utilization is less than or equal to the first ratio threshold of 70% and sorts those GPU instances by utilization to obtain a sorted result of the GPU instances, where the earlier a GPU instance appears in the sorted result, the lower its utilization, that is, the better the performance of the GPU instance. On this basis, the scheduling system dispatches the audio/video production task to the first GPU instance in the sorted result, or to the first specific number of GPU instances in the sorted result, where the specific number can be set according to the actual application scenario, for example the first three or the first ten.
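Selection among existing candidates can be sketched as below. Note one simplification: where the text breaks ties among equal minimum utilizations randomly, the sketch keeps a stable sort order; the instance names and the 70% threshold are the example values from above.

```python
from typing import Dict, List

def pick_target_nodes(candidates: Dict[str, float],
                      threshold: float = 70.0,
                      top_k: int = 1) -> List[str]:
    # Keep only candidates at or below the threshold, then sort ascending by
    # utilization so the least-loaded (best-performing) instances come first.
    eligible = {n: u for n, u in candidates.items() if u <= threshold}
    ranked = sorted(eligible, key=eligible.get)
    return ranked[:top_k]

candidates = {"gpu-1": 65.0, "gpu-2": 20.0, "gpu-3": 20.0, "gpu-4": 71.0}
print(pick_target_nodes(candidates, top_k=2))   # ['gpu-2', 'gpu-3']
```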
Further, in the embodiments provided in this specification, the physical computing unit is a GPU. In this case, after executing the target task through the target virtual node, the method further includes:
deleting the target virtual node when it is determined, based on the current utilization of the target hardware component, that the target task has finished executing.
Specifically, the scheduling system can monitor the current utilization of the target hardware component in real time, for example by obtaining the utilization metrics of the hardware components inside each GPU through a GPU monitor component (GPU Monitor), and when it determines from the utilization metrics that the target task has finished executing, it deletes the virtual node to be deleted, thereby saving hardware resources.
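One way to picture this cleanup step is sketched below; the zero-utilization completion signal and the deletion callback are assumptions for illustration, since the embodiments only say that completion is judged from the target hardware component's utilization metrics:

```python
from typing import Callable

IDLE_UTILIZATION = 0.0   # assumption: zero utilization means the task finished

def reap_finished_node(node_id: str,
                       current_utilization: float,
                       delete_node: Callable[[str], None]) -> bool:
    # Watch the target hardware component's utilization and tear the
    # instance down once the task is observed to have completed.
    if current_utilization <= IDLE_UTILIZATION:
        delete_node(node_id)
        return True
    return False

deleted: list = []
print(reap_finished_node("gpu-7", 0.0, deleted.append))  # True
print(deleted)                                           # ['gpu-7']
```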
In addition, referring to FIG. 4, FIG. 4 is a flowchart of task scheduling in a task processing method provided by one embodiment of this specification. Here, a function invocation request can be understood as the target task described above. On this basis, after receiving a request scheduling instruction, the scheduling system can determine whether the function request has a live GPU instance, that is, determine whether there is an instance capable of running the function request. Whether there is an instance capable of running the function request can be determined by checking whether the utilization of a GPU instance is less than the first ratio threshold (i.e., the preset maximum GPU hardware utilization).
If not, the scheduling system requests a new GPU instance from the inventory system and, after the request succeeds, continues to determine whether there is an instance capable of running the function request.
If so, the scheduling system determines that there are GPU instances capable of running the function invocation request and load-balances the function invocation request across all the GPU instances by RR polling, after which scheduling terminates. Here, load-balancing the function invocation request across all the GPU instances can use the above approach of sorting GPU instances by utilization to determine the GPU instances with better performance, thereby achieving load balancing.
On this basis, when a function request arrives, the Serverless scheduling system uses RR polling to load-balance requests in turn across every GPU instance after scale-out or scale-in, thereby ensuring that the GPU cluster as a whole is fully used.
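A bare-bones round-robin balancer is sketched below. It cycles over a snapshot of the instance list; a real scheduler would refresh the cycle as instances are scaled out or in, which this illustration deliberately omits.

```python
import itertools
from typing import List

class RoundRobinBalancer:
    """Minimal RR polling: each request lands on the next instance in turn."""
    def __init__(self, instances: List[str]) -> None:
        self._cycle = itertools.cycle(instances)

    def route(self) -> str:
        return next(self._cycle)

balancer = RoundRobinBalancer(["gpu-1", "gpu-2", "gpu-3"])
print([balancer.route() for _ in range(5)])
# ['gpu-1', 'gpu-2', 'gpu-3', 'gpu-1', 'gpu-2']
```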
In the embodiments provided in this specification, considering that the default Serverless elastic scaling and scheduling method based on request concurrency does not take into account the utilization of the individual components inside the GPU hardware, and therefore cannot solve the cost waste caused by excessive scale-out and the performance loss caused by premature scale-in, the task processing method provided in this specification can sample the utilization of the individual components inside the GPU hardware and allow users to set different elastic scaling metrics for heterogeneous computing tasks in different scenarios, thereby periodically scaling GPU instances elastically based on the hardware utilization of the GPU instances, and solving the cost waste caused by excessive scale-out and the performance loss caused by premature scale-in. A specific implementation is as follows.
The task processing method further includes:
determining the current status information of the initial virtual nodes based on the physical computing units corresponding to the initial virtual nodes; and
adding a new virtual node when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node addition condition; or
deleting idle virtual nodes from among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node deletion condition.
Specifically, the scheduling system can periodically determine the physical computing units corresponding to the initial virtual nodes and determine the current status information of the initial virtual nodes based on those physical computing units, for example by monitoring the utilization of each hardware component in the GPU in real time and determining, based on the utilization of each hardware component, the utilization of the GPU instance corresponding to that hardware component.
Then, when it is determined based on the current status information that the initial virtual nodes satisfy the preset node addition condition, it can be concluded that the current number of initial virtual nodes is too small, so a new virtual node is added; or,
when it is determined based on the current status information that the initial virtual nodes satisfy the preset node deletion condition, it is concluded that there are too many initial virtual nodes and the utilization of many virtual nodes is low, so idle virtual nodes among the initial virtual nodes are deleted. Virtual nodes are thus added and deleted flexibly and accurately, solving the cost waste caused by excessive scale-out and the performance loss caused by premature scale-in.
Further, in the embodiments provided in this specification, adding a new virtual node when it is determined, based on the current status information, that the initial virtual nodes satisfy the preset node addition condition includes:
determining a target computing ratio of an initial virtual node based on the current status information of the initial virtual node;
determining that the initial virtual node satisfies the preset node addition condition when the target computing ratio is greater than a node load threshold; and
adding a new virtual node via the virtual node providing module when the initial virtual node satisfies the preset node addition condition.
Here, the node load threshold can be understood as a threshold set in advance for each initial virtual node that indicates its utilization has reached, or is about to reach, a loaded state. The node load threshold can be set according to the actual application scenario; for example, it can be set to 80%.
Specifically, the scheduling system can determine the target computing ratio of an initial virtual node based on its current status information, determine that the initial virtual node satisfies the preset node addition condition when the target computing ratio is greater than the node load threshold, and, when the initial virtual node satisfies the preset node addition condition, request the virtual node providing module to add a new virtual node, thereby taking the virtual node provided by the virtual node providing module as the newly added virtual node.
Continuing the example above, the task processing method provided in this specification allows users to set different elastic metrics for heterogeneous computing tasks in different scenarios, the elastic metrics including a scale-out metric and a scale-in metric. The scale-out metric can be understood as a utilization threshold, that is, the node load threshold. On this basis, after dispatching the audio/video production task to the GPU instance with better performance, the scheduling system can monitor in real time the utilization of the GPU instance determined from the hardware decoding unit. When that utilization is greater than or equal to the scale-out metric (node load threshold) set by the user, the scheduling system can proactively request a new GPU instance from the inventory system and run the video production task jointly on the new GPU instance and the GPU instance previously determined for the audio/video production task based on utilization, thereby ensuring the normal running of the video production task and enabling users to set different elastic scaling metrics for heterogeneous computing tasks in different scenarios.
Further, in the embodiments provided in this specification, deleting idle virtual nodes from among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual nodes satisfy the preset node deletion condition includes:
determining a target computing ratio of an initial virtual node based on the current status information of the initial virtual node;
determining that the initial virtual node satisfies the preset node deletion condition when the target computing ratio is less than a node idle threshold; and
deleting the idle virtual nodes from among the initial virtual nodes when the initial virtual node satisfies the preset node deletion condition.
Here, the node idle threshold can be understood as a threshold set in advance for each initial virtual node that indicates its utilization has reached an idle state. The node idle threshold can be set according to the actual application scenario; for example, it can be set to 0% or 5%.
Continuing the example above, in the task processing method provided in this specification, after dispatching the audio/video production task to the GPU instance with better performance, the scheduling system can monitor the utilization of the hardware decoding unit in that GPU instance in real time. When that utilization is less than the scale-in metric (node idle threshold) set by the user, the scheduling system can proactively return the surplus GPU instances to the inventory system for deletion, thereby saving hardware resources while ensuring the normal running of the video production task.
In practice, during the deletion of a GPU instance, the GPU instance may still be running an audio/video production task; therefore, the scheduling system needs to delete the GPU instance only after the task corresponding to the GPU instance has finished executing. The specific approach is as follows.
Deleting the idle virtual nodes from among the initial virtual nodes includes:
monitoring task execution status information of an idle virtual node; and
deleting the idle virtual node when it is determined, based on the task execution status information, that the target task has finished executing.
Here, the task execution status information can be understood as information characterizing the progress of task execution.
Specifically, the scheduling system can monitor the task execution status information of the idle virtual node in real time and delete the idle virtual node when it determines, based on the task execution status information, that the target task has finished executing, thereby saving hardware resources.
In practice, the scheduling system can periodically perform a scaling check, which can be understood as the elastic scaling described in the above embodiments, in which the scheduling system scales elastically according to the different elastic metrics that users set for heterogeneous computing tasks in different scenarios. Referring to FIG. 5, FIG. 5 is a schematic diagram of elastic scaling in a task processing method provided by one embodiment of this specification, where elastic scaling is implemented based on GPU utilization and the user can configure a GPU elastic scaling metric for each function. As shown in FIG. 5, the scheduling system can periodically perform a scaling check to determine whether the aggregated utilization of all GPUs for the function (i.e., the function invocation request) is higher than the user configuration, that is, to determine whether the aggregated utilization of all GPUs running the function invocation request is higher than the user-configured GPU elastic scaling metric for each function invocation request. For example, in an audio/video scenario, the GPU elastic scaling metric can be configured to scale out when a GPU instance's hardware encoding utilization exceeds 80%; in an AI scenario, the GPU elastic scaling metric can be configured to scale in when a GPU instance's CUDA Core hardware utilization falls below 20%.
On this basis, if yes, that is, if the scheduling system determines that the aggregated utilization is higher than the user configuration, it requests a new GPU instance from the inventory system and then scheduling terminates. If no, that is, if the scheduling system determines that the aggregated utilization is lower than the user configuration, it returns previously requested GPU instances to the inventory system and then scheduling terminates.
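One periodic check of the loop in FIG. 5 can be sketched as a single decision function; the threshold values are the example figures from the scenarios above, and the returned strings merely name the actions taken against the inventory system:

```python
def autoscale_check(aggregate_utilization: float,
                    scale_out_threshold: float,
                    scale_in_threshold: float) -> str:
    # Compare a function's aggregated GPU utilization against the
    # user-configured elastic scaling metrics.
    if aggregate_utilization > scale_out_threshold:
        return "request a new GPU instance from the inventory system"
    if aggregate_utilization < scale_in_threshold:
        return "return a GPU instance to the inventory system"
    return "no action"

# Audio/video scenario: scale out when encoder utilization > 80%.
print(autoscale_check(85.0, scale_out_threshold=80.0, scale_in_threshold=20.0))
# AI scenario: scale in when CUDA Core utilization < 20%.
print(autoscale_check(12.0, scale_out_threshold=80.0, scale_in_threshold=20.0))
```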
It should be noted that when the task processing method is applied in a Serverless scenario to address the elastic scaling problem of GPU-based heterogeneous computing tasks, the current status information can also include multi-dimensional mixed metrics, facilitating subsequent comprehensive scheduling decisions based on those multi-dimensional mixed metrics so as to better adapt to the scaling requirements of heterogeneous computing tasks in various scenarios.
In addition, in the task processing method provided in this specification, GPU instance scale-out adopts a relatively aggressive strategy to guarantee the service performance of user functions, whereas GPU instance scale-in adopts a relatively lazy strategy, taking the cost of user functions into account. The aggressiveness coefficient and laziness coefficient take effect at the step in FIG. 5 where it is determined "whether the aggregated utilization of all GPU instances of the function is higher than the user configuration".
With the task processing method provided in this specification, when a target task is received, a corresponding target virtual node is determined for the target task based on the current status information of the initial virtual nodes and the task type information of the target task, and the target task is executed through the target virtual node, thereby meeting the need to accurately determine a target virtual node for the target task.
The following further describes the task processing method with reference to FIG. 6, taking the application of the task processing method provided in this specification in a scenario of elastic scaling based on GPU utilization as an example. FIG. 6 shows a processing flowchart of a task processing method provided by one embodiment of this specification and provides a system framework for elastic scaling based on GPU utilization, where the system framework includes a request access system, a scheduling system, an inventory system, GPU instances, and a GPU monitor (GPU Monitor). The GPU monitor component (GPU Monitor) in this system framework is used to obtain the utilization metrics of the internal hardware components of each GPU instance. It should be noted that the hardware components differ according to the scenario in which the task processing method provided in this specification is applied; for example, when the task processing method is applied in an audio/video production scenario, the hardware components can be the hardware encoding unit and the hardware decoding unit; when the task processing method is applied in an AI production scenario, the hardware components can be CUDA Cores, etc. Correspondingly, the hardware component utilization metrics include, but are not limited to, hardware encoding utilization in audio/video production scenarios, hardware decoding utilization in audio/video production scenarios, CUDA Core utilization in AI production scenarios, Tensor Core utilization in AI production scenarios, NVLINK bandwidth utilization in AI production scenarios, GPU memory utilization in all scenarios, and so on.
On this basis, the GPU monitor periodically synchronizes these GPU hardware component utilizations to the scheduling system. When a user finishes writing a Serverless service function and initiates a function invocation request to the request access system of the Serverless platform, the request access system dispatches the request to the scheduling system through the access protocol, and the Serverless scheduling system elastically scales the GPU instances based on GPU hardware utilization and the corresponding scheduling strategy. For example, the Serverless scheduling system can elastically scale GPU instances by requesting GPU instances from, or returning them to, the inventory system based on GPU hardware utilization and the corresponding scheduling strategy. It should be noted that the scheduling strategy can be set according to the actual application scenario and is not specifically limited in this specification; an example is the scheduling strategy shown in FIG. 4.
At the same time, after elastically scaling the GPU instances, the Serverless scheduling system dispatches the user's function requests to different GPU instances so that the corresponding functions can run; the GPU instances are responsible for the actual execution of the functions.
The task processing method provided by the embodiments of this specification provides, in a Serverless scenario, an elastic scaling control method based on different GPU metrics across multiple dimensions, so that heterogeneous computing tasks in different scenarios (audio/video production, AI production, graphics and image production) can be configured with different elastic scaling metrics, achieving an elastic scaling strategy that balances performance and cost.
Corresponding to the above method embodiments, this specification also provides task processing apparatus embodiments. FIG. 7 shows a schematic structural diagram of a task processing apparatus provided by one embodiment of this specification. As shown in FIG. 7, the apparatus includes:
a receiving module 702 configured to determine current status information of initial virtual nodes based on a received target task, wherein the current status information is determined based on the physical computing units corresponding to the initial virtual nodes;
a determining module 704 configured to determine, based on task type information of the target task, candidate virtual nodes corresponding to the task type information from among the initial virtual nodes; and
an execution module 706 configured to determine a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes, and to execute the target task through the target virtual node.
Optionally, the execution module 706 is further configured to: select the target virtual node corresponding to the target task from among the candidate virtual nodes based on the current status information of the candidate virtual nodes.
Optionally, the execution module 706 is further configured to: add a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes.
Optionally, the task processing apparatus further includes a node processing module configured to: determine the current status information of the initial virtual nodes based on the physical computing units corresponding to the initial virtual nodes; add a new virtual node when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node addition condition; or delete idle virtual nodes from among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node deletion condition.
Optionally, in the task processing apparatus, the physical computing unit is a GPU; correspondingly, the receiving module 702 is further configured to: determine, based on the received target task, the hardware components of the GPU and the current utilization of the hardware components; and determine, from among the hardware components, the target hardware component corresponding to the initial virtual node, and take the current utilization of the target hardware component as the current status information of the initial virtual node.
Optionally, the task processing apparatus further includes a deletion module configured to: delete the target virtual node when it is determined, based on the current utilization of the target hardware component, that the target task has finished executing.
Optionally, the execution module 706 is further configured to: determine a target computing ratio of the candidate virtual nodes based on the current status information of the candidate virtual nodes; and add a corresponding target virtual node for the target task when the target computing ratio is greater than or equal to a first ratio threshold.
Optionally, the execution module 706 is further configured to: generate a virtual node acquisition request based on the target task and send the virtual node acquisition request to a virtual node providing module; and receive a to-be-determined virtual node sent by the virtual node providing module based on the virtual node acquisition request, and determine the to-be-determined virtual node as the target virtual node corresponding to the target task.
Optionally, the execution module 706 is further configured to: determine target computing ratios of the candidate virtual nodes based on the current status information of the candidate virtual nodes; determine, when the target computing ratios are less than the first ratio threshold, the minimum target computing ratio from among the target computing ratios; and determine the target virtual node corresponding to the target task based on the candidate virtual node corresponding to the minimum target computing ratio.
Optionally, the node processing module is further configured to: determine a target computing ratio of an initial virtual node based on the current status information of the initial virtual node; determine that the initial virtual node satisfies the preset node addition condition when the target computing ratio is greater than a node load threshold; and add a new virtual node via the virtual node providing module when the initial virtual node satisfies the preset node addition condition.
Optionally, the node processing module is further configured to: determine a target computing ratio of an initial virtual node based on the current status information of the initial virtual node; determine that the initial virtual node satisfies the preset node deletion condition when the target computing ratio is less than a node idle threshold; and delete the idle virtual nodes from among the initial virtual nodes when the initial virtual node satisfies the preset node deletion condition.
Optionally, the receiving module 702 is further configured to: determine, based on the received target task, the physical computing sub-units in the physical computing unit and the current operating information of the physical computing sub-units; determine, from among the physical computing sub-units, the target physical computing sub-unit corresponding to the initial virtual node; and take the current operating information of the target physical computing sub-unit as the current status information of the initial virtual node.
Optionally, the task processing apparatus further includes an information receiving module configured to: receive the current operating information of the physical computing sub-units in the physical computing unit sent by an information collection module, where the information collection module is a module that monitors the current operating information of the physical computing sub-units in the physical computing unit.
With the task processing apparatus provided in this specification, when a target task is received, a corresponding target virtual node is determined for the target task based on the current status information of the initial virtual nodes and the task type information of the target task, and the target task is executed through the target virtual node, thereby meeting the need to accurately determine a target virtual node for the target task.
The above is a schematic solution of a task processing apparatus of this embodiment. It should be noted that the technical solution of the task processing apparatus and the technical solution of the task processing method described above belong to the same concept; for details not described in detail in the technical solution of the task processing apparatus, refer to the description of the technical solution of the task processing method above.
FIG. 8 shows a structural block diagram of a computing device 800 provided according to one embodiment of this specification. The components of the computing device 800 include, but are not limited to, a memory 810 and a processor 820. The processor 820 is connected to the memory 810 via a bus 830, and a database 850 is used to store data.
The computing device 800 also includes an access device 840 that enables the computing device 800 to communicate via one or more networks 860. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 840 can include one or more of any type of wired or wireless network interface (e.g., a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a WiMAX interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so on.
In one embodiment of this specification, the above components of the computing device 800 and other components not shown in FIG. 8 can also be connected to each other, for example via a bus. It should be understood that the structural block diagram of the computing device shown in FIG. 8 is for exemplary purposes only and is not a limitation on the scope of this specification. Those skilled in the art can add or replace other components as needed.
The computing device 800 can be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., a tablet computer, personal digital assistant, laptop computer, notebook computer, netbook, etc.), a mobile telephone (e.g., a smartphone), a wearable computing device (e.g., a smart watch, smart glasses, etc.) or another type of mobile device, or a stationary computing device such as a desktop computer or PC. The computing device 800 can also be a mobile or stationary server.
The processor 820 is configured to execute computer-executable instructions which, when executed by the processor 820, implement the steps of the above task processing method.
The above is a schematic solution of a computing device of this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the task processing method described above belong to the same concept; for details not described in detail in the technical solution of the computing device, refer to the description of the technical solution of the task processing method above.
An embodiment of this specification also provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the above task processing method.
The above is a schematic solution of a computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the task processing method described above belong to the same concept; for details not described in detail in the technical solution of the storage medium, refer to the description of the technical solution of the task processing method above.
An embodiment of this specification also provides a computer program, wherein when the computer program is executed in a computer, the computer is caused to perform the steps of the above task processing method.
The above is a schematic solution of a computer program of this embodiment. It should be noted that the technical solution of the computer program and the technical solution of the task processing method described above belong to the same concept; for details not described in detail in the technical solution of the computer program, refer to the description of the technical solution of the task processing method above.
The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the figures do not necessarily require the specific order shown, or a sequential order, to achieve the desired results. In certain implementations, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in the form of source code, virtual node code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium can be appropriately added to or deleted from according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for the sake of brevity, the foregoing method embodiments are all described as a series of action combinations; however, those skilled in the art should be aware that the embodiments of this specification are not limited by the described order of actions, because according to the embodiments of this specification, certain steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also be aware that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by the embodiments of this specification.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in a particular embodiment, refer to the relevant descriptions of the other embodiments.
The preferred embodiments of this specification disclosed above are intended only to help explain this specification. The optional embodiments do not exhaustively describe all details, nor do they limit the invention to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of the embodiments of this specification. These embodiments were selected and specifically described in order to better explain the principles and practical applications of the embodiments of this specification, so that those skilled in the art can well understand and use this specification. This specification is limited only by the claims and their full scope and equivalents.

Claims (14)

  1. A task processing method, comprising:
    determining current status information of initial virtual nodes based on a received target task, wherein the current status information is determined based on the physical computing units corresponding to the initial virtual nodes;
    determining, based on task type information of the target task, candidate virtual nodes corresponding to the task type information from among the initial virtual nodes; and
    determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes, and executing the target task through the target virtual node.
  2. The task processing method according to claim 1, wherein determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes comprises:
    selecting the target virtual node corresponding to the target task from among the candidate virtual nodes based on the current status information of the candidate virtual nodes.
  3. The task processing method according to claim 1, wherein determining a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes comprises:
    adding a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes.
  4. The task processing method according to claim 1, further comprising:
    determining the current status information of the initial virtual nodes based on the physical computing units corresponding to the initial virtual nodes; and
    adding a new virtual node when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node addition condition; or
    deleting idle virtual nodes from among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual nodes satisfy a preset node deletion condition.
  5. The task processing method according to claim 1, wherein the physical computing unit is a GPU;
    correspondingly, determining the current status information of the initial virtual nodes based on the received target task comprises:
    determining, based on the received target task, the hardware components of the GPU and the current utilization of the hardware components; and
    determining, from among the hardware components, the target hardware component corresponding to the initial virtual node, and taking the current utilization of the target hardware component as the current status information of the initial virtual node.
  6. The task processing method according to claim 5, further comprising, after executing the target task through the target virtual node:
    deleting the target virtual node when it is determined, based on the current utilization of the target hardware component, that the target task has finished executing.
  7. The task processing method according to claim 3, wherein adding a corresponding target virtual node for the target task based on the current status information of the candidate virtual nodes comprises:
    determining a target computing ratio of the candidate virtual nodes based on the current status information of the candidate virtual nodes; and
    adding a corresponding target virtual node for the target task when the target computing ratio is greater than or equal to a first ratio threshold.
  8. The task processing method according to claim 7, wherein adding a corresponding target virtual node for the target task comprises:
    generating a virtual node acquisition request based on the target task, and sending the virtual node acquisition request to a virtual node providing module; and
    receiving a to-be-determined virtual node sent by the virtual node providing module based on the virtual node acquisition request, and determining the to-be-determined virtual node as the target virtual node corresponding to the target task.
  9. The task processing method according to claim 2, wherein selecting the target virtual node corresponding to the target task from among the candidate virtual nodes based on the current status information of the candidate virtual nodes comprises:
    determining target computing ratios of the candidate virtual nodes based on the current status information of the candidate virtual nodes;
    determining, when the target computing ratios are less than a first ratio threshold, the minimum target computing ratio from among the target computing ratios; and
    determining the target virtual node corresponding to the target task based on the candidate virtual node corresponding to the minimum target computing ratio.
  10. The task processing method according to claim 4, wherein adding a new virtual node when it is determined, based on the current status information, that the initial virtual nodes satisfy the preset node addition condition comprises:
    determining a target computing ratio of the initial virtual node based on the current status information of the initial virtual node;
    determining that the initial virtual node satisfies the preset node addition condition when the target computing ratio is greater than a node load threshold; and
    adding a new virtual node via a virtual node providing module when the initial virtual node satisfies the preset node addition condition.
  11. The task processing method according to claim 4, wherein deleting idle virtual nodes from among the initial virtual nodes when it is determined, based on the current status information, that the initial virtual nodes satisfy the preset node deletion condition comprises:
    determining a target computing ratio of the initial virtual node based on the current status information of the initial virtual node;
    determining that the initial virtual node satisfies the preset node deletion condition when the target computing ratio is less than a node idle threshold; and
    deleting the idle virtual nodes from among the initial virtual nodes when the initial virtual node satisfies the preset node deletion condition.
  12. The task processing method according to claim 1, wherein determining the current status information of the initial virtual nodes based on the received target task comprises:
    determining, based on the received target task, physical computing sub-units in the physical computing unit and current operating information of the physical computing sub-units;
    determining, from among the physical computing sub-units, a target physical computing sub-unit corresponding to the initial virtual node; and
    taking the current operating information of the target physical computing sub-unit as the current status information of the initial virtual node.
  13. The task processing method according to claim 12, further comprising, before determining the current status information of the initial virtual nodes based on the received target task:
    receiving the current operating information of the physical computing sub-units in the physical computing unit sent by an information collection module, wherein the information collection module is a module that monitors the current operating information of the physical computing sub-units in the physical computing unit.
  14. A computing device, comprising:
    a memory and a processor;
    wherein the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions which, when executed by the processor, implement the steps of the task processing method according to any one of claims 1 to 13.
PCT/CN2023/088249 2022-04-24 2023-04-14 Task processing method WO2023207623A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210454886.1 2022-04-24
CN202210454886.1A CN114995997A (zh) 2022-04-24 2022-04-24 Task processing method

Publications (1)

Publication Number Publication Date
WO2023207623A1 true WO2023207623A1 (zh) 2023-11-02

Family

ID=83026219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/088249 2022-04-24 2023-04-14 Task processing method

Country Status (2)

Country Link
CN (1) CN114995997A (zh)
WO (1) WO2023207623A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114995997A (zh) 2022-04-24 2022-09-02 阿里巴巴(中国)有限公司 Task processing method
CN115658269B (zh) * 2022-11-01 2024-02-27 上海玫克生储能科技有限公司 Heterogeneous computing terminal for task scheduling


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090158275A1 (en) * 2007-12-13 2009-06-18 Zhikui Wang Dynamically Resizing A Virtual Machine Container
KR20130115553A (ko) * 2012-04-12 2013-10-22 한국전자통신연구원 Two-level resource management method and apparatus for dynamic resource management
CN109408205A (zh) * 2017-08-16 2019-03-01 北京京东尚科信息技术有限公司 Task scheduling method and apparatus based on Hadoop cluster
WO2021128737A1 (zh) * 2019-12-25 2021-07-01 上海商汤智能科技有限公司 Resource scheduling method and apparatus, electronic device, and storage medium
CN112269641A (zh) * 2020-11-18 2021-01-26 网易(杭州)网络有限公司 Scheduling method and apparatus, electronic device, and storage medium
CN112486653A (zh) * 2020-12-02 2021-03-12 胜斗士(上海)科技技术发展有限公司 Method, apparatus, and system for scheduling multiple types of computing resources
CN112286644A (zh) * 2020-12-25 2021-01-29 同盾控股有限公司 Elastic scheduling method, system, device, and storage medium for GPU virtualized computing power
CN114371926A (zh) * 2022-03-22 2022-04-19 清华大学 Fine-grained resource allocation method and apparatus, electronic device, and medium
CN114995997A (zh) * 2022-04-24 2022-09-02 阿里巴巴(中国)有限公司 Task processing method

Also Published As

Publication number Publication date
CN114995997A (zh) 2022-09-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795056

Country of ref document: EP

Kind code of ref document: A1