CN117687774A

CN117687774A - Task model training method and computing power scheduling method and system for computing power scheduling

Info

Publication number: CN117687774A
Application number: CN202311507154.5A
Authority: CN
Inventors: 杨明烜; 洪学海
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2023-11-13
Filing date: 2023-11-13
Publication date: 2024-03-12
Anticipated expiration: 2043-11-13
Also published as: CN117687774B

Abstract

The invention provides a task model training method for computing power dispatching and a computing power dispatching method and system, wherein the computing power dispatching method comprises the following steps: acquiring a current task to be scheduled, wherein the task comprises a task type and a first resource usage amount applied for executing the task; according to the task model obtained through training, under the condition that the task performance is not influenced or within an acceptable range, carrying out dynamic telescopic adjustment on the application quota of the resource with small influence on the task performance in the first resource usage amount to obtain the resource usage amount allocated for executing the task; and performing computational scheduling on the task according to the allocated resource usage. The method and the device can complete self-adaptive resource dynamic scheduling by integrating the influence of the multidimensional resource on the task performance, and effectively improve the task deployment completion condition and the multidimensional resource utilization rate in AI computing power scheduling.

Description

Task model training method and computing power scheduling method and system for computing power scheduling

技术领域Technical field

本发明涉及云计算领域，具体涉及云计算的技术资源调度领域，尤其涉及用于算力调度的任务模型训练方法及算力调度方法和系统。The present invention relates to the field of cloud computing, specifically to the technical resource scheduling field of cloud computing, and in particular to a task model training method for computing power scheduling and a computing power scheduling method and system.

背景技术Background technique

随着近年来人工智能技术在生活中的广泛应用，越来越多的智能算力集群被构建。在云计算资源调度领域，提出人工智能(artificial intelligence,AI)算力的集群调度问题，用于解决AI算力资源的利用率和机器学习任务的部署效率问题，最终实现AI算力的计算效率增长。目前AI算力集群调度主要针对机器学习(machine learning,ML)任务特性进行分析，对AI算力中的图形处理器(graphics processing unit，GPU)资源的分配进行优化，从而提高ML任务性能或GPU资源利用率。例如，考虑ML任务在训练过程中周期性、抢占性和位置敏感性的特性用以指导GPU分配和共享，满足不同类型的训练任务需求的高效调度，并实现GPU资源利用率的提升。现有人工智能算力的资源调度方法通常只针对GPU资源这一单一维度设计调度策略，但实际的任务性能不是只考虑单一的GPU资源分配就能够决定，而是由多维度资源共同的作用下所决定的，所以现有调度方法的设计性能并不能在实际生产集群中得到完全实现。而且，这种AI算力调度方法不能充分利用AI集群中多个维度的整体资源，造成有限集群资源的浪费。With the widespread application of artificial intelligence technology in life in recent years, more and more intelligent computing power clusters have been built. In the field of cloud computing resource scheduling, the cluster scheduling problem of artificial intelligence (AI) computing power is proposed to solve the problem of utilization of AI computing power resources and deployment efficiency of machine learning tasks, and ultimately achieve the computing efficiency of AI computing power. increase. At present, AI computing cluster scheduling mainly analyzes the characteristics of machine learning (ML) tasks and optimizes the allocation of graphics processing unit (GPU) resources in AI computing to improve ML task performance or GPU Resource utilization. For example, consider the characteristics of periodicity, preemption, and location sensitivity of ML tasks during the training process to guide GPU allocation and sharing, meet the efficient scheduling of different types of training task requirements, and improve GPU resource utilization. Existing resource scheduling methods for artificial intelligence computing power usually only design scheduling strategies for a single dimension of GPU resources. However, actual task performance is not determined by considering only a single GPU resource allocation, but is determined by the joint action of multi-dimensional resources. is determined, so the design performance of existing scheduling methods cannot be fully realized in actual production clusters. Moreover, this AI computing power scheduling method cannot fully utilize the overall resources of multiple dimensions in the AI cluster, resulting in a waste of limited cluster resources.

发明内容Contents of the invention

为了解决上述问题，本发明提出结合机器学习方法构建的任务模型，并利用其对当前动态队列中的任务进行资源分配的伸缩调整，完成算力调度。In order to solve the above problems, the present invention proposes a task model constructed by combining machine learning methods, and uses it to scale and adjust resource allocation for tasks in the current dynamic queue to complete computing power scheduling.

根据第一方面，本发明实施例提供一种用于算力调度的任务模型的训练方法，包括：以任务执行时间、任务GPU利用率、任务平均内存占用、任务CPU核数、任务类型作为训练样本；其中，所述任务GPU利用率、所述任务平均内存占用、所述任务CPU核数、所述任务类型作为任务模型的输入，所述任务执行时间为标签作为期望输出值；根据训练样本中的数据，基于梯度提升的回归树对所述任务模型进行训练；将训练后的所述任务模型作为算力资源分配调度方法中任务执行时间的预测模型。According to the first aspect, embodiments of the present invention provide a training method for a task model for computing power scheduling, including: using task execution time, task GPU utilization, task average memory usage, task CPU core number, and task type as training Sample; wherein, the task GPU utilization, the average memory usage of the task, the number of CPU cores of the task, and the task type are used as the input of the task model, and the task execution time is labeled as the expected output value; according to the training sample The task model is trained based on the data in the gradient boosted regression tree; the trained task model is used as a prediction model of task execution time in the computing resource allocation and scheduling method.

优选的，同一所述训练样本中的所述任务类型是相同的。Preferably, the task types in the same training sample are the same.

优选的，所述任务类型包括计算机视觉任务、自然语言处理任务、强化学习任务、图神经网络任务、推荐任务。Preferably, the task types include computer vision tasks, natural language processing tasks, reinforcement learning tasks, graph neural network tasks, and recommendation tasks.

优选的，所述回归树的深度为10，采用的基学习器数量为100，学习率为0.1。Preferably, the depth of the regression tree is 10, the number of base learners used is 100, and the learning rate is 0.1.

根据第二方面，本发明实施例提供一种算力调度的方法，包括：获取当前待调度的任务，所述任务包括任务类型以及为所述任务的执行所申请的第一资源使用量；根据第一方面中任一方法得到的任务模型，在不影响任务性能或在可接受的范围内的条件下，对第一资源使用量中的对任务性能影响小的资源的申请配额进行动态伸缩调整，得到为所述任务的执行所分配的资源使用量；根据所述分配的资源使用量对所述任务进行算力调度。According to the second aspect, an embodiment of the present invention provides a method of computing power scheduling, including: obtaining a task currently to be scheduled, where the task includes a task type and a first resource usage amount applied for the execution of the task; according to The task model obtained by any of the methods in the first aspect dynamically adjusts the application quota for resources that have a small impact on task performance in the first resource usage without affecting task performance or within an acceptable range. , obtain the resource usage allocated for the execution of the task; perform computing power scheduling on the task according to the allocated resource usage.

优选的，所述根据第一方面中任一方法得到的任务模型，在不影响任务性能或在可接受的范围内的条件下，对第一资源使用量中的对任务性能影响小的资源申请配额进行动态伸缩调整，得到为所述任务的执行所分配的资源使用量，包括：利用根据第一方面中任一方法得到的任务模型，根据所述任务类型和第一资源使用量，输出所述任务的第一预期执行时间；对所述资源维度中任务性能影响小的资源申请配额，在不影响任务性能或在可接受的范围内的条件下，进行伸缩调整，得到第二资源使用量；根据所述第二资源使用量，利用第一方面中任一方法得到的所述任务模型，输出所述任务的第二预期执行时间；在所述第一预期执行时间和所述第二预期执行时间的差异在预设的差异阈值之下时，将所述第二资源使用量作为所述分配的资源使用量。Preferably, the task model obtained according to any method in the first aspect applies for resources that have a small impact on task performance in the first resource usage without affecting task performance or within an acceptable range. The quota is dynamically scaled and adjusted to obtain the resource usage allocated for the execution of the task, including: using the task model obtained according to any method in the first aspect, and outputting the resource usage according to the task type and the first resource usage. The first expected execution time of the above task; the resource application quota that has a small impact on task performance in the resource dimension is scaled and adjusted without affecting task performance or within an acceptable range to obtain the second resource usage ; According to the second resource usage, use the task model obtained by any method in the first aspect to output the second expected execution time of the task; between the first expected execution time and the second expected execution time When the difference in execution time is below a preset difference threshold, the second resource usage is used as the allocated resource usage.

优选的，还包括：若所述第一预期执行时间和所述第二预期执行时间的差异超出设定的差异阈值，则利用另一个资源伸缩率对所述资源维度中任务性能影响小的资源申请配额，进行资源伸缩调整。Preferably, the method further includes: if the difference between the first expected execution time and the second expected execution time exceeds a set difference threshold, using another resource scaling rate that has a small impact on task performance in the resource dimension. Apply for quotas and make resource scaling adjustments.

优选的，还包括：在利用所有的所述资源伸缩率对所述资源维度进行资源伸缩调整后，所述第一预期执行时间与所述第二预期执行时间的执行时间差异均超出所述差异阈值时，则将所述第一资源使用量作为所述分配的资源使用量。Preferably, the method further includes: after using all the resource scaling rates to perform resource scaling adjustment on the resource dimension, the execution time difference between the first expected execution time and the second expected execution time exceeds the difference. When the threshold is reached, the first resource usage is used as the allocated resource usage.

优选的，所述利用所述任务模型，针对所述资源维度执行设定资源伸缩率下的资源伸缩调整，得到第二资源使用量，包括：从所述资源维度中，筛选出对所述任务的执行时间影响程度小于设定的影响程度阈值的资源维度，作为待伸缩资源维度；针对所述待伸缩资源维度执行设定资源伸缩率下的资源伸缩调整，得到第二资源使用量。Preferably, using the task model to perform resource scaling adjustment under a set resource scaling rate for the resource dimension to obtain the second resource usage includes: filtering out the requirements for the task from the resource dimension. The resource dimension whose execution time impact is smaller than the set impact threshold is used as the resource dimension to be scaled; resource scaling adjustment is performed under the set resource scaling rate for the resource dimension to be scaled to obtain the second resource usage.

根据第三方面，本发明实施例提供一种算力调度系统，包括：服务器、任务队列、调度器，其特征在于，还包括任务历史运行日志数据库和如第一方面中任一所述的任务模型；所述任务历史运行日志数据库用于收集并保存任务的历史数据，包括资源使用情况、执行时间、任务类型；所述任务模型用于对待调度的任务中的对任务性能影响小的资源的申请配额进行动态伸缩调整，以得到为所述任务的执行所分配的资源使用量。According to the third aspect, an embodiment of the present invention provides a computing power scheduling system, which includes: a server, a task queue, and a scheduler. It is characterized in that it also includes a task history running log database and a task as described in any one of the first aspects. Model; the task historical running log database is used to collect and save historical data of tasks, including resource usage, execution time, and task type; the task model is used to identify resources that have a small impact on task performance in tasks to be scheduled. The application quota is dynamically scaled and adjusted to obtain the resource usage allocated for the execution of the task.

根据第四方面，本发明实施例提供一种存储介质，存储介质中存储有计算机可执行指令，计算机可执行指令被处理器加载并执行时，实现如第一、第二方面中任一方法的步骤。According to a fourth aspect, embodiments of the present invention provide a storage medium in which computer-executable instructions are stored. When the computer-executable instructions are loaded and executed by a processor, the methods of any one of the first and second aspects are implemented. step.

与现有技术相比，本发明的优点在于：Compared with the prior art, the advantages of the present invention are:

相比于当前局限于面向GPU资源分配而设计的AI算力调度方法，本发明调度方法能够综合多维度资源对任务性能的影响完成自适应资源动态调度，有效提升AI算力调度中任务部署完成情况和多维度资源利用率。这种动态调整资源需求的方法，可以优化在用户任务申请过程中出现的资源超额申请问题，降低资源浪费。同时，该方法也优化了由于各维度资源之间的不匹配，导致ML任务部分维度资源的低效利用，提升AI算力的整体资源利用效率。Compared with the current AI computing power scheduling method that is limited to GPU resource allocation, the scheduling method of the present invention can comprehensively integrate the impact of multi-dimensional resources on task performance to complete adaptive resource dynamic scheduling, effectively improving the completion of task deployment in AI computing power scheduling. situation and multidimensional resource utilization. This method of dynamically adjusting resource requirements can optimize the resource overapplication problem that occurs during user task application and reduce resource waste. At the same time, this method also optimizes the inefficient use of some dimensional resources in ML tasks due to the mismatch between resources in various dimensions, and improves the overall resource utilization efficiency of AI computing power.

附图说明Description of the drawings

以下参照附图对本发明实施例作进一步说明，其中：The embodiments of the present invention will be further described below with reference to the accompanying drawings, wherein:

图1为根据本发明实施例的一种算力调度系统架构示意图；Figure 1 is a schematic diagram of the architecture of a computing power scheduling system according to an embodiment of the present invention;

图2为根据本发明实施例的一种基于任务模型的算力调度方法的流程示意图；Figure 2 is a schematic flowchart of a computing power scheduling method based on a task model according to an embodiment of the present invention;

图3为根据本发明实施例的多维度资源伸缩调度步骤流程图；Figure 3 is a flow chart of multi-dimensional resource scaling and scheduling steps according to an embodiment of the present invention;

图4是根据本发明实施例的搭载调度系统的硬件设备架构示意图。Figure 4 is a schematic diagram of the hardware device architecture equipped with a scheduling system according to an embodiment of the present invention.

具体实施方式Detailed ways

为了使本发明的目的，技术方案及优点更加清楚明白，以下通过具体实施例对本发明进一步详细说明。应当理解，此处所描述的具体实施例仅用以解释本发明，并不用于限定本发明。In order to make the purpose, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below through specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit the present invention.

在进行AI算力调度的研究时发明人发现，现有算力资源利用率低下甚至造成资源浪费的原因，是在调度过程中只考虑了GPU资源而忽视了多个维度资源整体对实际任务性能的影响。为了提升AI算力的任务运行性能和资源利用率，并克服现有AI算力调度方法只针对GPU资源这一维度设计调度策略的不足，发明人结合现有多维度资源的分析和面向实际生产集群的分析提出，通过机器学习建模不同类型AI算力集群中机器学习(machinelearning,ML)任务的多维度资源需求对任务执行的影响，结合算力资源状态完成自适应资源伸缩分配调度，指导AI算力场景下的资源调度，以提升用户任务侧的任务部署执行效率和硬件资源侧的资源利用率。在本发明中，所述多维度资源通常包括CPU核数、GPU用量以及内存用量；在部分实施例中，还可以包括申请占用网络资源的容量。When conducting research on AI computing power scheduling, the inventor found that the reason for the low utilization rate of existing computing power resources and even waste of resources is that only GPU resources are considered during the scheduling process and the overall impact of multiple dimensions of resources on actual task performance is ignored. Impact. In order to improve the task running performance and resource utilization of AI computing power, and overcome the shortcomings of existing AI computing power scheduling methods that only design scheduling strategies for the dimension of GPU resources, the inventor combined the analysis of existing multi-dimensional resources and the actual production-oriented The analysis of the cluster proposes to use machine learning to model the impact of multi-dimensional resource requirements of machine learning (ML) tasks in different types of AI computing clusters on task execution, and complete adaptive resource scaling allocation and scheduling based on the status of computing resources to guide Resource scheduling in AI computing power scenarios to improve task deployment execution efficiency on the user task side and resource utilization on the hardware resource side. In the present invention, the multi-dimensional resources usually include the number of CPU cores, GPU usage, and memory usage; in some embodiments, they may also include the capacity to apply for network resources.

图1为根据本发明实施例的一种算力调度系统架构示意图。所述算力调度系统由用户、不同类型的服务器、任务队列、任务历史运行日志数据库以及任务资源性能模型(以下称任务模型)和调度器所组成。其中，用户为提交机器学习任务的主体，可以在线或者离线提交任务，根据用户提交的任务可以更新当前调度轮次的任务队列。服务器为运行任务的主体，算力调度的过程为将任务部署到服务器上；服务器类型包括核心算力资源和边缘算力资源，具体包括物理机、虚拟机、边缘计算节点等。任务队列负责维护在线提交用户任务所组成的动态任务列表，保存所有等待调度的任务，在算力调度的过程中需要从任务队列中选取可以在当前服务器状态下执行的任务，以进行算力调度。任务历史运行日志数据库负责收集并保存任务的历史数据，包括资源使用情况、执行时间、任务类型等；在部分实施例中，历史数据包括任务起始时间、任务结束时间、任务GPU利用率、任务平均内存占用、任务CPU核数、任务类型(即任务负载的应用类型)。任务模型负责指导调度过程中任务的自适应资源伸缩调整过程。Figure 1 is a schematic diagram of the architecture of a computing power scheduling system according to an embodiment of the present invention. The computing power scheduling system is composed of users, different types of servers, task queues, task historical running log databases, task resource performance models (hereinafter referred to as task models) and schedulers. Among them, the user is the subject who submits machine learning tasks and can submit tasks online or offline. The task queue of the current scheduling round can be updated according to the tasks submitted by the user. The server is the main body of running tasks, and the process of computing power scheduling is to deploy tasks to the server; server types include core computing resources and edge computing resources, including physical machines, virtual machines, edge computing nodes, etc. The task queue is responsible for maintaining a dynamic task list composed of online submitted user tasks and saving all tasks waiting for scheduling. In the process of computing power scheduling, tasks that can be executed under the current server status need to be selected from the task queue for computing power scheduling. . The task historical running log database is responsible for collecting and saving historical data of tasks, including resource usage, execution time, task type, etc.; in some embodiments, historical data includes task start time, task end time, task GPU utilization, task Average memory usage, number of task CPU cores, and task type (i.e., the application type of the task load). The task model is responsible for guiding the adaptive resource scaling and adjustment process of tasks during the scheduling process.

所述算力调度系统的运行流程如下：调度器根据用户动态提交的任务，以及任务类型与多维度资源需求等具体任务属性，更新动态任务队列。对于每轮次的当前调度：在任务层次下，调度器结合任务模型，对任务队列中当前用户任务的资源需求进行自适应的伸缩调整，将更新之后的任务需求资源量返回到调度器；集群层次下，调度器结合任务需求伸缩结果和集群的多维度资源使用状态，将任务部署到可以满足资源需求的服务器节点中，实现对集群作业的资源分配情况进行动态优化。最后，调度系统更新任务层次下的执行状态和集群层次下的资源状态，并准备进行下一轮次的调度。The operation process of the computing power scheduling system is as follows: the scheduler updates the dynamic task queue according to the tasks dynamically submitted by the user, as well as specific task attributes such as task type and multi-dimensional resource requirements. For the current scheduling of each round: at the task level, the scheduler combines the task model to adaptively adjust the resource requirements of the current user tasks in the task queue, and returns the updated task demand resource amount to the scheduler; cluster At the lower level, the scheduler combines the task demand scaling results and the multi-dimensional resource usage status of the cluster to deploy tasks to server nodes that can meet the resource needs to achieve dynamic optimization of resource allocation for cluster jobs. Finally, the scheduling system updates the execution status at the task level and the resource status at the cluster level, and prepares for the next round of scheduling.

根据本发明的一个实施例，提出了一种任务模型的训练方法，基于任务历史运行日志数据库中的历史数据，采用机器学习回归分析方法完成任务模型的构建。具体而言，建模过程基于极致梯度提升(extreme gradient boosting,XGBoost)的机器学习方法，分析不同类型任务多维度资源需求，即多维度资源对任务执行时间的影响。XGBoost是基于提升(即Boosting)方法来训练一组决策树，并利用梯度提升方法对决策树的残差进行迭代拟合。XGBoost模型得到的输出结果是所有残差迭代决策树预测值的加权和。基于XGBoost的任务资源需求预测模型目标函数由衡量模型拟合度的损失函数和增加模型复杂度的正则化项，两部分组成。任务资源需求预测模型的模型结构确立过程，需要最小化目标函数值。具体方法是通过最大化分支增益，来选择叶子节点的特征值切分点，为模型中各个特征依次构建树模型的新分支，最终完成建模任务资源需求XGBoost模型。每次选择分支增益最大的特征作为叶子节点的切分特征，依次选择资源特征完成任务资源需求预测模型构建，这样得到的模型总目标函数最小，完成数据拟合预测的任务资源需求预测模型建立。优选的，所述训练方法包括：以任务执行时间、任务GPU利用率、任务平均内存占用、任务CPU核数、任务类型作为训练样本；其中，任务GPU利用率、任务平均内存占用、任务CPU核数、任务类型作为任务模型的输入，任务执行时间(即任务执行的起始时间至执行结束时间)为标签作为期望输出值。训练样本中特征(即输入)为任务GPU利用率、任务平均内存占用、任务CPU核数、任务类型。训练样本中标签(即输出)为任务执行时间；所述任务执行时间即任务执行的起始时间至执行结束时间。因原始跟踪日志记录的内容是任务起始和结束时间，所以需要通过数据预处理计算得到训练样本中的任务执行时间。训练样本中涉及的任务类型可以不同，但为实现更准确的算力调度，在部分实施例中，在训练一种任务模型时，其训练样本中的任务类型是相同的，即针对不同的任务类型训练相应的任务模型。所述任务类型可以包括计算机视觉任务、自然语言处理任务、强化学习任务、图神经网络任务、推荐任务等各种常见各领域机器学习任务类型。根据训练样本中的数据对所述任务模型进行训练；在部分实施例中，基于梯度提升树回归方法进行建模，即XGBoost库中提供的回归方法，当前任务类型对应的任务回归模型利用任务资源使用情况回归任务执行时间，即输入是资源情况和任务类型，输出是任务预测的执行时间。在部分实施例中，训练过程中，回归树深度为10，Boosting集成学习中采用的基学习器数量为100，学习率为0.1。将训练后的所述任务模型作为算力调度方法中任务执行时间的预测模型。According to an embodiment of the present invention, a task model training method is proposed. Based on the historical data in the task historical operation log database, the machine learning regression analysis method is used to complete the construction of the task model. Specifically, the modeling process is based on the machine learning method of extreme gradient boosting (XGBoost) to analyze the multi-dimensional resource requirements of different types of tasks, that is, the impact of multi-dimensional resources on task execution time. XGBoost is based on the boosting (Boosting) method to train a set of decision trees, and uses the gradient boosting method to iteratively fit the residuals of the decision trees. The output result obtained by the XGBoost model is the weighted sum of the predicted values of all residual iterative decision trees. The objective function of the task resource demand prediction model based on XGBoost consists of two parts: a loss function that measures the model's fitting degree and a regularization term that increases the complexity of the model. The process of establishing the model structure of the task resource demand prediction model requires minimizing the objective function value. The specific method is to select the feature value segmentation points of leaf nodes by maximizing the branch gain, and sequentially build new branches of the tree model for each feature in the model, and finally complete the XGBoost model of modeling task resource requirements. Each time, the feature with the largest branch gain is selected as the segmentation feature of the leaf node, and the resource features are selected in turn to complete the construction of the task resource demand prediction model. In this way, the total objective function of the model is the smallest, and the task resource demand prediction model for data fitting prediction is completed. Preferably, the training method includes: taking task execution time, task GPU utilization, task average memory occupancy, task CPU core number, and task type as training samples; wherein, task GPU utilization, task average memory occupancy, task CPU core The number and task type are used as the input of the task model, and the task execution time (that is, the start time of task execution to the execution end time) is the label as the expected output value. The features (i.e. input) in the training sample are task GPU utilization, task average memory usage, task CPU core number, and task type. The label (that is, the output) in the training sample is the task execution time; the task execution time is the start time of task execution to the execution end time. Since the content of the original tracking log record is the task start and end time, the task execution time in the training sample needs to be calculated through data preprocessing. The task types involved in the training samples can be different, but in order to achieve more accurate computing power scheduling, in some embodiments, when training a task model, the task types in the training samples are the same, that is, for different tasks. Type trains the corresponding task model. The task types may include computer vision tasks, natural language processing tasks, reinforcement learning tasks, graph neural network tasks, recommendation tasks and other common machine learning task types in various fields. The task model is trained according to the data in the training sample; in some embodiments, modeling is performed based on the gradient boosting tree regression method, that is, the regression method provided in the XGBoost library, and the task regression model corresponding to the current task type uses task resources Usage regression task execution time, that is, the input is resource status and task type, and the output is the predicted execution time of the task. In some embodiments, during the training process, the depth of the regression tree is 10, the number of base learners used in Boosting ensemble learning is 100, and the learning rate is 0.1. The trained task model is used as a prediction model for task execution time in the computing power scheduling method.

本发明实施例基于机器学习方法构建了不同任务类型的任务模型，用于指导在线调度过程中的动态资源伸缩，可以解决资源调度分配过程中任务的资源超额申请问题，提高集群资源的利用率和任务部署完成情况。The embodiment of the present invention constructs task models of different task types based on machine learning methods, which are used to guide dynamic resource scaling in the online scheduling process, which can solve the problem of resource overapplication for tasks in the resource scheduling and allocation process, and improve the utilization rate and efficiency of cluster resources. Task deployment completion status.

本发明的算力调度系统基于任务性能建模，完成在线任务资源需求配给的伸缩调整。所述系统从用户任务和集群资源两个层面入手，实现对AI算力的资源调度进行优化。具体包括：在任务层次下，基于任务预期性能和任务资源性能模型，对任务需求进行自适应伸缩调整，提升任务部署执行效率；以及集群层次下，基于多个维度资源利用情况使用状态，对资源分配调度进行优化，对完成资源伸缩后ML任务的调度策略进行设计，完成了AI算力调度中资源利用率的提升。The computing power scheduling system of the present invention is based on task performance modeling to complete the scaling adjustment of online task resource demand allocation. The system starts from two levels: user tasks and cluster resources to optimize the resource scheduling of AI computing power. Specifically, it includes: at the task level, based on the expected task performance and task resource performance model, adaptive scaling adjustment of task requirements to improve task deployment execution efficiency; and at the cluster level, based on the multi-dimensional resource utilization status, resource utilization status Allocation and scheduling were optimized, and the scheduling strategy for ML tasks after resource scaling was completed, completing the improvement of resource utilization in AI computing power scheduling.

图2为根据本发明实施例的一种基于任务模型的算力调度方法的流程示意图。所述方法包括：收到用户提交的任务，所述任务包括任务类型以及为所述任务的执行所申请的多个维度资源的数量(即第一资源使用量)。资源使用量包括CPU核数、GPU利用率占用百分比、平均内存占用、任务负载类型。在部分实施例中，资源使用量还可以包括申请占用网络资源的容量。根据当前调度轮次中待执行的用户任务，更新调度系统的任务队列，并周期性地通知调度器执行调度。其中，该周期可以根据负载的高低进行合理的调整；优选的，该周期可以设置为5分钟。调度轮次的周期可以根据系统运行需求所设定，调度队列则通过一个列表维护当前用户提交的等待执行的任务及其具体资源需求信息。调度队列根据调度策略排序当前等待调度的任务，其中调度策略包括先提交的任务优先调度(先到先服务)、最短时间剩余优先和主资源公平等，并且支持进一步修改和自定义等待调度任务的排序；其中，最短时间剩余优先指每次有新任务到达调整调度队列，优先放置最短时间内将结束的任务；主资源公平是指由于申请多个维度的资源，需要保证任务调度中的资源之间公平性，调度队列中优先放置任务申请的资源占比总资源池比例最小的资源，达到任务之间资源分配公平性的效果。调度器请求调度队列中当前任务。调度队列根据当前指定的调度策略下的任务排序结果，将首位任务由初始的等待状态置为调度状态，然后执行当前任务的调度过程。Figure 2 is a schematic flowchart of a computing power scheduling method based on a task model according to an embodiment of the present invention. The method includes: receiving a task submitted by a user, where the task includes a task type and a quantity of multi-dimensional resources applied for execution of the task (ie, first resource usage). Resource usage includes the number of CPU cores, GPU utilization percentage, average memory usage, and task load type. In some embodiments, the resource usage may also include the capacity requested to occupy network resources. According to the user tasks to be executed in the current scheduling round, the task queue of the scheduling system is updated, and the scheduler is periodically notified to perform scheduling. The period can be reasonably adjusted according to the load level; preferably, the period can be set to 5 minutes. The period of the scheduling round can be set according to the system operation requirements, and the scheduling queue maintains the tasks submitted by the current user for execution and their specific resource demand information through a list. The scheduling queue sorts the tasks currently waiting for scheduling according to the scheduling policy. The scheduling policy includes priority scheduling of tasks submitted first (first come, first served), shortest remaining time priority and main resource fairness, etc., and supports further modification and customization of tasks waiting to be scheduled. Sorting; among them, the shortest time remaining priority means that every time a new task arrives in the scheduling queue, the task that will end in the shortest time is given priority; the main resource fairness means that due to the application of resources in multiple dimensions, it is necessary to ensure that the resources in the task scheduling are balanced. Inter-task fairness, the resource requested by the task with the smallest proportion in the total resource pool is prioritized in the scheduling queue to achieve the effect of fairness in resource allocation between tasks. The scheduler requests the current task in the scheduling queue. The scheduling queue sets the first task from the initial waiting state to the scheduling state according to the task sorting result under the currently specified scheduling policy, and then executes the scheduling process of the current task.

对调度状态下的当前任务的资源申请，执行基于任务模型预测的自适应资源伸缩过程，调整任务的资源申请需求量。图3为多维度资源伸缩调度步骤流程图。首先，根据当前任务的任务类型确定其对应的已经训练好的任务模型，并根据所述任务模型预测不同多维度资源申请量下当前任务的执行时间。所述任务模型的构建输入包括历史日志数据中不同维度资源的使用量，以及与之对应的任务执行时间，基于机器学习方法得到回归模型，可以预测资源申请量下的任务预期执行时间。根据本发明的一个实施例，采用每个任务类型训练一个模型，这样得到的模型准确度更高。结合现有的模型特征重要性分析方法，得到当前任务模型中不同维度资源对最终任务执行时间的影响的程度。结合当前任务的原始资源申请结果，将对任务执行时间影响程度小的资源维度，执行设定资源伸缩率下的资源伸缩调整，得到分配的资源使用量。所述资源伸缩率是本发明的参数之一，表示对资源使用量调整的百分比幅度，通过选择合适的参数作为资源伸缩率，才能够保证实际任务执行效率的前提下，减少资源浪费。由于不同任务对资源的敏感程度不同，所以本发明自适应地查找，符合当前任务性能表现需求的资源伸缩率。其中，资源伸缩率的查找方法是结合任务模型，在不同资源伸缩率下进行二分搜索，保证任务性能需求的同时，最大化节约任务申请的资源。具体地，根据任务模型中各维度资源特征对任务执行时间的影响程度，对多维度的资源特征重要性进行排序。其中，影响程度的确定是针对每个维度资源随机洗牌其特征值，计算此时任务模型预测准确度与未随机洗牌时相比的下降值，下降值越大其所对应的维度资源对任务执行时间的影响程度越大，其重要性越高。对其中特征重要性低(对任务性能影响小)的资源伸缩调整其配额，即对其在不影响任务性能或在用户可接受的范围内的条件下，进行自适应的动态伸缩调整。举例来说，当前任务A的任务资源性能模型指出，CPU和GPU为任务执行时间影响的重要特征，内存对任务性能影响较小，那么资源伸缩阶段，对用户申请的内存进行资源伸缩，伸缩比例是通过结合任务模型，在不同资源伸缩率下进行二分搜索来确定的，即为符合当前任务性能表现需求的资源伸缩率。通过当前的任务模型，再次预测资源伸缩调整之后的资源需求对应的任务执行时间；通过比对资源伸缩调整前后的任务执行时间，判断资源伸缩调整过程，是否对任务性能产生超出性能调整阈值的显著影响，如果影响较大，该调度方法则尝试其他资源伸缩率下资源伸缩调整；最终使得影响程度在阈值范围内，该调度方法则接收当前的资源申请调整结果(也就是将分配的资源使用量作为第二资源使用量)，应用于调度过程中；如果所有设定的资源伸缩率都会对任务执行时间产生较大影响，该调度方法则放弃对该任务的资源伸缩调整，将原始资源申请结果直接应用于调度过程中。在对内存进行资源伸缩的过程中，利用任务A的任务模型进行伸缩前后的任务预期执行时间(所述第一预期执行时间与所述第二预期执行时间的执行时间差异)，如果满足伸缩前后执行时间差异小于设定的差异阈值，则执行内存伸缩，如若不满足，则调整伸缩的大小，直到满足条件。如果当前资源伸缩后，影响了任务性能或执行时间差异不在可接受范围内的条件，则直接返回当前任务的原始资源申请(即用户提交任务时为该任务申请的资源，也就是将第一资源使用量作为第二资源使用量)，不对其作出资源伸缩的调整，以免影响系统的服务质量。For the resource application of the current task in the scheduling state, the adaptive resource scaling process based on the task model prediction is executed to adjust the resource application demand of the task. Figure 3 is a flow chart of multi-dimensional resource scaling and scheduling steps. First, determine the corresponding trained task model according to the task type of the current task, and predict the execution time of the current task under different multi-dimensional resource application amounts based on the task model. The construction input of the task model includes the usage of different dimensions of resources in historical log data and the corresponding task execution time. A regression model is obtained based on the machine learning method, which can predict the expected task execution time under the resource application amount. According to an embodiment of the present invention, one model is trained for each task type, so that the accuracy of the model obtained is higher. Combined with existing model feature importance analysis methods, the degree of impact of different dimensional resources in the current task model on the final task execution time is obtained. Combined with the original resource application results of the current task, the resource dimensions that have a small impact on the task execution time will be implemented to perform resource scaling adjustments under the set resource scaling rate to obtain the allocated resource usage. The resource scaling rate is one of the parameters of the present invention, which represents the percentage range of resource usage adjustment. By selecting appropriate parameters as the resource scaling rate, resource waste can be reduced while ensuring actual task execution efficiency. Since different tasks have different sensitivity to resources, the present invention adaptively searches for a resource scaling rate that meets the performance requirements of the current task. Among them, the resource scaling rate search method is to combine the task model and conduct a binary search under different resource scaling rates to ensure the task performance requirements while maximizing the saving of resources requested by the task. Specifically, the importance of multi-dimensional resource features is sorted according to the degree of impact of each dimensional resource feature in the task model on task execution time. Among them, the degree of influence is determined by randomly shuffling the characteristic value of each dimension resource, and calculating the drop value of the task model prediction accuracy at this time compared with the time when the task model is not randomly shuffled. The greater the drop value, the corresponding dimension resource pair. The greater the impact on task execution time, the higher its importance. The quotas of resources with low feature importance (little impact on task performance) are scaled and adjusted, that is, adaptive dynamic scaling is performed on them without affecting task performance or within a range acceptable to users. For example, the task resource performance model of the current task A points out that CPU and GPU are important features that affect task execution time, and memory has a small impact on task performance. In the resource scaling phase, the memory applied by the user is scaled. The scaling ratio is It is determined by combining the task model and performing a binary search under different resource scaling rates, which is the resource scaling rate that meets the current task performance requirements. Use the current task model to predict the task execution time corresponding to the resource demand after resource scaling adjustment again; by comparing the task execution time before and after resource scaling adjustment, determine whether the resource scaling adjustment process has significant impact on task performance that exceeds the performance adjustment threshold. If the impact is large, the scheduling method will try to adjust the resource scaling at other resource scaling rates; eventually, if the impact is within the threshold range, the scheduling method will receive the current resource application adjustment result (that is, the allocated resource usage As the second resource usage), it is applied in the scheduling process; if all set resource scaling rates will have a greater impact on the task execution time, this scheduling method will give up the resource scaling adjustment for the task and replace the original resource application results with Directly applied to the scheduling process. In the process of resource scaling of the memory, the task model of task A is used to perform the expected execution time of the task before and after scaling (the difference in execution time between the first expected execution time and the second expected execution time). If the conditions before and after scaling are satisfied If the execution time difference is less than the set difference threshold, memory scaling is performed. If not, the scaling size is adjusted until the conditions are met. If the current resource scaling affects task performance or the execution time difference is not within the acceptable range, the original resource application of the current task will be directly returned (that is, the resource applied for the task when the user submitted the task, that is, the first resource will be usage as the second resource usage), no resource scaling adjustments will be made to avoid affecting the service quality of the system.

在执行资源伸缩之后，将当前调度状态任务最终的资源申请量提交到调度器中，等待调度器分配对应的可用资源。调度器根据当前AI算力集群中服务器的资源可用状态，包括当前多个维度资源的当前使用情况和服务器之间的负载平衡情况，分配对应服务器资源。如果当前服务器状态不能满足当前任务的执行，则调度失败，将当前任务重新提交到任务队列中，等待下一轮次的调度。即将调度失败的用户任务自动重新提交，直到能够进行调度。After resource scaling is performed, the final resource request amount of the current scheduling status task is submitted to the scheduler and waits for the scheduler to allocate corresponding available resources. The scheduler allocates corresponding server resources based on the resource availability status of the servers in the current AI computing cluster, including the current usage of multi-dimensional resources and the load balancing situation between servers. If the current server status cannot satisfy the execution of the current task, the scheduling fails and the current task is resubmitted to the task queue to wait for the next round of scheduling. User tasks that are about to fail to be scheduled are automatically resubmitted until they can be scheduled.

执行任务部署，将任务调度到对应的服务器进行执行，并更新当前执行任务的服务器资源使用状态，和当前任务的完成状态，并等待下一轮次调度执行新一轮在线调度。基于任务预测的自适应资源伸缩调度方法。Execute task deployment, schedule the task to the corresponding server for execution, update the resource usage status of the server currently executing the task, and the completion status of the current task, and wait for the next round of scheduling to perform a new round of online scheduling. Adaptive resource scaling scheduling method based on task prediction.

本发明实施例的算力调度方法指导AI算力场景下的资源调度，综合考虑了多个维度资源的调度分配，用来提升用户任务的任务部署执行效率。在用户任务层面中，通过分析建模多维度资源因素对任务影响，实现了自适应动态的资源伸缩调整，优化任务部署和完成的情况，提升任务效率。由于实际任务执行效率是由多维度资源所组成的整体所决定，所以本发明考虑了不同类型任务的资源需求特性和面向多维度资源的集群利用，能够解决任务层面分配的多个维度资源间的不匹配问题，也能够提升集群层面任务的部署效率和多维度资源的利用率。The computing power scheduling method of the embodiment of the present invention guides resource scheduling in AI computing power scenarios, comprehensively considers the scheduling and allocation of resources in multiple dimensions, and is used to improve the task deployment and execution efficiency of user tasks. At the user task level, by analyzing and modeling the impact of multi-dimensional resource factors on tasks, adaptive and dynamic resource scaling adjustments are realized, optimizing task deployment and completion, and improving task efficiency. Since the actual task execution efficiency is determined by the entirety of multi-dimensional resources, the present invention considers the resource demand characteristics of different types of tasks and the cluster utilization of multi-dimensional resources, and can solve the problem of multi-dimensional resources allocated at the task level. Mismatching problems can also improve the deployment efficiency of cluster-level tasks and the utilization of multi-dimensional resources.

图4是搭载该自适应任务资源伸缩调度系统的硬件设备，包括上述调度步骤的功能实现和相应的架构模块构成实现。调度架构模块可通过对应步骤所需的多个硬件或者处理器实现，并通过存储在计算机可读介质内的调度系统流程的代码执行实现调度过程。Figure 4 is a hardware device equipped with the adaptive task resource scaling and scheduling system, including the functional implementation of the above scheduling steps and the corresponding architecture module implementation. The scheduling architecture module can be implemented through multiple hardware or processors required for corresponding steps, and the scheduling process is implemented through code execution of the scheduling system process stored in a computer-readable medium.

该设备包括中央处理单元的处理器、内部总线、网络接口和计算机可读存储介质。处理器和计算机可读通过总线相互通信。可读介质内存储用于实现本发明中的自适应任务资源伸缩调度调度系统的全部功能或者部分功能的程序代码。当程序被处理器执行时，可以实现自适应任务资源伸缩调度功能。The device includes a processor of a central processing unit, an internal bus, a network interface, and a computer-readable storage medium. The processor and computer communicate with each other via the bus. The readable medium stores program codes for implementing all or part of the functions of the adaptive task resource expansion and contraction scheduling system in the present invention. When the program is executed by the processor, the adaptive task resource scaling and scheduling function can be implemented.

本发明提出了基于机器学习任务预测方法的自适应任务资源伸缩调度算法。该方法结合多维度资源下的任务性能模型，优化了用户提交任务时的资源超额申请以及多维度资源之间不匹配问题。通过机器学习方法，本发明构建了不同资源维度对任务最终性能影响的重要性，对超额申请部分进行自适应的伸缩优化。该资源调度方法通过优化用户任务的资源需求，使得算力集群各个维度的资源都能够得到高效的运用，尤其是能够充分利用稀缺昂贵的GPU资源，有效提升集群的整体利用效率。The present invention proposes an adaptive task resource expansion and contraction scheduling algorithm based on a machine learning task prediction method. This method combines the task performance model under multi-dimensional resources to optimize the resource over-application and mismatch between multi-dimensional resources when users submit tasks. Through machine learning methods, the present invention constructs the importance of the impact of different resource dimensions on the final performance of the task, and performs adaptive scaling optimization of the excess application part. This resource scheduling method optimizes the resource requirements of user tasks so that resources in all dimensions of the computing cluster can be efficiently used. In particular, it can make full use of scarce and expensive GPU resources, effectively improving the overall utilization efficiency of the cluster.

本说明书的各个实施例均采用递进的方式描述，每个实施例重点说明的都是与其他实施例或实施方式的不同之处，在本发明的各个实施例之间相同或相似的部分相互参见即可，围绕发明构思的实现原理以及产生的技术效果可相互参照，此处不再赘述。在不冲突的情况下，本发明中的各实施例或实施方式可相互组合。Each embodiment of this specification is described in a progressive manner. Each embodiment focuses on the differences from other embodiments or implementation modes. The same or similar parts among the various embodiments of the present invention are mutually exclusive. Please refer to the above, the implementation principles surrounding the inventive concept and the technical effects produced can be referred to each other, and will not be described again here. Various embodiments or implementations in the present invention may be combined with each other without conflict.

需要说明的是，虽然上文按照特定顺序描述了各个步骤，但是并不意味着必须按照上述特定顺序来执行各个步骤，实际上，这些步骤中的一些可以并发执行，甚至改变顺序，只要能够实现所需要的功能即可。It should be noted that although the steps are described in a specific order above, it does not mean that the steps must be performed in the above-mentioned specific order. In fact, some of these steps can be performed concurrently or even change the order, as long as it can be achieved Just the functions you need.

本发明可以是系统、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质，其上载有用于使处理器实现本发明的各个方面的计算机可读程序指令。The invention may be a system, method and/or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to implement various aspects of the invention.

计算机可读存储介质可以是保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以包括但不限于电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括：便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。Computer-readable storage media may be tangible devices that retain and store instructions for use by an instruction execution device. Computer-readable storage media may include, for example, but are not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the above. More specific examples (non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or Flash memory), Static Random Access Memory (SRAM), Compact Disk Read Only Memory (CD-ROM), Digital Versatile Disk (DVD), Memory Stick, Floppy Disk, Mechanical Coding Device, such as a printer with instructions stored on it. Protruding structures in hole cards or grooves, and any suitable combination of the above.

以上已经描述了本发明的各实施例，上述说明是示例性的，并非穷尽性的，并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下，对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本发明实施例中所用术语的选择，旨在最好地解释各实施例的原理、实际应用或对市场中的技术改进，或者使本技术领域的其它普通技术人员能理解本发明实施例披露的各实施例。The embodiments of the present invention have been described above. The above description is illustrative, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The selection of terms used in the embodiments of the present invention is intended to best explain the principles, practical applications or technical improvements in the market of each embodiment, or to enable other persons of ordinary skill in the art to understand the disclosure of the embodiments of the present invention. Various Examples.

Claims

1. A method of training a task model for computational power scheduling, comprising:

taking task execution time, task GPU utilization rate, task average memory occupation, task CPU core number and task type as training samples; the task GPU utilization rate, the task average memory occupation, the task CPU core number and the task type are used as inputs of a task model, and the task execution time is a label and is used as an expected output value;

training the task model based on a regression tree with gradient lifting according to data in the training sample;

and taking the trained task model as a prediction model of task execution time in the computational power scheduling method.

2. The method of claim 1, wherein the task types in the same training sample are the same.

3. The method of claim 2, wherein the task types include computer vision tasks, natural language processing tasks, reinforcement learning tasks, graphic neural network tasks, recommendation tasks.

4. The method of claim 1, wherein the regression tree has a depth of 10, the number of base learners employed is 100, and the learning rate is 0.1.

5. A method of power dispatch, comprising:

acquiring a current task to be scheduled, wherein the task comprises a task type and a first resource usage amount applied for executing the task;

the task model obtained by any one of the methods according to claims 1 to 4, under the condition that the task performance is not affected or within an acceptable range, performing dynamic expansion adjustment on an application quota of a resource with small influence on the task performance in the first resource usage amount, so as to obtain a resource usage amount allocated for execution of the task;

and performing computational scheduling on the task according to the allocated resource usage.

6. The method according to claim 5, wherein the task model obtained according to any one of claims 1 to 4 dynamically scaling the resource application quota having a small effect on the task performance in the first resource usage without affecting the task performance or within an acceptable range, to obtain the resource usage allocated for the execution of the task, includes:

outputting a first expected execution time of the task according to the task type and the first resource usage amount by using the task model obtained according to any one of the methods of claims 1 to 4;

the resource application quota with small influence on the task performance in the resource dimension is subjected to telescopic adjustment under the condition that the task performance is not influenced or within an acceptable range, so that a second resource usage amount is obtained;

outputting a second expected execution time of the task according to the second resource usage amount by using the task model obtained by the method of any one of claims 1 to 4;

and when the difference between the first expected execution time and the second expected execution time is below a preset difference threshold, taking the second resource usage as the allocated resource usage.

7. The method as recited in claim 6, further comprising:

if the difference between the first expected execution time and the second expected execution time exceeds a set difference threshold, resource application quota with small influence on task performance in the resource dimension by using another resource expansion rate is utilized to carry out resource expansion adjustment.

8. The method as recited in claim 6, further comprising:

and when the difference between the execution time of the first expected execution time and the execution time of the second expected execution time exceeds the difference threshold after the resource expansion and contraction adjustment is carried out on the resource dimension by utilizing all the resource expansion and contraction rates, taking the first resource usage amount as the allocated resource usage amount.

9. The method of claim 6, wherein the performing, with the task model, resource scaling adjustments at a set resource scaling rate for the resource dimension to obtain a second resource usage amount comprises:

screening out the resource dimension with the influence degree on the execution time of the task smaller than a set influence degree threshold from the resource dimension, and taking the resource dimension as the resource dimension to be stretched;

and executing resource expansion adjustment under the set resource expansion rate aiming at the dimension of the resource to be expanded, so as to obtain a second resource usage amount.

10. A computing power scheduling system, comprising: a server, a task queue, a scheduler, further comprising a task history log database and a task model according to any one of claims 1 to 4;

the task history running log database is used for collecting and storing the history data of tasks, including resource use conditions, execution time and task types;

the task model is used for dynamically adjusting the application quota of the resource with small influence on the task performance in the task to be scheduled, so as to obtain the resource usage amount allocated for the execution of the task.

11. A storage medium having stored therein computer executable instructions which when loaded and executed by a processor perform the steps of the method according to any of claims 1 to 9.