WO2022088659A1 - Resource scheduling method and apparatus, electronic device, storage medium, and program product - Google Patents


Info

Publication number
WO2022088659A1
WO2022088659A1 · PCT/CN2021/095292 · CN2021095292W
Authority
WO
WIPO (PCT)
Prior art keywords
gpus
gpu
resource scheduling
virtual
screening
Application number
PCT/CN2021/095292
Other languages
French (fr)
Chinese (zh)
Inventor
霍明明
张炜
陈界
朴元奎
陈宇恒
Original Assignee
北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Application filed by 北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority to KR1020217037982A, published as KR20220058844A
Publication of WO2022088659A1


Classifications

    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU], to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5077: Logical partitioning of resources; management or configuration of virtualized resources
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

A resource scheduling method and apparatus, an electronic device, a storage medium, and a program product. The method comprises: receiving a resource scheduling request for a graphics processing unit (GPU) in a GPU cluster (S201), the resource scheduling request comprising grouping information of a GPU to be requested, and the grouping information of the GPU to be requested being determined according to a task type of a task processing request corresponding to the resource scheduling request; according to the grouping information of the GPU to be requested, matching, in all GPUs of the GPU cluster, a GPU having the grouping information of the GPU to be requested, so as to obtain a matching result (S202), the matching result comprising at least one target GPU corresponding to the grouping information of the GPU to be requested; and returning the matching result (S203).

Description

Resource scheduling method and apparatus, electronic device, storage medium and program product
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese patent application No. 202011158231.7, filed on October 26, 2020 and entitled "Resource Scheduling Method and Device, Electronic Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the technical field of artificial intelligence, and in particular to a resource scheduling method and apparatus, an electronic device, a storage medium, and a program product.
BACKGROUND
Artificial intelligence (AI) is currently a mainstream field that aims to make machines more intelligent, so that they can handle complex work that would otherwise require human intelligence, thereby facilitating human life and production. For example, a smartphone no longer needs a manually entered passcode; the screen can be unlocked simply by face recognition. An important way to make machines more intelligent is machine learning. At present, machine learning can be divided into two categories: one makes computers simulate human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance; the other extracts hidden, valid, and understandable knowledge from large amounts of data.
The second category of machine learning above requires data, algorithms, and computing power; computing power in turn requires the support of computer hardware resources such as graphics processing units (GPUs), so that it can better bring the algorithms and data into play. A large-scale cluster often comprises multiple physical machines, each of which includes multiple GPUs. When a scheduling apparatus receives a resource scheduling request, it schedules resources among the GPUs of all these physical machines; however, current scheduling approaches are all random, making it impossible to precisely control resource usage.
SUMMARY
Embodiments of the present application provide a resource scheduling method and apparatus, an electronic device, a storage medium, and a program product, so as to precisely control resource usage and improve resource scheduling efficiency and resource utilization.
In a first aspect, an embodiment of the present application provides a resource scheduling method, including: receiving a resource scheduling request for a GPU in a graphics processing unit (GPU) cluster, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request; matching, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, to obtain a matching result, where the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested; and returning the matching result.
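The core matching flow of the first aspect can be sketched as follows. This is a minimal illustration: the `Gpu` record, its field names, and the group label strings are assumptions made for the example, not structures defined in the application.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Gpu:
    node: str    # physical machine hosting this card (illustrative field)
    index: int   # card index on that machine (illustrative field)
    group: str   # grouping information, set in advance from the task type

def match_gpus(cluster: List[Gpu], requested_group: str) -> List[Gpu]:
    """Match, among all GPUs of the cluster, the GPUs whose grouping
    information equals that carried by the scheduling request."""
    return [gpu for gpu in cluster if gpu.group == requested_group]

# A 9-card machine grouped as in FIG. 3: cards 0-3 for model training,
# cards 4-8 for online prediction.
cluster = [Gpu("node-0", i, "model-training") for i in range(4)] \
        + [Gpu("node-0", i, "online-prediction") for i in range(4, 9)]
matching_result = match_gpus(cluster, "model-training")
```

Here the matching result contains the four target GPUs whose grouping information is the model training grouping information.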
In an optional implementation, each GPU includes at least one virtual GPU (vGPU), and the resource scheduling request further includes computing parameters and a quantity of vGPUs. After matching, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, the method further includes: screening, according to the computing parameters and quantity of vGPUs, the matching result for vGPUs that satisfy the computing parameters and quantity; and returning the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, screening the matching result, according to the computing parameters and quantity of vGPUs, for vGPUs that satisfy the resource scheduling request includes: screening the matching result for vGPUs that satisfy the computing parameters, to obtain a first screening result; and screening the first screening result for vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. Screening the matching result for vGPUs that satisfy the computing parameters to obtain the first screening result includes: obtaining the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is greater than the priority of the video memory, screening each target GPU for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain a second screening result; and screening the second screening result for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. Screening the matching result for vGPUs that satisfy the computing parameters to obtain the first screening result includes: obtaining the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is less than the priority of the video memory, screening each target GPU for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain a third screening result; and screening the third screening result for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
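The two symmetric screening orders above can be sketched as follows. The dictionary fields and the threshold semantics (a candidate satisfies a requirement when it offers at least the requested computing power or video memory) are assumptions made for the illustration, not definitions from the application.

```python
def first_screening(vgpus, req_power, req_memory, power_priority, memory_priority):
    """Screen candidate vGPUs in two passes, the pass order decided by
    whichever of computing power and video memory has the higher priority."""
    meets_power = lambda v: v["power"] >= req_power
    meets_memory = lambda v: v["memory"] >= req_memory
    if power_priority > memory_priority:
        second = [v for v in vgpus if meets_power(v)]   # second screening result
        return [v for v in second if meets_memory(v)]   # first screening result
    third = [v for v in vgpus if meets_memory(v)]       # third screening result
    return [v for v in third if meets_power(v)]         # first screening result

vgpus = [{"power": 50, "memory": 8},
         {"power": 30, "memory": 16},
         {"power": 60, "memory": 4}]
result = first_screening(vgpus, req_power=40, req_memory=8,
                         power_priority=2, memory_priority=1)
```

With simple threshold filters the two orders yield the same final set; the priority mainly determines which intermediate (second or third) screening result is produced.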
In an optional implementation, screening the first screening result for vGPU resources that satisfy the required quantity of vGPUs includes: if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, selecting, from the first screening result, vGPU resources in the required quantity in ascending order of the computing parameters; if the quantity of vGPUs in the first screening result is equal to the quantity of vGPU resources required by the resource scheduling request, returning the first screening result; and if the quantity of vGPUs in the first screening result is less than the quantity of vGPU resources required by the resource scheduling request, returning prompt information indicating that the screening result is empty.
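The three quantity cases above can be sketched as one selection function; returning `None` to stand in for the "screening result is empty" prompt is an assumption of this sketch, as are the field names.

```python
def select_by_quantity(first_screening, required, key="power"):
    """Apply the quantity rule to the first screening result:
    more candidates than required -> take them in ascending order of the
    computing parameter (leaving larger vGPUs free for later requests);
    exactly enough -> return them all;
    too few -> the screening result is empty (modelled here as None)."""
    if len(first_screening) > required:
        return sorted(first_screening, key=lambda v: v[key])[:required]
    if len(first_screening) == required:
        return first_screening
    return None

candidates = [{"id": "vgpu-a", "power": 60},
              {"id": "vgpu-b", "power": 30},
              {"id": "vgpu-c", "power": 50}]
picked = select_by_quantity(candidates, 2)  # smallest two by computing power
```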
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. The method further includes: matching, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type; and taking the vGPUs corresponding to the at least one tag as the matching result.
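The tag-based matching can be sketched as a lookup over pre-assigned tags; the tag scheme and vGPU identifiers below are illustrative assumptions, since the application does not fix a concrete representation.

```python
def match_by_task_type(vgpu_tags, task_type):
    """Return the vGPUs whose pre-assigned tags include a tag matching
    the task type of the task processing request."""
    return [vgpu for vgpu, tags in vgpu_tags.items() if task_type in tags]

# Tags were assigned in advance when the cluster was grouped.
vgpu_tags = {
    "node-0/gpu-0/vgpu-0": {"model-training"},
    "node-0/gpu-4/vgpu-0": {"online-prediction"},
    "node-0/gpu-4/vgpu-1": {"online-prediction"},
}
matching_result = match_by_task_type(vgpu_tags, "online-prediction")
```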
In a second aspect, an embodiment of the present application provides a resource scheduling apparatus, including: a receiving module, configured to receive a resource scheduling request for a GPU in a GPU cluster, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request; a first matching module, configured to match, according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested among all GPUs of the GPU cluster, to obtain a matching result, where the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested; and a first returning module, configured to return the matching result.
In an optional implementation, each GPU includes at least one vGPU, and the resource scheduling request further includes computing parameters and a quantity of vGPUs. The apparatus further includes: a screening module, configured to screen, according to the computing parameters and quantity of vGPUs, the matching result for vGPUs that satisfy the computing parameters and quantity; and a second returning module, configured to return the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, the screening module includes: a first screening unit, configured to screen the matching result for vGPUs that satisfy the computing parameters, to obtain a first screening result; and a second screening unit, configured to screen the first screening result for vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit is configured to: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is greater than the priority of the video memory, screen each target GPU for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain a second screening result; and screen the second screening result for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit is configured to: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is less than the priority of the video memory, screen each target GPU for vGPUs that satisfy the video memory requirement of the vGPUs of the resource scheduling request, to obtain a third screening result; and screen the third screening result for vGPUs that satisfy the computing power requirement of the vGPUs of the resource scheduling request, to obtain the first screening result.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, select, from the first screening result, vGPU resources in the required quantity in ascending order of the computing parameters.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is equal to the quantity of vGPU resources required by the resource scheduling request, return the first screening result.
In an optional implementation, the second screening unit is configured to, if the quantity of vGPUs in the first screening result is less than the quantity of vGPU resources required by the resource scheduling request, return prompt information indicating that the screening result is empty.
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. The apparatus further includes: a second matching module, configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the vGPUs corresponding to the at least one tag as the matching result.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory; a processor; and a computer program, where the computer program is stored in the memory and configured to be executed by the processor to implement the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program is executed by a processor to implement the method of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including computer-readable code, where, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method of the first aspect.
In the resource scheduling method and apparatus, electronic device, storage medium, and program product provided by the embodiments of the present application, a resource scheduling request for a GPU in a GPU cluster is received, where the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information is determined according to a task type of a task processing request corresponding to the resource scheduling request; then, according to the grouping information, GPUs having that grouping information are matched among all GPUs of the GPU cluster; and finally a matching result including at least one target GPU corresponding to the grouping information is returned. Since the resource scheduling request includes the grouping information of the GPU to be requested, and that grouping information is determined according to the task type of the corresponding task processing request, the corresponding GPU can be matched according to the grouping information during GPU resource scheduling, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of an application scenario provided by an embodiment of the present application;
FIG. 2 is a flowchart of a resource scheduling method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of grouping the GPUs of a physical machine provided by an embodiment of the present application;
FIG. 4A is a schematic diagram of a single online prediction task provided by an embodiment of the present application;
FIG. 4B is a schematic diagram of multiple online prediction tasks provided by an embodiment of the present application;
FIG. 5 is a flowchart of a resource scheduling method provided by another embodiment of the present application;
FIG. 6 is a schematic diagram of vGPUs in a physical machine provided by an embodiment of the present application;
FIG. 8 is a block diagram of an electronic device provided by an embodiment of the present application. FIG. 7 is a schematic structural diagram of a resource scheduling apparatus provided by an embodiment of the present application.
The above drawings show explicit embodiments of the present disclosure, which are described in more detail below. These drawings and written descriptions are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the present disclosure to those skilled in the art with reference to specific embodiments.
DETAILED DESCRIPTION
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
FIG. 1 is a diagram of an application scenario provided by an embodiment of the present application. As shown in FIG. 1, the application scenario includes a user terminal 11, an AI algorithm apparatus 12, a scheduling apparatus 13, and a GPU cluster 14. The user terminal 11 may include electronic devices such as a smartphone, a tablet computer, or a personal computer. The GPU cluster 14 is a computer cluster including multiple computer nodes, each of which is equipped with one or more GPUs.
In some optional scenarios, a user may submit a task processing request through the user terminal 11, for example a model training task or an online prediction task in an AI scenario. The task processing request submitted by the user is sent to the AI algorithm apparatus 12, which generates a resource scheduling request according to the task processing request and sends it to the scheduling apparatus 13. The scheduling apparatus 13 then performs resource scheduling in the GPU cluster 14 according to the resource scheduling request and returns the resource scheduling result to the AI algorithm apparatus 12; that is, it allocates the resources required by the task processing request among the GPUs of the GPU cluster 14, so that each GPU completes its assigned work and the task processing request submitted by the user is ultimately processed.
In the above resource scheduling process, the minimum scheduling unit of resources in the prior art is a physical machine. For example, assuming that the GPU cluster 14 includes four physical machines, the prior art can only schedule at the granularity of physical machines.
To address the above technical problem, the embodiments of the present application adopt the following technical solution: the minimum scheduling unit (a physical machine) of the GPU cluster 14 is divided at a finer granularity, and all GPUs in the GPU cluster 14 are tagged in advance according to the types of tasks that the GPU cluster 14 needs to process. In this way, when a task processing request from a user is subsequently received, GPUs with the corresponding tags can be screened according to the task type of the request, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
It should be noted that the AI algorithm apparatus 12 may be an independent apparatus or device, or may be a module or component integrated into the user terminal 11, which is not specifically limited in this embodiment.
The embodiments of the present application can be applied to any artificial intelligence scenario, for example the field of intelligent video analysis.
The technical solutions of the present application, and how they solve the above technical problems, are described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present application are described below with reference to the accompanying drawings.
FIG. 2 is a flowchart of a resource scheduling method provided by an embodiment of the present application. As shown in FIG. 2, the resource scheduling method includes the following steps S201 to S203:
Step S201: receive a resource scheduling request for a GPU in the graphics processing unit (GPU) cluster 14.
The execution body of this embodiment is the scheduling apparatus 13 shown in FIG. 1. The scheduling apparatus 13 receives a resource scheduling request from the AI algorithm apparatus 12; the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Task types may be divided according to the purpose of the task. For example, in an AI scenario, the task types include model training and online prediction; correspondingly, the grouping information of the GPU to be requested includes model training grouping information and online prediction grouping information.
For example, a user submits a task processing request whose task type is model training to the AI algorithm apparatus 12. The AI algorithm apparatus 12 generates a resource scheduling request according to the task processing request and, according to the task type corresponding to the task processing request, determines that the grouping information of the GPU to be requested is the model training grouping information.
在一种可选的实施方式中,待请求GPU的分组信息可以由AI算法装置12指定,若AI算法装置12未指定待请求GPU的分组信息,则默认为GPU集群14中的所有GPU都是可用的。In an optional implementation manner, the grouping information of the GPUs to be requested may be specified by the AI algorithm device 12. If the AI algorithm device 12 does not specify the grouping information of the GPUs to be requested, the default is that all GPUs in the GPU cluster 14 are usable.
Step S202: according to the grouping information of the GPUs to be requested, match, among all GPUs in the GPU cluster 14, the GPUs having that grouping information to obtain a matching result.

The matching result includes at least one target GPU corresponding to the grouping information of the GPUs to be requested.

The GPU cluster 14 includes multiple physical machines, and each physical machine includes multiple GPUs. In this embodiment, all GPUs in the GPU cluster 14 need to be grouped before step S201. During grouping, the GPUs may be grouped by purpose, and the purpose of a GPU may be determined according to the task types of the task processing requests that the GPU cluster 14 needs to execute. Taking one physical machine as an example, the GPU grouping process is described in detail below:
FIG. 3 is a schematic diagram of grouping the GPUs of one physical machine according to an embodiment of the present application. As shown in FIG. 3, the physical machine 31 is a 9-card machine (a physical machine including 9 GPU cards), numbered card 0 to card 8. Suppose the user plans to run model training and online prediction tasks on this physical machine at the same time, using cards 0 to 3 for model training and cards 4 to 8 for online prediction. The grouping information of cards 0 to 3 can then be set to the model training grouping information, and the grouping information of cards 4 to 8 to the online prediction grouping information. For example, the model training grouping information may be recorded as label A (Label-A), and the online prediction grouping information as label B (Label-B).

In an optional implementation, all GPUs in the GPU cluster 14 may be represented as a list, with each GPU corresponding to its grouping information. Taking a physical machine including 9 GPU cards as an example, the list of all its GPUs takes the form of Table 1 below:

Table 1: Grouping information of all GPUs in one physical machine
GPU card number    Grouping information
Card 0             Model training
Card 1             Model training
Card 2             Model training
Card 3             Model training
Card 4             Online prediction
Card 5             Online prediction
Card 6             Online prediction
Card 7             Online prediction
Card 8             Online prediction
As shown in Table 1, when a resource scheduling request is subsequently received: if the GPU grouping information carried in the request is the model training grouping information, the GPUs of cards 0 to 3 are matched; if it is the online prediction grouping information, the GPUs of cards 4 to 8 are matched.

Of course, GPUs in different physical machines may also be divided into one group. For example, the GPU cluster 14 includes physical machine 1, physical machine 2, and physical machine 3, where physical machine 1 includes GPU0, GPU1, and GPU2; physical machine 2 includes GPU3, GPU4, and GPU5; and physical machine 3 includes GPU6, GPU7, and GPU8. Then GPU1 and GPU2 in physical machine 1, GPU5 in physical machine 2, and GPU8 in physical machine 3 may be divided into the same group.

By grouping all GPUs in the GPU cluster 14, each group can be regarded as a resource pool, which achieves logical isolation between GPU resources.
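The grouping-based matching of steps S201 to S203 reduces to a label lookup over the cluster's GPU list. A minimal sketch follows; the names `Gpu` and `match_by_group` and the label strings are illustrative assumptions, not part of the original disclosure:

```python
from dataclasses import dataclass

@dataclass
class Gpu:
    card_id: int   # card number within its physical machine
    machine: str   # physical machine the card belongs to
    group: str     # grouping information, e.g. "model-training"

def match_by_group(cluster: list[Gpu], requested_group: str) -> list[Gpu]:
    """Return the GPUs whose grouping information matches the request.

    If the request carries no grouping information, all GPUs in the
    cluster are considered available by default (as described above).
    """
    if not requested_group:
        return list(cluster)
    return [gpu for gpu in cluster if gpu.group == requested_group]

# The 9-card physical machine of FIG. 3: cards 0-3 labeled for model
# training, cards 4-8 labeled for online prediction.
cluster = [Gpu(i, "machine-31",
               "model-training" if i <= 3 else "online-prediction")
           for i in range(9)]

matched = match_by_group(cluster, "model-training")
print([gpu.card_id for gpu in matched])  # cards 0 to 3
```

Because the match is a pure label comparison, groups may span physical machines, as in the GPU1/GPU2/GPU5/GPU8 example above, without changing the lookup.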
Step S203: return the matching result.

The matching result includes at least one target GPU corresponding to the grouping information of the GPUs to be requested.

In an optional implementation, the matching result may be expressed in the form of a list. After obtaining the matching result, the scheduling device 13 generates a GPU list according to the matching result and returns the GPU list to the AI algorithm device 12. In one example, assuming the matching result is cards 0 to 3, the GPU list may take the form of Table 2 below:

Table 2: GPU list
Card 0
Card 1
Card 2
Card 3
In this embodiment of the present application, a resource scheduling request for GPUs in the GPU cluster 14 is received, where the request includes grouping information of the GPUs to be requested, and that grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Then, according to the grouping information, GPUs having that grouping information are matched among all GPUs in the GPU cluster 14; finally, a matching result including at least one target GPU corresponding to the grouping information is returned. Because the resource scheduling request includes the grouping information of the GPUs to be requested, and that grouping information is determined by the task type, the corresponding GPUs can be matched according to the grouping information during GPU resource scheduling, thereby achieving finer-grained resource scheduling and precise control over GPU usage.

The present application can improve the controllability of resource scheduling for AI algorithm applications in vGPU mode. For example, consider one 8-card GPU machine, where cards 0 to 3 use vGPU mode for resource allocation and cards 4 to 7 use non-vGPU mode. In the prior art, the GPU selection is random, and it is impossible to ensure that vGPU-mode requests are scheduled onto cards 0 to 3. With the resource scheduling method of the embodiments of the present application, cards 0 to 3 are given a vGPU label; when resources are requested, the scheduling device 13 allocates resources among the GPUs carrying the vGPU label, so resource usage can be controlled precisely.

In addition, the resource scheduling method of the present application supports the isolation and classified use of GPU resources on a single GPU machine, maximizing resource utilization under different requirements. For example, a user with tight resources has only one 8-card GPU machine but wants to run model training tasks and online prediction tasks on it at the same time, with good isolation so that they do not affect each other. In this scenario, static assignment is usually used, but static assignment is time-consuming and labor-intensive. With the resource scheduling method of the present application, some GPU cards are labeled for model training and the others for online prediction; when the two types of tasks (model training tasks and online prediction tasks) subsequently request resources, the AI algorithm device 12 simply notifies the scheduling device 13 to use GPU cards with the corresponding label. The choice of which GPU card to use is made by the scheduling device 13 without user participation, which improves usability to a certain extent.
The above embodiments describe resource scheduling at the GPU level. In a single-task scheduling scenario, one task can be served by one GPU card; in a multi-task parallel scheduling scenario, however, more GPU cards are needed to satisfy the concurrency requirements. For example, a city restricts motor vehicle traffic, and many cameras are installed on a city road to monitor the vehicles driving on it. When a vehicle violating the restriction rules is detected, the camera photographs the vehicle, and a notification is then sent to the owner prompting payment of a fine. In this process, after the camera captures an image, the vehicles in the image need to be recognized and enclosed in rectangular boxes, and then the license plate information is recognized. License plate recognition requires an online prediction task. As shown in FIG. 4A, if the image captured by the camera includes one vehicle, there is only one online prediction task, and one GPU card is sufficient. In practice, however, as shown in FIG. 4B, the captured image often includes multiple vehicles, and there are correspondingly multiple online prediction tasks. If GPU-level resource scheduling were used, these multiple online prediction tasks would be allocated to multiple GPUs, so GPU resources would not be fully utilized, wasting expensive GPU resources. Therefore, each GPU can be further divided into smaller scheduling units: virtualization technology is used to virtualize each GPU in FIG. 1 into multiple vGPUs, and the multiple parallel online prediction tasks are allocated to different vGPUs, so that multiple tasks share the same GPU, thereby improving the resource utilization of a single GPU. On the basis of the above embodiments, the present application can also implement resource scheduling in a GPU sharing scenario, as follows:
FIG. 5 is a flowchart of a resource scheduling method provided by another embodiment of the present application. On the basis of the above embodiments, the resource scheduling request may further include the computing parameters of the vGPUs and the number of vGPUs, where the number of vGPUs is N, and N is a positive integer greater than 0. As shown in FIG. 5, the resource scheduling method provided by this embodiment includes the following steps:

Step S501: according to the computing parameters and the number of vGPUs, screen vGPUs satisfying the resource scheduling request from the matching result.

In an optional implementation, this step may screen, from the matching result, vGPUs satisfying the computing-parameter and quantity requirements of the vGPUs corresponding to the resource scheduling request.

FIG. 6 is a schematic diagram of vGPUs in one physical machine according to an embodiment of the present application. As shown in FIG. 6, each GPU may in turn be divided into multiple vGPUs (shown as circles in FIG. 6). It should be noted that each GPU in FIG. 6 including 3 vGPUs is merely an example and does not limit the number of vGPUs.

Step S501 is executed after the matching result is obtained in step S202. The matching result in this embodiment may be represented by a GPU list, and may further include computing parameters such as the computing power (vcore) and/or the video memory (vmemory) of each vGPU of each target GPU, where the computing power of a vGPU refers to its computing capability.

Assuming that the GPU list includes cards 0 to 3, another form of the GPU list may be as shown in Table 3 below:

Table 3: GPU list
[Table 3, listing the computing power and video memory of each vGPU on cards 0 to 3, appears only as an image in the original publication and is not reproduced here.]
In an optional implementation, step S501 may further include the following steps:

Step S501a: screen vGPUs satisfying the computing parameters from the matching result to obtain a first screening result.

In an optional implementation, the vGPUs satisfying the computing parameters may be shown in the form of a list including at least one such vGPU. If the computing parameters required by the task processing request submitted by the user include computing power, and the computing power requested for the vGPUs in the resource scheduling request is 3.5, 3.0, 5.2, and 6.1 respectively, then the vGPUs in Table 3 satisfying the computing power requirement of the resource scheduling request (the first screening result) include: vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12. The first screening result may likewise be given in the form of a list, as in Table 4 below:

Table 4: First screening result
[Table 4, listing the first screening result (vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, vGPU-12), appears only as an image in the original publication and is not reproduced here.]
If the computing parameters required by the task processing request submitted by the user include video memory, and the video memory requested for each vGPU in the resource scheduling request is 6 GB, 8 GB, 8 GB, and 6 GB respectively, then the vGPUs satisfying the resource scheduling request include: vGPU-3, vGPU-6, vGPU-7, vGPU-8, vGPU-10, vGPU-11, and vGPU-12.

If the computing parameters required by the task processing request include both computing power and video memory, with the requested computing power of each vGPU being 3.5, 3.0, 5.2, and 6.1 and the requested video memory being 6 GB, 8 GB, 8 GB, and 6 GB, then the vGPUs satisfying the resource scheduling request include: vGPU-2, vGPU-3, vGPU-4, vGPU-6, vGPU-7, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12.

Step S501b: from the first screening result, screen vGPU resources satisfying the vGPU quantity requirement in the resource scheduling request.

That is, this step selects N vGPUs from the first screening result.

Assuming the number of vGPUs required by the task processing request submitted by the user is 4, 4 vGPUs further need to be selected from Table 4. In an optional implementation, the 4 vGPUs may be selected randomly from Table 4. In another optional implementation, the first 4 vGPUs may be selected from Table 4 in ascending order of computing power or video memory. Taking the case where the computing parameters include computing power as an example, the vGPUs satisfying the computing power requirement include: vGPU-2, vGPU-4, vGPU-8, vGPU-9, vGPU-10, vGPU-11, and vGPU-12; further, 4 vGPUs may be selected at random from these 7 vGPUs as the vGPUs satisfying the computing parameters and quantity.
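One plausible reading of the two-stage screening in steps S501a and S501b can be sketched as follows. The function names, the threshold interpretation (a vGPU qualifies if its computing power reaches the smallest requested value, which is consistent with the Table 4 example), and the random-selection policy are illustrative assumptions, not part of the original disclosure:

```python
import random

def screen_by_compute(vgpus: dict[str, float],
                      required_powers: list[float]) -> list[str]:
    """Step S501a: keep the vGPUs whose computing power can satisfy at
    least one of the per-vGPU computing-power requirements in the
    request, i.e. power >= the smallest requested value."""
    threshold = min(required_powers)
    return [name for name, power in vgpus.items() if power >= threshold]

def select_n(candidates: list[str], n: int) -> list[str]:
    """Step S501b: pick N vGPUs from the first screening result; random
    selection is one of the optional implementations described above."""
    if len(candidates) < n:
        return []  # the cluster cannot satisfy the quantity requirement
    return random.sample(candidates, n)

# Hypothetical per-vGPU computing power, including one vGPU below the
# smallest requested value so that the first screening drops it.
vgpus = {"vGPU-1": 2.0, "vGPU-2": 3.5, "vGPU-4": 5.2, "vGPU-9": 3.0}
first = screen_by_compute(vgpus, [3.5, 3.0, 5.2, 6.1])
print(first)               # vGPU-1 is screened out
print(select_n(first, 3))  # any 3 of the remaining vGPUs
```

A screen on video memory, or on both parameters, follows the same shape with a different predicate.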
Step S502: return the vGPUs satisfying the resource scheduling request.

In an optional implementation, step S502 may return, to the AI algorithm device 12, the vGPUs satisfying both the computing-parameter requirements and the quantity requirement.

In this embodiment, the matching result is filtered and screened a second time. The first round filters by grouping information: when the GPU cluster 14 is very large, the grouping information filters out many GPUs outside the screening scope, so the second round of screening operates over a much smaller range, which can greatly improve resource scheduling efficiency. By contrast, in the prior art the scheduling device 13 must screen all GPUs in the GPU cluster 14 one by one for resources satisfying the computing-parameter and quantity requirements; if the cluster is large, the screening range and screening time are correspondingly large, making resource scheduling inefficient.

The above embodiments describe determining the vGPUs jointly according to the computing parameters and the number N. If the computing parameters include both computing power and video memory, determining the vGPUs according to both may take either of the following two forms:
In one optional implementation: a first screening is performed on the matching result according to the computing power requested by the resource scheduling request, and a second screening is then performed on that result according to the video memory requested by the resource scheduling request. In an optional implementation, the computing parameters include at least one of computing power and video memory, and the screening of vGPUs satisfying the computing parameters from the matching result to obtain the first screening result, introduced in step S501a, includes the following steps:

Step a1: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU.

Step a2: if the priority of the computing power is greater than the priority of the video memory, screen, in each target GPU, the vGPUs satisfying the computing power requirement of the resource scheduling request to obtain a second screening result.

Step a3: screen, in the second screening result, the vGPUs satisfying the video memory requirement of the resource scheduling request to obtain the first screening result.
In another optional implementation: a first screening is performed on the matching result according to the video memory requested by the resource scheduling request, and a second screening is then performed on that result according to the computing power requested by the resource scheduling request. In an optional implementation, the computing parameters include at least one of computing power and video memory, and determining the vGPUs satisfying the computing power and the video memory in the matching result, introduced in step S501a, includes:

Step b1: obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU.

Step b2: if the priority of the computing power is less than the priority of the video memory, screen, in each target GPU, the vGPUs satisfying the video memory requirement of the resource scheduling request to obtain a third screening result.

Step b3: screen, in the third screening result, the vGPUs satisfying the computing power requirement of the resource scheduling request to obtain the first screening result.
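Steps a1–a3 and b1–b3 differ only in which parameter is screened first. This can be sketched as below; the function name and the per-vGPU record layout `(computing power, video memory)` are assumptions for illustration:

```python
def screen_by_priority(vgpus: dict[str, tuple[float, int]],
                       min_power: float, min_memory: int,
                       power_first: bool) -> list[str]:
    """Screen in priority order: the higher-priority parameter is
    applied first (steps a2/b2), and the other parameter is applied to
    its result (steps a3/b3).

    vgpus maps a vGPU name to (computing power, video memory in GB).
    """
    def by_power(names):
        return [n for n in names if vgpus[n][0] >= min_power]

    def by_memory(names):
        return [n for n in names if vgpus[n][1] >= min_memory]

    names = list(vgpus)
    if power_first:  # computing power has the higher priority (a1-a3)
        return by_memory(by_power(names))
    return by_power(by_memory(names))  # video memory first (b1-b3)

vgpus = {"vGPU-2": (3.5, 4), "vGPU-3": (2.0, 8), "vGPU-8": (6.1, 8)}
print(screen_by_priority(vgpus, 3.0, 6, power_first=True))   # ['vGPU-8']
```

Note that with simple threshold predicates the final set is the same in either order; the priority mainly controls which screening shrinks the candidate set first.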
On the basis of the above embodiments, after the vGPUs matching the computing power and/or the video memory have been screened out of the matching result, the following cases may arise:

In a first optional implementation, the number of vGPUs in the first screening result is greater than the number of vGPUs requested by the resource scheduling request. In this case, the requested number of vGPUs further needs to be screened out of the first screening result (i.e., N vGPUs are selected from the first screening result). For example, the first screening result includes 5 vGPUs; if the number of vGPUs requested is 4, then 4 vGPUs need to be further screened out of these 5, and the scheduling device 13 returns these 4 vGPUs to the AI algorithm device 12.

In a second optional implementation, the number of vGPUs in the first screening result is equal to the number of vGPUs requested; in this case, the vGPUs in the first screening result are directly returned as the target vGPUs. For example, the first screening result includes 5 vGPUs; if the number requested is 5, these 5 vGPUs are directly returned to the AI algorithm device 12.

In a third optional implementation, the number of vGPUs in the first screening result is less than the number of vGPUs requested; in this case, a message with an empty result is returned. For example, the first screening result includes 5 vGPUs; if the number requested is 7, the first screening result cannot satisfy the quantity requirement of the resource scheduling request, meaning the GPU cluster 14 cannot satisfy the request, and the scheduling device 13 returns a message with an empty result to the AI algorithm device 12 to notify it that the GPU cluster 14 cannot satisfy the resource scheduling request.

In the first optional implementation above, when N vGPUs are screened out of the first screening result, the first screening result may be sorted in ascending order of the computing parameter, and the number of vGPU resources required by the resource scheduling request is then selected in that order; that is, the first N vGPUs in the sorted result are selected.
For example, in an implementation where the computing parameters include computing power, the first screening result may be sorted in ascending order of computing power, and the first N vGPUs are selected from it. Suppose the first screening result is as shown in Table 5 below:

Table 5: First screening result
vGPU number         Computing power
Card 0: vGPU-2      3.5
Card 1: vGPU-4      5.2
Card 2: vGPU-8      6.1
Card 2: vGPU-9      3.0
Card 3: vGPU-10     6.1
Card 3: vGPU-11     5.2
Card 3: vGPU-12     3.0
Sorting the first screening result in ascending order of computing power gives Table 6 below:

Table 6: First screening result after sorting
vGPU number         Computing power
Card 3: vGPU-12     3.0
Card 2: vGPU-9      3.0
Card 0: vGPU-2      3.5
Card 1: vGPU-4      5.2
Card 3: vGPU-11     5.2
Card 2: vGPU-8      6.1
Card 3: vGPU-10     6.1
As can be seen from Table 6, 7 vGPUs satisfy the computing power requirement. Assuming the number of vGPUs requested by the resource scheduling request is 5, the first 5 vGPUs in Table 6 can be selected and returned to the AI algorithm device 12.

In an optional implementation, if the computing parameters include video memory, the number of vGPU resources required by the resource scheduling request is selected from the first screening result in ascending order of video memory. The implementation where the computing parameters include video memory is similar to that where they include computing power; refer to the implementation of selecting the required number of vGPU resources from the first screening result in ascending order of computing power, which is not repeated here.

In an optional implementation, if the computing parameters include both computing power and video memory, it may further be decided, according to preset priorities of computing power and video memory, whether the N vGPU resources are selected from the first screening result in ascending order of computing power or of video memory.

In this embodiment, during the second screening according to the computing parameters and quantity, the available vGPUs obtained from the first screening are sorted in ascending order of the computing parameter, and screening preferentially selects the smallest resources that can satisfy the demand (small jobs). This maximizes the utilization of existing resources and reduces fragmentation, and the remaining resources can satisfy the needs of long jobs as far as possible, thereby improving resource utilization.
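The smallest-fit selection described above — sort ascending by the computing parameter, take the first N, and signal an empty result when fewer than N candidates exist — can be sketched as follows (the helper name is hypothetical, and the tie order between equal computing-power values is not specified in the original, so equal entries may come out in either order):

```python
def select_smallest_fit(first_screening: dict[str, float], n: int) -> list[str]:
    """Sort the first screening result in ascending order of the
    computing parameter (as in Table 6) and select the first N vGPUs.
    An empty list corresponds to the empty-result message, i.e. the
    cluster cannot satisfy the quantity requirement."""
    if len(first_screening) < n:
        return []
    ranked = sorted(first_screening, key=first_screening.get)
    return ranked[:n]

# Table 5 data: the first screening result before sorting.
table5 = {"vGPU-2": 3.5, "vGPU-4": 5.2, "vGPU-8": 6.1, "vGPU-9": 3.0,
          "vGPU-10": 6.1, "vGPU-11": 5.2, "vGPU-12": 3.0}
print(select_smallest_fit(table5, 5))  # the five smallest vGPUs
```

With N = 5 this returns the five entries with the lowest computing power (3.0, 3.0, 3.5, 5.2, 5.2), leaving the two 6.1 vGPUs free for larger jobs, which is the fragmentation-reducing behavior described above.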
On the basis of the above embodiments, the resource scheduling request further includes the task type of the task processing request corresponding to the resource scheduling request; the vGPUs in different GPUs carry corresponding labels, and a vGPU's label is determined according to the task type of the task processing request corresponding to the resource scheduling request. The method of the embodiments of the present application further includes the following steps:

According to the task type of the task processing request corresponding to the resource scheduling request, match at least one label corresponding to that task type; and use the vGPUs corresponding to the at least one label as the matching result.

In this embodiment, the label corresponding to a vGPU in a GPU can be understood as the task type of the task processing request corresponding to the resource scheduling request. For example, referring again to FIG. 6, suppose that among the 27 vGPUs on cards 0 to 8 in FIG. 6, a subset — say 13 vGPUs — carry the model training task label, and these 13 vGPUs may be distributed over any at least two of cards 0 to 8, while the remaining 14 vGPUs carry the online prediction task label. Then, if the task type of the task processing request corresponding to the resource scheduling request is a model training task, the matching result is some or all of those 13 vGPUs distributed over any at least two of cards 0 to 8.
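The label-based vGPU matching mirrors the GPU-level grouping above, just at vGPU granularity. A minimal sketch, with hypothetical names and an illustrative 13/14 label assignment:

```python
def match_vgpus_by_label(vgpu_labels: dict[str, str], task_type: str) -> list[str]:
    """Return the vGPUs whose label matches the task type of the task
    processing request; these form the matching result."""
    return [name for name, label in vgpu_labels.items() if label == task_type]

# The 27 vGPUs of FIG. 6 (cards 0-8, 3 vGPUs each): 13 labeled for
# model training, the remaining 14 for online prediction.
vgpu_labels = {f"vGPU-{i}": ("model-training" if i < 13 else "online-prediction")
               for i in range(27)}

matched = match_vgpus_by_label(vgpu_labels, "model-training")
print(len(matched))  # 13
```

The computing-parameter and quantity screening of steps S501a/S501b would then run over this label-matched set rather than over all vGPUs of the cluster.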
FIG. 7 is a schematic structural diagram of a resource scheduling apparatus provided by an embodiment of the present application. The resource scheduling apparatus provided by this embodiment can execute the processing flow provided by the resource scheduling method embodiments. As shown in FIG. 7, the resource scheduling apparatus 70 includes a receiving module 71, a first matching module 72, and a first returning module 73. The receiving module 71 is configured to receive a resource scheduling request for GPUs in the graphics processing unit (GPU) cluster 14, where the resource scheduling request includes grouping information of the GPU to be requested, and the grouping information of the GPU to be requested is determined according to the task type of the task processing request corresponding to the resource scheduling request. The first matching module 72 is configured to match, among all GPUs in the GPU cluster 14 and according to the grouping information of the GPU to be requested, GPUs having that grouping information, to obtain a matching result that includes at least one target GPU corresponding to the grouping information of the GPU to be requested. The first returning module 73 is configured to return the matching result.
In an optional implementation, each GPU includes at least one vGPU, and the resource scheduling request further includes computing parameters and a quantity of vGPUs. The apparatus further includes: a screening module 74, configured to screen, in the matching result and according to the computing parameters and quantity of the vGPUs, vGPUs that satisfy the computing parameters and quantity; and a second returning module 75, configured to return the vGPUs that satisfy the computing parameters and quantity.
In an optional implementation, the screening module 74 includes: a first screening unit 741, configured to screen, in the matching result, vGPUs that satisfy the computing parameters to obtain a first screening result; and a second screening unit 742, configured to screen, in the first screening result, vGPU resources that satisfy the required quantity of vGPUs.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit 741 is configured to obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is higher than the priority of the video memory, to screen, in each target GPU, vGPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result; and to screen, in the second screening result, vGPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
In an optional implementation, the computing parameters include at least one of computing power and video memory. The first screening unit 741 is configured to obtain the priorities corresponding to the computing power and the video memory of each vGPU in each target GPU; if the priority of the computing power is lower than the priority of the video memory, to screen, in each target GPU, vGPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result; and to screen, in the third screening result, vGPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
In an optional implementation, the second screening unit 742 is configured to: if the quantity of vGPUs in the first screening result is greater than the quantity of vGPU resources required by the resource scheduling request, select, from the first screening result and in ascending order of the computing parameters, a quantity of vGPU resources equal to the quantity required by the resource scheduling request; if the quantity of vGPUs in the first screening result is equal to the required quantity of vGPU resources, return the first screening result; and if the quantity of vGPUs in the first screening result is less than the required quantity of vGPU resources, return a prompt indicating that the screening result is empty.
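The priority-ordered, two-stage screening and the quantity rule described in the embodiments above can be sketched as follows. This is an illustrative simplification under assumed record fields (`power`, `mem`) and threshold values, not the patented implementation:

```python
def screen_vgpus(candidates, need_power, need_mem, power_first, count):
    """Two-stage screen: filter by the higher-priority parameter first,
    then by the other; finally apply the quantity rule."""
    # The order of the two filters depends on which computing parameter
    # has the higher priority for this request.
    first_key, first_need = (("power", need_power) if power_first
                             else ("mem", need_mem))
    second_key, second_need = (("mem", need_mem) if power_first
                               else ("power", need_power))
    stage = [v for v in candidates if v[first_key] >= first_need]
    stage = [v for v in stage if v[second_key] >= second_need]

    # Quantity rule: too few -> empty result (the "screening result is
    # empty" prompt); enough -> take the smallest-capacity vGPUs first,
    # i.e. ascending order of the computing parameters.
    if len(stage) < count:
        return []
    stage.sort(key=lambda v: (v[first_key], v[second_key]))
    return stage[:count]

pool = [
    {"id": "a", "power": 30, "mem": 8},
    {"id": "b", "power": 50, "mem": 16},
    {"id": "c", "power": 40, "mem": 4},
    {"id": "d", "power": 60, "mem": 12},
]
picked = screen_vgpus(pool, need_power=35, need_mem=8,
                      power_first=True, count=2)
print([v["id"] for v in picked])  # ['b', 'd']
```

Selecting in ascending order of the computing parameters leaves the larger-capacity vGPUs free for later, more demanding requests, which matches the selection order stated above.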
In an optional implementation, the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; vGPUs in different GPUs have corresponding tags, and the tag corresponding to a vGPU is determined according to the task type of the task processing request corresponding to the resource scheduling request. A second matching module 76 is configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the vGPUs corresponding to the at least one tag as the matching result.
The resource scheduling apparatus in the embodiment shown in FIG. 7 can be used to execute the technical solutions of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.
FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device provided by this embodiment can execute the processing flow provided by the resource scheduling method embodiments. As shown in FIG. 8, the electronic device 80 includes a memory 81, a processor 82, a computer program, and a communication interface 83, where the computer program is stored in the memory 81 and is configured to be executed by the processor 82 to perform the method steps of the foregoing method embodiments.
The electronic device in the embodiment shown in FIG. 8 can be used to execute the technical solutions of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.
In addition, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; the computer program is executed by a processor to implement the resource scheduling method described in the foregoing embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is only a logical functional division, and in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Industrial Applicability
In the embodiments of the present application, a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster is received, where the resource scheduling request includes grouping information of the GPU to be requested, and the grouping information is determined according to the task type of the task processing request corresponding to the resource scheduling request. Then, according to the grouping information of the GPU to be requested, GPUs having that grouping information are matched among all GPUs in the GPU cluster; finally, a matching result including at least one target GPU corresponding to the grouping information is returned. Because the resource scheduling request includes the grouping information of the GPU to be requested, and that grouping information is determined according to the task type of the corresponding task processing request, GPU resource scheduling can match the corresponding GPUs according to this grouping information, thereby achieving finer-grained resource scheduling and precise control over GPU usage.
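The overall flow summarized above — receive a request carrying grouping information, match GPUs in the cluster by that grouping, and return the targets — can be sketched as follows. The group names and record fields are illustrative assumptions, not part of the disclosed system:

```python
# Hypothetical cluster inventory: each GPU carries grouping information
# that was assigned according to the task types it serves.
cluster = [
    {"gpu": "gpu-0", "group": "training"},
    {"gpu": "gpu-1", "group": "inference"},
    {"gpu": "gpu-2", "group": "training"},
]

def schedule(request, inventory):
    """Match, among all GPUs, those whose grouping information equals
    the grouping information carried by the resource scheduling request."""
    return [g for g in inventory if g["group"] == request["group"]]

result = schedule({"group": "training"}, cluster)
print([g["gpu"] for g in result])  # ['gpu-0', 'gpu-2']
```

A request whose task type maps to the "training" group is thus matched only to GPUs dedicated to that group, which is the fine-grained control the summary describes.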

Claims (21)

  1. A resource scheduling method, comprising:
    receiving a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request;
    matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested, to obtain a matching result, wherein the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested;
    returning the matching result.
  2. The method according to claim 1, wherein each GPU includes at least one virtual GPU, and the resource scheduling request further includes computing parameters and a quantity of virtual GPUs;
    after the matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, of GPUs having the grouping information of the GPU to be requested, the method further comprises:
    screening, in the matching result and according to the computing parameters and quantity of the virtual GPUs, virtual GPUs that satisfy the computing parameters and quantity;
    returning the virtual GPUs that satisfy the computing parameters and quantity.
  3. The method according to claim 2, wherein the screening, in the matching result and according to the computing parameters and quantity of the virtual GPUs, of virtual GPUs that satisfy the resource scheduling request comprises:
    screening, in the matching result, virtual GPUs that satisfy the computing parameters to obtain a first screening result;
    screening, in the first screening result, virtual GPU resources that satisfy the required quantity of virtual GPUs.
  4. The method according to claim 3, wherein the computing parameters include at least one of computing power and video memory; and the screening, in the matching result, of virtual GPUs that satisfy the computing parameters to obtain a first screening result comprises:
    obtaining priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU;
    if the priority of the computing power is higher than the priority of the video memory, screening, in each target GPU, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result;
    screening, in the second screening result, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
  5. The method according to claim 3, wherein the computing parameters include at least one of computing power and video memory; and the screening, in the matching result, of virtual GPUs that satisfy the computing parameters to obtain a first screening result comprises:
    obtaining priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU;
    if the priority of the computing power is lower than the priority of the video memory, screening, in each target GPU, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result;
    screening, in the third screening result, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
  6. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is greater than the quantity of virtual GPU resources required by the resource scheduling request, selecting, from the first screening result and in ascending order of the computing parameters, a quantity of virtual GPU resources equal to the quantity required by the resource scheduling request.
  7. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is equal to the quantity of virtual GPU resources required by the resource scheduling request, returning the first screening result.
  8. The method according to any one of claims 3 to 5, wherein the screening, in the first screening result, of virtual GPU resources that satisfy the required quantity of virtual GPUs comprises:
    if the quantity of virtual GPUs in the first screening result is less than the quantity of virtual GPU resources required by the resource scheduling request, returning a prompt indicating that the screening result is empty.
  9. The method according to any one of claims 3 to 5, wherein the resource scheduling request includes the task type of the task processing request corresponding to the resource scheduling request; virtual GPUs in different GPUs have corresponding tags, and the tag corresponding to a virtual GPU is determined according to the task type of the task processing request corresponding to the resource scheduling request; and the method further comprises:
    matching, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type;
    taking the virtual GPUs corresponding to the at least one tag as the matching result.
  10. A resource scheduling apparatus, comprising:
    a receiving module, configured to receive a resource scheduling request for GPUs in a graphics processing unit (GPU) cluster, wherein the resource scheduling request includes grouping information of a GPU to be requested, and the grouping information of the GPU to be requested is determined according to a task type of a task processing request corresponding to the resource scheduling request;
    a first matching module, configured to match, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, GPUs having the grouping information of the GPU to be requested, to obtain a matching result, wherein the matching result includes at least one target GPU corresponding to the grouping information of the GPU to be requested;
    a first returning module, configured to return the matching result.
  11. The apparatus according to claim 10, wherein each GPU includes at least one virtual GPU, and the resource scheduling request further includes computing parameters and a quantity of virtual GPUs; and after the matching, among all GPUs in the GPU cluster and according to the grouping information of the GPU to be requested, of GPUs having the grouping information of the GPU to be requested, the apparatus further comprises:
    a screening module, configured to screen, in the matching result and according to the computing parameters and quantity of the virtual GPUs, virtual GPUs that satisfy the computing parameters and quantity;
    a second returning module, configured to return the virtual GPUs that satisfy the computing parameters and quantity.
  12. The apparatus according to claim 11, wherein the screening module comprises: a first screening unit, configured to screen, in the matching result, virtual GPUs that satisfy the computing parameters to obtain a first screening result; and a second screening unit, configured to screen, in the first screening result, virtual GPU resources that satisfy the required quantity of virtual GPUs.
  13. The apparatus according to claim 12, wherein the computing parameters include at least one of computing power and video memory; and the first screening unit is configured to obtain priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU; if the priority of the computing power is higher than the priority of the video memory, to screen, in each target GPU, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain a second screening result; and to screen, in the second screening result, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain the first screening result.
  14. The apparatus according to claim 12, wherein the computing parameters include at least one of computing power and video memory; and the first screening unit is configured to obtain priorities corresponding to the computing power and the video memory of each virtual GPU in each target GPU; if the priority of the computing power is lower than the priority of the video memory, to screen, in each target GPU, virtual GPUs that satisfy the video-memory requirement of the resource scheduling request to obtain a third screening result; and to screen, in the third screening result, virtual GPUs that satisfy the computing-power requirement of the resource scheduling request to obtain the first screening result.
  15. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is greater than the quantity of virtual GPU resources required by the resource scheduling request, select, from the first screening result and in ascending order of the computing parameters, a quantity of virtual GPU resources equal to the quantity required by the resource scheduling request.
  16. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is equal to the quantity of virtual GPU resources required by the resource scheduling request, return the first screening result.
  17. The apparatus according to any one of claims 12 to 14, wherein the second screening unit is configured to: if the quantity of virtual GPUs in the first screening result is less than the quantity of virtual GPU resources required by the resource scheduling request, return a prompt indicating that the screening result is empty.
  18. The apparatus according to any one of claims 12 to 14, wherein the resource scheduling request includes a task type of a task processing request corresponding to the resource scheduling request; virtual GPUs in different GPUs have corresponding tags, and the tag corresponding to a virtual GPU is determined according to the task type of the task processing request corresponding to the resource scheduling request; and the apparatus further comprises: a second matching module, configured to match, according to the task type of the task processing request corresponding to the resource scheduling request, at least one tag corresponding to that task type, and to take the virtual GPUs corresponding to the at least one tag as the matching result.
  19. An electronic device, comprising:
    a memory;
    a processor; and
    a computer program;
    wherein the computer program is stored in the memory and is configured to be executed by the processor to implement the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 9.
  21. A computer program product comprising computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 9.
PCT/CN2021/095292 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium, and program product WO2022088659A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020217037982A KR20220058844A (en) 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium and program product

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011158231.7A CN112346859B (en) 2020-10-26 2020-10-26 Resource scheduling method and device, electronic equipment and storage medium
CN202011158231.7 2020-10-26

Publications (1)

Publication Number Publication Date
WO2022088659A1 true WO2022088659A1 (en) 2022-05-05

Family

ID=74358745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095292 WO2022088659A1 (en) 2020-10-26 2021-05-21 Resource scheduling method and apparatus, electronic device, storage medium, and program product

Country Status (3)

Country Link
KR (1) KR20220058844A (en)
CN (1) CN112346859B (en)
WO (1) WO2022088659A1 (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112346859B (en) * 2020-10-26 2023-06-16 北京市商汤科技开发有限公司 Resource scheduling method and device, electronic equipment and storage medium
CN113204428B (en) * 2021-05-28 2023-01-20 北京市商汤科技开发有限公司 Resource scheduling method, device, electronic equipment and computer readable storage medium
CN114968272A (en) * 2022-05-31 2022-08-30 京东方科技集团股份有限公司 Algorithm operation method, device, equipment and storage medium
CN116302568A (en) * 2023-05-17 2023-06-23 算力互联(北京)科技有限公司 Computing power resource scheduling method and system, scheduling center and data center
CN116643893B (en) * 2023-07-27 2023-10-20 合肥中科类脑智能技术有限公司 Method and device for scheduling computing task, storage medium and server
CN116757915B (en) * 2023-08-16 2023-11-28 北京蓝耘科技股份有限公司 Cluster GPU resource scheduling method
CN117539639A (en) * 2024-01-05 2024-02-09 北京趋动智能科技有限公司 Video memory resource scheduling method, device, system, storage medium and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2620873A1 (en) * 2012-01-27 2013-07-31 Samsung Electronics Co., Ltd Resource allocation method and apparatus of GPU
CN109144710A (en) * 2017-06-16 2019-01-04 中国移动通信有限公司研究院 Resource regulating method, device and computer readable storage medium
CN109375992A (en) * 2018-08-17 2019-02-22 华为技术有限公司 A kind of resource regulating method and device
CN109376011A (en) * 2018-09-26 2019-02-22 郑州云海信息技术有限公司 The method and apparatus of resource are managed in virtualization system
CN109634748A (en) * 2018-12-12 2019-04-16 深圳前海微众银行股份有限公司 Cluster resource dispatching method, device, equipment and computer readable storage medium
CN110503593A (en) * 2018-05-18 2019-11-26 微软技术许可有限责任公司 The scheduling of multiple graphics processing units
CN110941481A (en) * 2019-10-22 2020-03-31 华为技术有限公司 Resource scheduling method, device and system
CN112346859A (en) * 2020-10-26 2021-02-09 北京市商汤科技开发有限公司 Resource scheduling method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074281B (en) * 2016-04-28 2022-05-24 华为技术有限公司 Method and device for distributing graphics processor tasks
US10262390B1 (en) * 2017-04-14 2019-04-16 EMC IP Holding Company LLC Managing access to a resource pool of graphics processing units under fine grain control
CN110688218B (en) * 2019-09-05 2022-11-04 广东浪潮大数据研究有限公司 Resource scheduling method and device
CN111158879B (en) * 2019-12-31 2024-03-22 上海依图网络科技有限公司 Scheduling method, device, machine-readable medium and system for system resources


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820279A (en) * 2022-05-18 2022-07-29 北京百度网讯科技有限公司 Distributed deep learning method and device based on multiple GPUs and electronic equipment
CN114820279B (en) * 2022-05-18 2023-03-24 北京百度网讯科技有限公司 Distributed deep learning method and device based on multiple GPUs and electronic equipment
CN115965517A (en) * 2023-01-09 2023-04-14 摩尔线程智能科技(北京)有限责任公司 Graphics processor resource management method and device, electronic device and storage medium
CN115965517B (en) * 2023-01-09 2023-10-20 摩尔线程智能科技(北京)有限责任公司 Graphics processor resource management method and device, electronic equipment and storage medium
CN115981871A (en) * 2023-03-17 2023-04-18 苏州万店掌网络科技有限公司 GPU resource scheduling method, device, equipment and storage medium
CN115981871B (en) * 2023-03-17 2024-01-26 苏州万店掌网络科技有限公司 GPU resource scheduling method, device, equipment and storage medium
CN117687802A (en) * 2024-02-02 2024-03-12 湖南马栏山视频先进技术研究院有限公司 Deep learning parallel scheduling method and device based on cloud platform and cloud platform
CN117687802B (en) * 2024-02-02 2024-04-30 湖南马栏山视频先进技术研究院有限公司 Deep learning parallel scheduling method and device based on cloud platform and cloud platform

Also Published As

Publication number Publication date
CN112346859B (en) 2023-06-16
KR20220058844A (en) 2022-05-10
CN112346859A (en) 2021-02-09

Similar Documents

Publication Publication Date Title
WO2022088659A1 (en) Resource scheduling method and apparatus, electronic device, storage medium, and program product
CN109983480B (en) Training neural networks using cluster loss
CN110837410B (en) Task scheduling method and device, electronic equipment and computer readable storage medium
US10841236B1 (en) Distributed computer task management of interrelated network computing tasks
CN114741207B (en) GPU resource scheduling method and system based on multi-dimensional combination parallelism
CN110389816B (en) Method, apparatus and computer readable medium for resource scheduling
CN109657793B (en) Model training method and device, storage medium and electronic equipment
CN113946431B (en) Resource scheduling method, system, medium and computing device
CN103493088A (en) Label privileges
CN114416352A (en) Computing resource allocation method and device, electronic equipment and storage medium
CN112416585A (en) GPU resource management and intelligent scheduling method for deep learning
CN109598250A Feature extraction method, device, electronic equipment and computer-readable medium
CN115292014A (en) Image rendering method and device and server
CN111124644B (en) Method, device and system for determining task scheduling resources
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN112148467A (en) Dynamic allocation of computing resources
CN116820714A (en) Scheduling method, device, equipment and storage medium of computing equipment
US11941519B2 (en) Machine learning training platform
CN109753353A Virtual machine resource allocation method, device and electronic equipment
CN115909009A (en) Image recognition method, image recognition device, storage medium and electronic equipment
CN111813541B (en) Task scheduling method, device, medium and equipment
CN111796934B (en) Task issuing method and device, storage medium and electronic equipment
CN115378806A (en) Flow distribution method and device, computer equipment and storage medium
CN112988383A (en) Resource allocation method, device, equipment and storage medium
CN115080242A (en) Method, device and medium for unified scheduling of PCI equipment resources

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021569056

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884402

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21884402

Country of ref document: EP

Kind code of ref document: A1