CN107193650B - Method and device for scheduling display card resources in distributed cluster - Google Patents


Info

Publication number: CN107193650B (application CN201710250265.0A)
Authority: CN (China)
Prior art keywords: PCI, graphics cards, bus
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN107193650A (en)
Inventor: 李远策
Current and original assignee: Beijing Qihoo Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Priority date: assumed (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Application CN201710250265.0A filed by Beijing Qihoo Technology Co Ltd; publication of application CN107193650A followed by grant and publication of CN107193650B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/5038: Allocation of resources considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 2209/00: Indexing scheme relating to G06F 9/00
    • G06F 2209/50: Indexing scheme relating to G06F 9/50
    • G06F 2209/5021: Priority

Abstract

The invention discloses a method and a device for scheduling graphics card resources in a distributed cluster. The method comprises the following steps: acquiring the graphics card resources in the distributed cluster, and recording the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table; receiving a submitted job, the job carrying the number of graphics cards it requests; and searching the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the number of graphics cards requested by the job, selecting from that PCI-E bus a matching number of graphics cards as the graphics card resources allocated to the job. This scheme ensures, as far as possible, that each submitted job is executed by graphics cards that need no communication across PCI-E buses, avoiding the inefficiency such cross-bus communication causes. It greatly improves the efficiency of deep learning jobs and other job types that demand substantial graphics card resources, offers fine scheduling granularity, and meets the needs of distributed clusters.

Description

Method and device for scheduling display card resources in distributed cluster
Technical Field
The invention relates to the field of computer technology, and in particular to a method and a device for scheduling graphics card resources in a distributed cluster.
Background
Distributed clusters use many kinds of resource managers and resource schedulers, such as Kubernetes (k8s), Mesos and YARN. None of them, however, schedules graphics card resources well; for computation tasks that place heavy demands on graphics cards, such as deep learning, the quality of the allocated graphics card resources greatly affects performance.
Disclosure of Invention
In view of the above, the present invention has been made to provide a method and an apparatus for scheduling graphics card resources in a distributed cluster that overcome, or at least partially solve, the above problems.
According to one aspect of the present invention, there is provided a method for scheduling graphics card resources in a distributed cluster, comprising:
acquiring the graphics card resources in the distributed cluster, and recording the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
receiving a submitted job, the job carrying the number of graphics cards it requests;
and searching the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the number of graphics cards requested by the job, selecting from that PCI-E bus a matching number of graphics cards as the graphics card resources allocated to the job.
Optionally, acquiring the graphics card resources in the distributed cluster comprises:
reading, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that device.
Optionally, recording the number of available graphics cards on each PCI-E bus in the graphics card resource scheduling table comprises:
recording the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each bus.
Optionally, the sort order is ascending, and searching the graphics card resource scheduling table comprises:
traversing the open linked list with a depth-first algorithm to judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
Optionally, when no single PCI-E bus has enough available graphics cards to satisfy the job's request, the open linked list is traversed again with the depth-first algorithm, and graphics cards matching the requested number are selected from multiple PCI-E buses as the graphics card resources allocated to the job.
Optionally, re-traversing the open linked list with the depth-first algorithm and selecting graphics cards from multiple PCI-E buses comprises:
allocating to the job all available graphics cards on the first PCI-E bus found, then judging whether the number of available graphics cards on the next bus satisfies the remainder of the job's request; if so, selecting from that bus a number of graphics cards matching the remainder as resources allocated to the job; if not, allocating all available graphics cards on that bus to the job and repeating the check on the next bus, until the remainder of the request is satisfied.
Optionally, the method further comprises:
deleting all available graphics cards allocated to the job from the open linked list, and re-sorting the open linked list;
and/or
modifying the open linked list according to released graphics card resources, and re-sorting the open linked list.
According to another aspect of the present invention, there is provided an apparatus for scheduling graphics card resources in a distributed cluster, comprising:
a recording unit, adapted to acquire the graphics card resources in the distributed cluster and to record the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
and a scheduling unit, adapted to receive a submitted job carrying the number of graphics cards it requests, to search the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the request, to select from that bus a matching number of graphics cards as the graphics card resources allocated to the job.
Optionally, the recording unit is adapted to read, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that device.
Optionally, the recording unit is adapted to record the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each bus.
Optionally, the recording unit sorts the open linked list in ascending order;
and the scheduling unit is adapted to traverse the open linked list with a depth-first algorithm, judging whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
Optionally, the scheduling unit is further adapted to, when no single PCI-E bus has enough available graphics cards to satisfy the job's request, traverse the open linked list again with the depth-first algorithm and select graphics cards matching the requested number from multiple PCI-E buses as the graphics card resources allocated to the job.
Optionally, the scheduling unit is adapted to allocate to the job all available graphics cards on the first PCI-E bus found, then to judge whether the number of available graphics cards on the next bus satisfies the remainder of the job's request; if so, to select from that bus a number of graphics cards matching the remainder as resources allocated to the job; if not, to allocate all available graphics cards on that bus to the job and repeat the check on the next bus, until the remainder of the request is satisfied.
Optionally, the recording unit is adapted to delete all available graphics cards allocated to the job from the open linked list and re-sort it; and/or to modify the open linked list according to released graphics card resources and re-sort it.
According to the technical solution of the present invention, once the graphics card resources in the distributed cluster have been acquired, the number of available graphics cards on each PCI-E bus is recorded in a graphics card resource scheduling table; when a job arrives carrying the number of graphics cards it requests, the scheduling table is searched, a PCI-E bus that can satisfy the request is selected from it, and the corresponding number of cards on that bus is allocated to the job. This scheme ensures, as far as possible, that each submitted job is executed by graphics cards that need no communication across PCI-E buses, avoiding the inefficiency such communication causes; it greatly improves the efficiency of deep learning jobs and other job types that demand substantial graphics card resources, offers fine scheduling granularity, and meets the needs of distributed clusters.
The foregoing is only an overview of the technical solutions of the present invention. To make the technical means of the invention clearer, and the above and other objects, features and advantages more readily understandable, embodiments of the invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a flowchart illustrating a method for scheduling graphics card resources in a distributed cluster according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram illustrating an apparatus for scheduling graphics card resources in a distributed cluster according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 is a flowchart illustrating a method for scheduling graphics card resources in a distributed cluster according to an embodiment of the present invention. As shown in Fig. 1, the method comprises:
Step S110, acquiring the graphics card resources in the distributed cluster, and recording the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table.
The PCI-E (PCI Express) bus is a relatively recent bus standard; in most computing devices, peripherals such as graphics cards and network cards are attached to a PCI-E bus.
Step S120, receiving a submitted job, the job carrying the number of graphics cards it requests.
Step S130, searching the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the number requested by the job, selecting from that bus a matching number of graphics cards as the graphics card resources allocated to the job.
In practice it has been found that efficiency drops sharply when the multiple graphics cards assigned to a job must communicate across PCI-E buses, and is comparatively high when they all sit on the same PCI-E bus. This embodiment is proposed to avoid such cross-bus communication.
As the method of Fig. 1 shows, once the graphics card resources in the distributed cluster have been acquired, the number of available graphics cards on each PCI-E bus is recorded in a graphics card resource scheduling table; when a job arrives carrying the number of graphics cards it requests, the scheduling table is searched, a PCI-E bus that can satisfy the request is selected from it, and the corresponding number of cards on that bus is allocated to the job. This scheme ensures, as far as possible, that each submitted job is executed by graphics cards that need no communication across PCI-E buses, avoiding the inefficiency such communication causes; it greatly improves the efficiency of deep learning jobs and other job types that demand substantial graphics card resources, offers fine scheduling granularity, and meets the needs of distributed clusters.
In an embodiment of the present invention, acquiring the graphics card resources in the distributed cluster comprises: reading, from the PCI-E buses of each computing device deployed in the cluster, the graphics card resources on that device.
For example, all devices on the PCI-E buses can be listed with the lspci command, and the graphics cards screened out of the listing.
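For illustration only, a minimal Python sketch of that screening step (the `lspci` line format shown and the choice to key cards by the leading PCI bus number are assumptions; real PCI-E topology discovery is more involved than this):

```python
import re
from collections import defaultdict

def parse_lspci(lspci_output):
    """Group GPU devices by PCI bus number from `lspci` text output.

    Assumes lines like '02:00.0 VGA compatible controller: ...' or
    '03:00.0 3D controller: ...', and treats the leading bus number as
    the bus a card hangs off (a simplification of real topology).
    """
    buses = defaultdict(list)
    pattern = re.compile(
        r'([0-9a-f]+):([0-9a-f]+\.[0-9])\s+(VGA compatible controller|3D controller)')
    for line in lspci_output.splitlines():
        m = pattern.match(line)
        if m:
            # Keep the full slot ID (e.g. '02:00.0') under its bus number.
            buses[m.group(1)].append(line.split()[0])
    return dict(buses)

sample = """\
00:1f.3 Audio device: Intel Corporation Device
02:00.0 VGA compatible controller: NVIDIA Corporation Device
02:00.1 Audio device: NVIDIA Corporation Device
03:00.0 3D controller: NVIDIA Corporation Device
"""
print(parse_lspci(sample))  # {'02': ['02:00.0'], '03': ['03:00.0']}
```

Non-GPU devices (audio functions, network cards) fall through the controller-class filter, leaving only the graphics cards grouped per bus.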
In an embodiment of the present invention, recording the number of available graphics cards on each PCI-E bus in the graphics card resource scheduling table comprises: recording the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each bus.
For example: PCI-E0 [GPU0, GPU1], PCI-E1 [GPU2, GPU3], and so on. Graphics cards on the same PCI-E bus have high affinity with one another, e.g. GPU0 and GPU1. This yields the graphics card resource scheduling table; the remaining work is to assign high-affinity cards to each job. In an embodiment of the present invention, the sort order is ascending, and searching the scheduling table comprises traversing the open linked list with a depth-first algorithm to judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
With this method, if a job needs one graphics card, the cards on both PCI-E0 and PCI-E1 obviously satisfy the condition; in the ordering above, GPU0 on PCI-E0, the first bus examined, would be allocated to the job.
Now consider: PCI-E0 [GPU0], PCI-E1 [GPU1, GPU2, GPU3], with a job that needs two graphics cards. The card on PCI-E0 is not allocated to the job; GPU1 and GPU2 on PCI-E1 are allocated instead. The depth-first algorithm saves time and quickly finds cards that meet the job's demand. The remaining problem is that this only serves jobs with modest card counts: when no single PCI-E bus has enough available graphics cards to satisfy the request, the job cannot be processed.
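The ascending sort and depth-first first-fit search just described can be sketched as follows (the list-of-tuples table layout and the function names are illustrative, not taken from the patent):

```python
def build_schedule_table(buses):
    """Graphics card resource scheduling table: (bus, [card IDs]) entries,
    sorted ascending by the number of available cards on each bus."""
    return sorted(buses.items(), key=lambda kv: len(kv[1]))

def first_fit(table, wanted):
    """Depth-first walk of the ascending table: return `wanted` cards from
    the first single bus that can satisfy the request, or None if no
    single bus has enough available cards."""
    for bus, cards in table:
        if len(cards) >= wanted:
            return cards[:wanted]
    return None

table = build_schedule_table({'PCI-E0': ['GPU0'],
                              'PCI-E1': ['GPU1', 'GPU2', 'GPU3']})
print(first_fit(table, 2))  # ['GPU1', 'GPU2'] -- the single card on PCI-E0 is skipped
```

Because the table is sorted ascending, a request that fits on a small bus never consumes cards from a larger one, which keeps large same-bus groups intact for bigger jobs.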
Therefore, in an embodiment of the present invention, when no single PCI-E bus has enough available graphics cards to satisfy the job's request, the open linked list is traversed again with the depth-first algorithm, and graphics cards matching the requested number are selected from multiple PCI-E buses as the graphics card resources allocated to the job. This solves the problem.
A second traversal, however, introduces a new question. Suppose a job needs four graphics cards, and the available cards are GPU0 on PCI-E0, GPU1 on PCI-E1, GPU2 and GPU3 on PCI-E2, and GPU4 and GPU5 on PCI-E3. Is the combination of PCI-E2 and PCI-E3 better, or that of PCI-E0, PCI-E1 and PCI-E2?
Since both options force the graphics cards to communicate across buses anyway, the combination of PCI-E0, PCI-E1 and PCI-E2 is chosen in order to reduce leftover fragments. To implement this selection, in an embodiment of the present invention, re-traversing the open linked list with the depth-first algorithm and selecting graphics cards from multiple PCI-E buses comprises: allocating to the job all available graphics cards on the first PCI-E bus found, then judging whether the number of available graphics cards on the next bus satisfies the remainder of the job's request; if so, selecting from that bus a number of graphics cards matching the remainder as resources allocated to the job; if not, allocating all available graphics cards on that bus to the job and repeating the check on the next bus, until the remainder of the request is satisfied.
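A sketch of this fallback, which walks the ascending table and drains the smallest buses first so that fewer fragments remain (again, the names and data layout are assumptions, not the patent's own code):

```python
def multi_bus_allocate(table, wanted):
    """Fallback for requests no single bus can satisfy: traverse the
    ascending (bus, [card IDs]) table again, take every available card
    on each bus until the remainder fits, then top up with just the
    remainder from the next bus."""
    allocated = []
    for bus, cards in table:
        need = wanted - len(allocated)
        if need <= 0:
            break
        allocated.extend(cards[:need])  # takes all of `cards` when len(cards) < need
    return allocated if len(allocated) == wanted else None

# The four-bus example from the text: a job needs 4 cards.
table = [('PCI-E0', ['GPU0']), ('PCI-E1', ['GPU1']),
         ('PCI-E2', ['GPU2', 'GPU3']), ('PCI-E3', ['GPU4', 'GPU5'])]
print(multi_bus_allocate(table, 4))  # ['GPU0', 'GPU1', 'GPU2', 'GPU3']
```

On the worked example this yields the PCI-E0 + PCI-E1 + PCI-E2 combination the text prefers, leaving the two-card group on PCI-E3 whole for a later job.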
To keep scheduling accurate, in an embodiment of the present invention, all available graphics cards allocated to the job are deleted from the open linked list and the list is re-sorted; and/or the open linked list is modified according to released graphics card resources and re-sorted. This guarantees the correct operation of the scheduling algorithm described above.
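The bookkeeping on allocation and release might look like this (illustrative only; a real scheduler would also need locking around the shared table, which is omitted here):

```python
def commit_allocation(table, allocated):
    """Delete the cards just allocated to a job from every bus entry,
    then re-sort the table ascending by available-card count."""
    taken = set(allocated)
    pruned = [(bus, [c for c in cards if c not in taken]) for bus, cards in table]
    return sorted(pruned, key=lambda kv: len(kv[1]))

def release_cards(table, bus, cards):
    """Return released cards to their bus entry and re-sort ascending."""
    updated = [(b, lst + cards if b == bus else lst) for b, lst in table]
    return sorted(updated, key=lambda kv: len(kv[1]))

table = [('PCI-E0', ['GPU0']), ('PCI-E1', ['GPU1']),
         ('PCI-E2', ['GPU2', 'GPU3']), ('PCI-E3', ['GPU4', 'GPU5'])]
table = commit_allocation(table, ['GPU0', 'GPU1', 'GPU2', 'GPU3'])
table = release_cards(table, 'PCI-E2', ['GPU2', 'GPU3'])
print(table)  # empty buses sort first; PCI-E2 and PCI-E3 each hold two cards again
```

Re-sorting after every commit and release is what keeps the ascending invariant that both the first-fit search and the multi-bus fallback rely on.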
Fig. 2 is a schematic structural diagram of an apparatus for scheduling graphics card resources in a distributed cluster according to an embodiment of the present invention. As shown in Fig. 2, the apparatus 200 for scheduling graphics card resources in a distributed cluster comprises:
a recording unit 210, adapted to acquire the graphics card resources in the distributed cluster and to record the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
and a scheduling unit 220, adapted to receive a submitted job carrying the number of graphics cards it requests, to search the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the request, to select from that bus a matching number of graphics cards as the graphics card resources allocated to the job.
In practice it has been found that efficiency drops sharply when the multiple graphics cards assigned to a job must communicate across PCI-E buses, and is comparatively high when they all sit on the same PCI-E bus. This embodiment is proposed to avoid such cross-bus communication.
As the apparatus of Fig. 2 shows, through the cooperation of its units, once the graphics card resources in the distributed cluster have been acquired, the number of available graphics cards on each PCI-E bus is recorded in a graphics card resource scheduling table; when a job arrives carrying the number of graphics cards it requests, the scheduling table is searched, a PCI-E bus that can satisfy the request is selected from it, and the corresponding number of cards on that bus is allocated to the job. This scheme ensures, as far as possible, that each submitted job is executed by graphics cards that need no communication across PCI-E buses, avoiding the inefficiency such communication causes; it greatly improves the efficiency of deep learning jobs and other job types that demand substantial graphics card resources, offers fine scheduling granularity, and meets the needs of distributed clusters.
In an embodiment of the present invention, the recording unit 210 is adapted to read, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that device.
For example, all devices on the PCI-E buses can be listed with the lspci command, and the graphics cards screened out of the listing.
In an embodiment of the present invention, the recording unit 210 is adapted to record the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each bus.
For example: PCI-E0 [GPU0, GPU1], PCI-E1 [GPU2, GPU3], and so on. Graphics cards on the same PCI-E bus have high affinity with one another, e.g. GPU0 and GPU1. This yields the graphics card resource scheduling table; the remaining work is to assign high-affinity cards to each job. In an embodiment of the present invention, the recording unit 210 sorts the open linked list in ascending order, and the scheduling unit 220 is adapted to traverse the open linked list with a depth-first algorithm, judging whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
In the example above, if a job needs one graphics card, the cards on both PCI-E0 and PCI-E1 obviously satisfy the condition; in the ordering above, GPU0 on PCI-E0, the first bus examined, would be allocated to the job.
Now consider: PCI-E0 [GPU0], PCI-E1 [GPU1, GPU2, GPU3], with a job that needs two graphics cards. The card on PCI-E0 is not allocated to the job; GPU1 and GPU2 on PCI-E1 are allocated instead. The depth-first algorithm saves time and quickly finds cards that meet the job's demand. The remaining problem is that this only serves jobs with modest card counts: when no single PCI-E bus has enough available graphics cards to satisfy the request, the job cannot be processed.
Therefore, in an embodiment of the present invention, the scheduling unit 220 is further adapted to, when no single PCI-E bus has enough available graphics cards to satisfy the job's request, traverse the open linked list again with the depth-first algorithm and select graphics cards matching the requested number from multiple PCI-E buses as the graphics card resources allocated to the job.
A second traversal, however, introduces a new question. Suppose a job needs four graphics cards, and the available cards are GPU0 on PCI-E0, GPU1 on PCI-E1, GPU2 and GPU3 on PCI-E2, and GPU4 and GPU5 on PCI-E3. Is the combination of PCI-E2 and PCI-E3 better, or that of PCI-E0, PCI-E1 and PCI-E2?
Since both options force the graphics cards to communicate across buses anyway, the combination of PCI-E0, PCI-E1 and PCI-E2 is chosen in order to reduce leftover fragments. To implement this selection, in an embodiment of the present invention, the scheduling unit 220 is adapted to allocate to the job all available graphics cards on the first PCI-E bus found, then to judge whether the number of available graphics cards on the next bus satisfies the remainder of the job's request; if so, to select from that bus a number of graphics cards matching the remainder as resources allocated to the job; if not, to allocate all available graphics cards on that bus to the job and repeat the check on the next bus, until the remainder of the request is satisfied.
To keep scheduling accurate, in an embodiment of the present invention, the recording unit 210 is adapted to delete all available graphics cards allocated to the job from the open linked list and re-sort it, and/or to modify the open linked list according to released graphics card resources and re-sort it. This guarantees the correct operation of the scheduling algorithm described above.
In summary, according to the technical solution of the present invention, once the graphics card resources in the distributed cluster have been acquired, the number of available graphics cards on each PCI-E bus is recorded in a graphics card resource scheduling table; when a job arrives carrying the number of graphics cards it requests, the scheduling table is searched, a PCI-E bus that can satisfy the request is selected from it, and the corresponding number of cards on that bus is allocated to the job. This scheme ensures, as far as possible, that each submitted job is executed by graphics cards that need no communication across PCI-E buses, avoiding the inefficiency such communication causes; it greatly improves the efficiency of deep learning jobs and other job types that demand substantial graphics card resources, offers fine scheduling granularity, and meets the needs of distributed clusters.
It should be noted that:
the algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may be used with the teachings herein. The required structure for constructing such a device will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features that are included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components of the apparatus for scheduling graphics card resources in a distributed cluster according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names.
The embodiment of the invention discloses A1, a method for scheduling graphics card resources in a distributed cluster, wherein the method comprises the following steps:
acquiring graphics card resources in the distributed cluster, and recording the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
receiving a submitted job, wherein the job specifies the number of graphics cards it requests;
searching the graphics card resource scheduling table, and when the number of available graphics cards on one PCI-E bus satisfies the number of graphics cards requested by the job, selecting from that PCI-E bus a number of graphics cards matching the request as the graphics card resources allocated to the job.
A2, the method as in A1, wherein the acquiring graphics card resources in the distributed cluster comprises:
reading, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that computing device.
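On a Linux computing device this reading step might, for example, enumerate devices under `/sys/bus/pci/devices` or via `lspci`; the embodiment does not prescribe a mechanism, so the sketch below shows only the pure grouping step on PCI `domain:bus:device.function` addresses. The addresses are illustrative, and treating a shared `domain:bus` prefix as "same PCI-E bus" is an assumption of the sketch.

```python
from collections import defaultdict

def group_cards_by_bus(pci_addresses):
    """Group PCI device addresses by their `domain:bus` prefix.

    Cards sharing a prefix are treated as sitting on the same PCI-E bus.
    """
    by_bus = defaultdict(list)
    for addr in pci_addresses:
        domain, bus, _dev_fn = addr.split(":")  # e.g. "0000", "02", "00.0"
        by_bus[f"{domain}:{bus}"].append(addr)
    return dict(by_bus)
```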
A3, the method as in A1, wherein the recording the number of available graphics cards on each PCI-E bus in the graphics card resource scheduling table comprises:
recording the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each PCI-E bus.
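For illustration, the open linked list of the embodiment can be modelled as an ordinary Python list of (bus, card IDs) pairs kept sorted by free-card count; the bus names and card IDs below are invented.

```python
def build_schedule(by_bus):
    """Return [(bus, card_ids), ...] ordered by how many cards each bus has free."""
    return sorted(by_bus.items(), key=lambda entry: len(entry[1]))
```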
A4, the method as in A3, wherein the sorting is in ascending order, and the searching the graphics card resource scheduling table comprises:
traversing the open linked list through a depth-first algorithm to judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
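Because the list is kept in ascending order of free-card count, a simple linear traversal finds the smallest bus that can still serve the job alone, a best-fit choice that preserves the larger all-on-one-bus groups for later jobs. A hedged sketch of this traversal over the list-of-pairs model:

```python
def find_bus(schedule, requested):
    """Index of the first (smallest) bus that can serve the job alone, or -1."""
    for i, (_bus, cards) in enumerate(schedule):
        if len(cards) >= requested:
            return i
    return -1
```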
A5, the method as in A4, wherein when the number of available graphics cards on every PCI-E bus fails to satisfy the number of graphics cards requested by the job, the open linked list is traversed again through a depth-first algorithm, and graphics cards matching the requested number are selected from a plurality of PCI-E buses as the graphics card resources allocated to the job.
A6, the method as in A5, wherein the traversing the open linked list again through a depth-first algorithm and selecting, from the plurality of PCI-E buses, graphics cards matching the requested number as the graphics card resources allocated to the job comprises:
allocating all of the available graphics cards found on the first PCI-E bus to the job, then judging whether the number of available graphics cards on the next PCI-E bus satisfies the number of graphics cards the job still requires; if so, selecting from that PCI-E bus a number of graphics cards matching the remainder as graphics card resources allocated to the job; if not, allocating all of the available graphics cards on that PCI-E bus to the job and judging the next PCI-E bus in turn, until the job's remaining requirement is satisfied.
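The fallback step can be sketched as follows, again on the illustrative list-of-pairs model rather than the embodiment's actual linked-list structures: whole buses are taken greedily in list order, and the allocation finishes on the first bus that covers what remains.

```python
def allocate_across_buses(schedule, requested):
    """Card IDs drawn from several buses in list order; None if too few in total."""
    granted, remaining = [], requested
    for _bus, cards in schedule:
        if len(cards) >= remaining:      # this bus covers what is still needed
            granted.extend(cards[:remaining])
            return granted
        granted.extend(cards)            # take the whole bus and keep going
        remaining -= len(cards)
    return None                          # the cluster lacks enough free cards
```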
A7, the method of A3, wherein the method further comprises:
deleting all available graphics cards allocated to the job from the open linked list, and re-sorting the open linked list;
and/or,
modifying the open linked list according to released graphics card resources, and re-sorting the open linked list.
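Both bookkeeping directions, removing cards after an allocation and returning them on release, can be sketched on the same illustrative model; each operation re-sorts so the ascending-order invariant of the table is restored.

```python
def remove_allocated(schedule, allocated):
    """Drop the allocated card IDs from every bus entry, then re-sort."""
    taken = set(allocated)
    kept = [(bus, [c for c in cards if c not in taken]) for bus, cards in schedule]
    return sorted(kept, key=lambda entry: len(entry[1]))

def release(schedule, bus, freed):
    """Return released card IDs to their bus entry, then re-sort."""
    merged = [(b, cards + freed if b == bus else cards) for b, cards in schedule]
    return sorted(merged, key=lambda entry: len(entry[1]))
```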
The embodiment of the invention also discloses B8, an apparatus for scheduling graphics card resources in a distributed cluster, wherein the apparatus comprises:
a recording unit adapted to acquire graphics card resources in the distributed cluster and record the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
a scheduling unit adapted to receive a submitted job specifying the number of graphics cards it requests, search the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the request, select from that PCI-E bus a number of graphics cards matching the request as the graphics card resources allocated to the job.
B9, the apparatus of B8, wherein
the recording unit is adapted to read, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that computing device.
B10, the apparatus of B8, wherein
the recording unit is adapted to record the IDs of the available graphics cards on each PCI-E bus in the open linked list, sorted by the number of available graphics cards on each PCI-E bus.
B11, the apparatus of B10, wherein
the recording unit sorts the open linked list in ascending order;
the scheduling unit is adapted to traverse the open linked list through a depth-first algorithm and judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
B12, the apparatus of B11, wherein
the scheduling unit is further adapted to traverse the open linked list again through a depth-first algorithm when the number of available graphics cards on every PCI-E bus fails to satisfy the request, and to select graphics cards matching the requested number from a plurality of PCI-E buses as the graphics card resources allocated to the job.
B13, the apparatus of B12, wherein
the scheduling unit is adapted to allocate all of the available graphics cards found on the first PCI-E bus to the job, then judge whether the number of available graphics cards on the next PCI-E bus satisfies the job's remaining requirement; if so, to select from that PCI-E bus a number of graphics cards matching the remainder as graphics card resources allocated to the job; if not, to allocate all of the available graphics cards on that PCI-E bus to the job and judge the next PCI-E bus in turn, until the job's remaining requirement is satisfied.
B14, the apparatus of B10, wherein
the recording unit is adapted to delete all available graphics cards allocated to the job from the open linked list and re-sort the open linked list; and/or to modify the open linked list according to released graphics card resources and re-sort the open linked list.

Claims (14)

1. A method for scheduling graphics card resources in a distributed cluster, wherein the method comprises:
acquiring graphics card resources in the distributed cluster, and recording the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
receiving a submitted job, wherein the job specifies the number of graphics cards it requests;
searching the graphics card resource scheduling table, and when the number of available graphics cards on one PCI-E bus satisfies the number of graphics cards requested by the job, selecting from that PCI-E bus a number of graphics cards matching the request as the graphics card resources allocated to the job;
allocating all of the available graphics cards found on the first PCI-E bus to the job, and judging whether the number of available graphics cards on the next PCI-E bus satisfies the job's remaining requirement; and if so, selecting from that PCI-E bus a number of graphics cards matching the request as graphics card resources allocated to the job.
2. The method of claim 1, wherein the acquiring graphics card resources in the distributed cluster comprises:
reading, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that computing device.
3. The method of claim 1, wherein the recording the number of available graphics cards on each PCI-E bus in the graphics card resource scheduling table comprises:
recording the IDs of the available graphics cards on each PCI-E bus in an open linked list, sorted by the number of available graphics cards on each PCI-E bus.
4. The method of claim 3, wherein the sorting is in ascending order, and the searching the graphics card resource scheduling table comprises:
traversing the open linked list through a depth-first algorithm to judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
5. The method of claim 4, wherein when the number of available graphics cards on every PCI-E bus fails to satisfy the number of graphics cards requested by the job, the open linked list is traversed again through a depth-first algorithm, and graphics cards matching the requested number are selected from a plurality of PCI-E buses as the graphics card resources allocated to the job.
6. The method of claim 5, wherein the traversing the open linked list again through a depth-first algorithm and selecting, from the plurality of PCI-E buses, graphics cards matching the requested number as the graphics card resources allocated to the job comprises:
allocating all of the available graphics cards found on the first PCI-E bus to the job, and judging whether the number of available graphics cards on the next PCI-E bus satisfies the job's remaining requirement; if not, allocating all of the available graphics cards on that PCI-E bus to the job and judging the next PCI-E bus in turn, until the job's remaining requirement is satisfied.
7. The method of claim 3, wherein the method further comprises:
deleting all available graphics cards allocated to the job from the open linked list, and re-sorting the open linked list;
and/or,
modifying the open linked list according to released graphics card resources, and re-sorting the open linked list.
8. An apparatus for scheduling graphics card resources in a distributed cluster, the apparatus comprising:
a recording unit adapted to acquire graphics card resources in the distributed cluster and record the number of available graphics cards on each PCI-E bus in a graphics card resource scheduling table;
a scheduling unit adapted to receive a submitted job specifying the number of graphics cards it requests, search the graphics card resource scheduling table, and, when the number of available graphics cards on one PCI-E bus satisfies the request, select from that PCI-E bus a number of graphics cards matching the request as the graphics card resources allocated to the job;
wherein the scheduling unit is adapted to allocate all of the available graphics cards found on the first PCI-E bus to the job, judge whether the number of available graphics cards on the next PCI-E bus satisfies the job's remaining requirement, and if so, select from that PCI-E bus a number of graphics cards matching the request as graphics card resources allocated to the job.
9. The apparatus of claim 8, wherein
the recording unit is adapted to read, from the PCI-E buses of each computing device deployed in the distributed cluster, the graphics card resources on that computing device.
10. The apparatus of claim 8, wherein
the recording unit is adapted to record the IDs of the available graphics cards on each PCI-E bus in the open linked list, sorted by the number of available graphics cards on each PCI-E bus.
11. The apparatus of claim 10, wherein
the recording unit sorts the open linked list in ascending order;
the scheduling unit is adapted to traverse the open linked list through a depth-first algorithm and judge whether the number of available graphics cards on each PCI-E bus satisfies the number of graphics cards requested by the job.
12. The apparatus of claim 11, wherein
the scheduling unit is further adapted to traverse the open linked list again through a depth-first algorithm when the number of available graphics cards on every PCI-E bus fails to satisfy the request, and to select graphics cards matching the requested number from a plurality of PCI-E buses as the graphics card resources allocated to the job.
13. The apparatus of claim 12, wherein
the scheduling unit is adapted to allocate all of the available graphics cards found on the first PCI-E bus to the job, and to judge whether the number of available graphics cards on the next PCI-E bus satisfies the job's remaining requirement; if not, to allocate all of the available graphics cards on that PCI-E bus to the job and judge the next PCI-E bus in turn, until the job's remaining requirement is satisfied.
14. The apparatus of claim 10, wherein
the recording unit is adapted to delete all available graphics cards allocated to the job from the open linked list and re-sort the open linked list; and/or to modify the open linked list according to released graphics card resources and re-sort the open linked list.
CN201710250265.0A 2017-04-17 2017-04-17 Method and device for scheduling display card resources in distributed cluster Active CN107193650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710250265.0A CN107193650B (en) 2017-04-17 2017-04-17 Method and device for scheduling display card resources in distributed cluster

Publications (2)

Publication Number Publication Date
CN107193650A CN107193650A (en) 2017-09-22
CN107193650B true CN107193650B (en) 2021-01-19


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144578B (en) * 2018-06-28 2021-09-03 中国船舶重工集团公司第七0九研究所 Display card resource allocation method and device based on Loongson computer
CN115129483B (en) * 2022-09-01 2022-12-02 武汉凌久微电子有限公司 Multi-display-card cooperative display method based on display area division

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6421755B1 (en) * 1999-05-26 2002-07-16 Dell Usa, L.P. System resource assignment for a hot inserted device
CN101387993A (en) * 2007-09-14 2009-03-18 凹凸科技(中国)有限公司 Method and system for dynamically collocating resource for equipment in computer system
CN101916209A (en) * 2010-08-06 2010-12-15 华东交通大学 Cluster task resource allocation method for multi-core processor
CN102609978A (en) * 2012-01-13 2012-07-25 中国人民解放军信息工程大学 Method for accelerating cone-beam CT (computerized tomography) image reconstruction by using GPU (graphics processing unit) based on CUDA (compute unified device architecture) architecture
CN102902589A (en) * 2012-08-31 2013-01-30 浪潮电子信息产业股份有限公司 Method for managing and scheduling cluster MIS (Many Integrated Core) job
CN103105895A (en) * 2011-11-15 2013-05-15 辉达公司 Computer system and display cards thereof and method for processing graphs of computer system
CN103248659A (en) * 2012-02-13 2013-08-14 北京华胜天成科技股份有限公司 Method and system for dispatching cloud computed resources
CN104954400A (en) * 2014-03-27 2015-09-30 中国电信股份有限公司 Cloud computing system and realizing method thereof
CN105718316A (en) * 2014-12-01 2016-06-29 中国移动通信集团公司 Job scheduling method and apparatus
CN106557366A (en) * 2015-09-28 2017-04-05 阿里巴巴集团控股有限公司 Task distribution method, apparatus and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7398337B2 (en) * 2005-02-25 2008-07-08 International Business Machines Corporation Association of host translations that are associated to an access control level on a PCI bridge that supports virtualization
US8319782B2 (en) * 2008-07-08 2012-11-27 Dell Products, Lp Systems and methods for providing scalable parallel graphics rendering capability for information handling systems
JP5180729B2 (en) * 2008-08-05 2013-04-10 株式会社日立製作所 Computer system and bus allocation method
US9524138B2 (en) * 2009-12-29 2016-12-20 Nvidia Corporation Load balancing in a system with multi-graphics processors and multi-display systems
US10310879B2 (en) * 2011-10-10 2019-06-04 Nvidia Corporation Paravirtualized virtual GPU




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant