CN108710536A - Multi-level fine-grained virtualized GPU scheduling optimization method - Google Patents
Multi-level fine-grained virtualized GPU scheduling optimization method
- Publication number
- CN108710536A (application number CN201810285080.8A)
- Authority
- CN
- China
- Prior art keywords
- scheduling
- gpu
- ring
- event
- virtual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
Abstract
The invention discloses a multi-level fine-grained virtualized GPU scheduling optimization method that optimizes the scheduling strategy in three ways: scheduling based on time and events, seamless scheduling based on a pipeline, and hybrid scheduling based on both rings and virtual machines. These three strategies respectively exploit three optimization opportunities: the overhead incurred when switching between two virtual machines, the division of a virtual machine's work into multiple stages that can run concurrently, and the ability of multiple virtual machines to work simultaneously on different rings. By modifying the scheduler and the scheduling strategy, the invention greatly reduces switching overhead and supports parallel execution among multiple virtual GPUs, so the performance of multiple virtual GPUs sharing one physical GPU is significantly improved, raising overall performance. The invention increases the utilization of the physical GPU and thereby further improves the performance of the virtual GPUs. In addition, the method ensures that the virtual GPUs still meet their quality-of-service requirements.
Description
Technical field
The present invention relates to the field of GPU virtualization and its task scheduling, and in particular to a multi-level fine-grained virtualized GPU scheduling optimization method. Specifically, the performance of GPU virtualization is improved mainly through an optimized scheduling strategy. By optimizing the GPU scheduling strategy, the original coarse-grained scheduling becomes fine-grained scheduling, making full use of previously unusable time and resources, so that, with the hardware unchanged, the performance of the virtual GPUs and the overall utilization of the GPU are improved.
Background technology
Nowadays, GPU technology is increasingly important in high-performance computing; fields such as AI, deep learning, data analytics, and cloud gaming all require GPUs. GPU cloud services have emerged accordingly: Tencent and Alibaba both offer GPU cloud servers to users as a new computing pattern.
GPU virtualization therefore faces higher requirements from high-performance computing. The current solution is full GPU virtualization. Its advantages are better isolation and security, and it requires no special hardware support. However, in current full GPU virtualization technology, the scheduling granularity is coarse, leaving substantial room for performance improvement. In the present invention, the experimental part is based on GVT-g (Intel Graphics Virtualization Technology for shared vGPU, Intel's GPU virtualization technology), which allows an Intel GPU to be virtualized into multiple virtual GPUs for use by multiple virtual machines. Although the experiments are based on Intel GPUs, the method is still general.
The existing GVT-g scheduler is invoked once every 1 millisecond. Using round-robin scheduling, each invocation selects the virtual machine that may execute during the next time slice; within that slice, the tasks (workloads) of that virtual machine are allowed to execute on the physical GPU. Each virtual machine has quality-of-service requirements of three kinds: cap, weight, and priority; the original scheduling strategy guarantees that these requirements are met. Whenever the timer fires, the scheduler determines whether the current task has completed. If it has, the scheduler dispatches a new task; otherwise the next task is chosen according to the quality-of-service requirements. Each task can only truly execute when dispatched by the scheduler, and execution is serialized overall: no two tasks are ever executed simultaneously.
This scheduling mode is simple and blunt, and convenient for quality-of-service accounting. At the same time, however, its granularity is coarse, and much idle time is wasted.
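The baseline behaviour described above, a fixed 1 ms tick plus round-robin selection of the next virtual machine, can be sketched as a small simulation. This is an illustrative model, not GVT-g code; the class and field names are hypothetical, and workload costs are measured in whole ticks for simplicity.

```python
from collections import deque

class VGpu:
    """A virtual GPU with a queue of pending workloads (hypothetical model)."""
    def __init__(self, name, weight=1):
        self.name = name
        self.weight = weight      # QoS proportion (unused in this minimal sketch)
        self.pending = deque()    # workloads, each a remaining cost in ticks
        self.ticks_used = 0

class RoundRobinScheduler:
    """Time-sliced round-robin: one vGPU owns the physical GPU per 1 ms tick."""
    def __init__(self, vgpus):
        self.vgpus = deque(vgpus)
        self.trace = []

    def tick(self):
        vgpu = self.vgpus[0]
        if vgpu.pending:
            vgpu.pending[0] -= 1          # burn one tick of the head workload
            vgpu.ticks_used += 1
            if vgpu.pending[0] <= 0:
                vgpu.pending.popleft()
            self.trace.append(vgpu.name)
        else:
            self.trace.append("idle")     # slice wasted: nothing to run
        self.vgpus.rotate(-1)             # next vGPU gets the next slice

vm1, vm2 = VGpu("vm1"), VGpu("vm2")
vm1.pending.extend([2])                   # small task: its queue drains early
vm2.pending.extend([3])
sched = RoundRobinScheduler([vm1, vm2])
for _ in range(6):
    sched.tick()
print(sched.trace)                        # an idle slice appears once vm1 drains
```

The `idle` entry in the trace is exactly the wasted slice the patent targets: the scheduler keeps handing 1 ms slices to a virtual machine with nothing to run until the next timer tick rotates past it.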
Invention content
In view of the defects of the prior art, the object of the present invention is to provide a multi-level fine-grained virtualized GPU scheduling optimization method. It consists of three compatible scheduling strategies that exploit previously unused capacity from three angles to improve performance. The three strategies are: scheduling based on time and events, seamless scheduling based on a pipeline, and hybrid scheduling based on rings (ring) and on virtual machines.
The present invention is realized through the following technical solution:
A multi-level fine-grained virtualized GPU scheduling optimization method, characterized by comprising the following steps:
Step S1: add scheduling based on time and events, reducing the overhead of switching between two virtual GPUs;
Step S2: add seamless scheduling based on a pipeline, allowing part of the virtual GPU work to run in parallel and improving efficiency when virtual GPUs work together;
Step S3: add hybrid scheduling based on rings and on virtual machines, allowing different virtual machines to use the physical GPU fully concurrently and raising overall utilization.
In the above technical solution, step S1 comprises the following steps:
Step S101: decouple the scheduling policy framework from the workload scheduler: the scheduling policy framework implements the modified scheduling strategy, while the workload scheduler carries out the scheduling;
Step S102: add a context-complete event, which is triggered when a context completes and is passed to the scheduling policy framework to further trigger the corresponding task scheduling;
Step S103: add a context-submit event, which fires when the workload scheduler receives a context; if the virtual GPU is idle at that moment, the event is handled immediately and the task is executed;
Step S104: modify the scheduling policy framework to support the added events; upon receiving a timer tick or an event, the framework responds, processes it, and submits work to the workload scheduler for execution;
Step S105: modify the quality-of-service calculation in the scheduling policy framework.
In the above technical solution, step S2 comprises the following steps:
Step S201: decompose the scheduler workflow into an audit-and-shadow stage and a scheduling-and-execution stage, where the former is the preparation stage and the latter is the execution stage;
Step S202: split the work submission path so that multiple workloads can be submitted simultaneously, making full use of the advantage of pipelined scheduling;
Step S203: move the original shadow-related code out of task dispatch, separating the code of the different stages; the whole workflow code is thereby divided into two relatively independent parts, audit-and-shadow and scheduling-and-execution, so that different stages can run simultaneously without interfering with each other, improving efficiency;
Step S204: since each virtual GPU has only one shadow context, if a virtual GPU is not the current GPU, only its first workload task is shadowed.
In the above technical solution, step S3 comprises the following steps:
Step S301: introduce scheduling by ring or engine: add ring support and change the minimum scheduling unit from the whole GPU to each ring, so that workloads running on different rings run simultaneously without interfering with each other;
Step S302: modify the workload scheduler: change all related code from single variables to arrays and from per-virtual-machine units to per-ring units; this involves rescheduling and changing the current GPU and ring and the next GPU and ring;
Step S303: modify the workload scheduler: refactor the related code logic, replacing logic that supported only virtual-machine units with new logic in units of rings;
Step S304: modify the scheduling policy framework: change the scheduling data structure from a single instance to an array so that multiple rings can run simultaneously;
Step S305: modify the scheduling policy framework: change timer or event triggering to per-ring support;
Step S306: scheduling by ring requires a consistent global state; use the pointer CRC32 to judge whether the global states of the current virtual machine and the next virtual machine are consistent;
Step S307: each time MMIO is saved or restored, compute the CRC32 value; if consistent, use the per-ring scheduling of steps S301–S305; if inconsistent, use the original per-virtual-machine scheduling;
Step S308: for quality-of-service maintenance, compute each ring separately, redefining the quality-of-service parameters to ensure that per-ring scheduling still maintains correct quality of service;
Step S309: modify the switching between per-virtual-machine and per-ring scheduling to guarantee correct program execution.
Compared with the prior art, the present invention has the following advantageous effects:
The present invention makes full use of idle and otherwise wasted time, improving overall performance without changing the hardware and while maintaining quality of service. For example, after adopting the present invention, a virtual-GPU service provider can obtain more overall performance with the same investment and the same hardware configuration, sell to more users, and earn more revenue. And for a user with a specific target, for instance making a target program reach 60 frames, fewer devices need to be purchased to reach the target, reducing expense.
Description of the drawings
Upon reading the detailed description of non-limiting embodiments with reference to the following drawings, other features, objects, and advantages of the invention will become more apparent:
Fig. 1 is the overall framework of scheduling based on time and events;
Fig. 2 is a schematic comparison of time-based scheduling and scheduling based on time and events;
Fig. 3 is a schematic diagram of seamless scheduling based on a pipeline;
Fig. 4 is a schematic diagram of hybrid scheduling based on rings and on virtual machines;
Fig. 5-1 is a schematic diagram of the 3dmark06 benchmark scores for scheduling based on time and events;
Fig. 5-2 is a schematic diagram of the heaven benchmark scores for scheduling based on time and events;
Fig. 6-1 is a schematic diagram of the 3dmark06 benchmark scores for seamless scheduling based on a pipeline;
Fig. 6-2 is a schematic diagram of the heaven benchmark scores for seamless scheduling based on a pipeline;
Fig. 7-1 is a schematic diagram of the benchmark scores for scheduling based on rings;
Fig. 7-2 shows the benchmark scores for hybrid scheduling based on rings and on virtual machines.
Specific implementation mode
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be pointed out that those of ordinary skill in the art can make several changes and improvements without departing from the inventive concept; these all belong to the protection scope of the present invention.
First, in the existing scheduling method, scheduling is triggered entirely by the timer, but after each task completes there may be idle time; especially when small tasks dominate, this idle time is wasted. The present invention therefore adds a task-completion event that triggers the scheduler early, making full use of this otherwise idle time. Fig. 5-1 is a schematic diagram of the 3dmark06 benchmark scores for scheduling based on time and events, and Fig. 5-2 is a schematic diagram of the heaven benchmark scores; the figures show a considerable performance improvement after applying the present invention. Experiments show that with this scheduling method based on time and events, overall GPU performance can improve by 3.2%–21.5%.
Second, in the existing scheduling method, tasks execute in strict order: the latter task can only run after the former has fully completed. In fact a task can be divided into two stages: the first stage is preparation, and only the second stage is real execution. The two stages invoke different parts of the physical GPU and can therefore be scheduled as a two-stage pipeline. The present invention uses this pipeline-like scheduling method so that the two stages of different tasks can be scheduled simultaneously. Fig. 6-1 shows the 3dmark06 benchmark scores for seamless scheduling based on a pipeline, and Fig. 6-2 is a schematic diagram of the corresponding heaven benchmark scores; the figures show a considerable improvement on some benchmarks after applying this method. Experiments show that with this seamless pipeline-based scheduling, overall GPU performance can improve by 0%–19.7%.
Finally, in the existing scheduling method, tasks are all scheduled with the entire GPU as the unit; even if multiple tasks invoke different rings/engines, for example image rendering and streaming-media encoding/decoding, the latter must wait for the former to complete. The present invention therefore introduces scheduling by ring: if tasks need to invoke different rings, they are allowed to execute concurrently. Because hardware support on the GPU is insufficient, invoking this method requires the virtual machines' operating systems and global states to be consistent; if they are inconsistent, the original method is invoked instead. Fig. 7-1 is a schematic diagram of the benchmark scores for scheduling based on rings, and Fig. 7-2 shows the benchmark scores for hybrid scheduling based on rings and on virtual machines. Experiments show that with this hybrid scheduling method based on rings and on virtual machines, when two tasks invoke different rings, their performance can improve by 34.0% and 70.6% respectively, with little effect on the remaining parts.
A multi-level fine-grained virtualized GPU scheduling optimization method of the present invention is characterized by comprising the following steps:
Step S1: add scheduling based on time and events, reducing the overhead of switching between two virtual GPUs;
Step S2: add seamless scheduling based on a pipeline, allowing part of the virtual GPU work to run in parallel and improving efficiency when virtual GPUs work together;
Step S3: add hybrid scheduling based on rings and on virtual machines, allowing different virtual machines to use the physical GPU fully concurrently and raising overall utilization.
Wherein, Fig. 1 is the overall framework of scheduling based on time and events; its essence is adding the time and event triggers and modifying the framework. Fig. 2 compares time-based scheduling with scheduling based on time and events: the upper part is time-based scheduling, which produces idle time; the lower part is scheduling based on time and events, which makes full use of that idle time. Adding the scheduling based on time and events comprises the following steps:
Step S101: decouple the scheduling policy framework from the workload scheduler: the scheduling policy framework implements the modified scheduling strategy, and the workload scheduler carries out the scheduling. In the original framework the two were intermingled and could not be modified into a customized strategy-and-dispatch design, so they must be separated. After separation, the scheduling policy framework is responsible only for the scheduling strategy, and the workload scheduler only for actually performing the scheduling. This separation of concerns makes the subsequent modifications possible.
Step S102: add a context-complete event, which is triggered when a context completes and is passed to the scheduling policy framework to further trigger the corresponding task scheduling. The original design has only the timer event, triggered once every 1 ms. The present invention adds the context-complete event on top of the original scheduler.
Step S103: add a context-submit event, which fires when the workload scheduler receives a context; if the virtual GPU is idle at that moment, the event is handled immediately and the task is executed. This event allows a task to run promptly whenever the currently working virtual GPU is idle. In the original method, which lacks this event, the workload scheduler waits idle and working time is wasted.
Step S104: modify the scheduling policy framework to support the added events. The original design supported only timer triggering; the newly added event triggering requires support in the scheduling policy framework. After receiving a timer tick or an event, the framework responds, processes it, and submits work to the workload scheduler for execution.
Step S105: modify the quality-of-service calculation in the scheduling policy framework. In the original scheme, maintaining the required quality of service only requires accounting the corresponding scheduling time; because scheduling was purely time-based, the calculation was simple. With event-based scheduling added, the quality-of-service calculation must be redone: not only the time but also the influence of each event must be considered. The present invention redesigns this part of the calculation to ensure that overall quality of service still meets the given requirements.
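As a rough illustration of steps S101–S105, the sketch below contrasts pure timer triggering with the added completion events: with timer-only triggering, a finished workload leaves the engine idle until the next tick, while a context-complete event lets the policy framework dispatch the next workload immediately. The function names and the simplified event model are hypothetical, not the GVT-g implementation.

```python
import math

def simulate(workloads, tick_ms=1.0, use_events=True):
    """Dispatch workloads (durations in ms) on one engine; return the makespan.

    use_events=False models the original design: the policy framework runs
    only on timer expiry, so a finished workload leaves the engine idle until
    the next 1 ms tick.  use_events=True models the added context-complete
    event: the framework reacts at once and the next workload starts with no gap.
    """
    now = 0.0
    for dur in workloads:
        if not use_events:
            now = math.ceil(now / tick_ms) * tick_ms  # wait for the next tick
        now += dur  # context submitted and executed by the workload scheduler
        # a context-complete event would fire here
    return now

small_tasks = [0.3, 0.3, 0.3, 0.3]       # small tasks, where waste is worst
timer_only = simulate(small_tasks, use_events=False)
with_events = simulate(small_tasks, use_events=True)
print(timer_only, with_events)           # the event path removes the idle gaps
```

The gap is largest exactly in the case the description highlights: many small tasks, each leaving most of its 1 ms slice unused under timer-only triggering.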
Fig. 3 is a schematic diagram of the seamless scheduling based on a pipeline: the upper part is the original design, and the lower part is the modified design, which decomposes tasks and lets scheduling complete in a pipelined fashion. The seamless scheduling based on a pipeline comprises the following steps:
Step S201: decompose the scheduler workflow into an audit-and-shadow (audit & shadow) stage and a scheduling-and-execution (scheduling & execution) stage, where the former is the preparation stage and the latter is the execution stage. In the original design the workflow executed strictly in order without any careful decomposition: each task ran from beginning to end before the next task started.
Step S202: split the work submission path so that multiple workloads can be submitted simultaneously, making full use of the advantage of pipelined scheduling. The original design supported only one task submitted at a time; the present invention allows several, which is what makes the pipeline advantage attainable.
Step S203: move the original shadow-related code out of task dispatch, separating the code of the different stages. At this point the original whole workflow code is divided into two relatively independent parts, audit-and-shadow and scheduling-and-execution. The purpose of the split is that different stages can run simultaneously without interfering with each other: when multiple events arrive, the execution stage of the previous event can run at the same time as the preparation stage of the next event, improving efficiency.
Step S204: since each virtual GPU has only one shadow context, if a virtual GPU is not the current GPU, only its first workload task is shadowed. This step ensures that the scheduling method executes correctly in all cases and matches the original design in correctness.
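Steps S201–S204 amount to a classic two-stage pipeline: while workload i executes, workload i+1 can be audited and shadowed. A minimal timing sketch under hypothetical stage costs (not the actual driver code) shows the saving:

```python
def serial_makespan(tasks):
    """Original design: prepare (audit & shadow) then execute, one task at a time."""
    return sum(prep + execute for prep, execute in tasks)

def pipelined_makespan(tasks):
    """Two-stage pipeline: task i+1's preparation overlaps task i's execution."""
    prep_done = 0.0   # time at which the preparation stage frees up
    exec_done = 0.0   # time at which the execution stage frees up
    for prep, execute in tasks:
        prep_done = prep_done + prep                     # audit & shadow
        exec_done = max(exec_done, prep_done) + execute  # schedule & execute
    return exec_done

tasks = [(1.0, 3.0), (1.0, 3.0), (1.0, 3.0)]  # (prepare, execute) per task
print(serial_makespan(tasks))      # all stages strictly in order
print(pipelined_makespan(tasks))   # preparations hidden behind execution
```

With these example costs the pipeline hides two of the three preparation phases behind execution (makespan 10.0 instead of 12.0), consistent with the 0%–19.7% range reported: workloads whose preparation cost is negligible gain nothing, preparation-heavy ones gain the most.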
Fig. 4 is a schematic diagram of the hybrid scheduling based on rings and on virtual machines, comparing the scheduling behaviour of the original design, per-ring scheduling, and hybrid scheduling; the advantage of hybrid scheduling can also be seen from the figure. The hybrid scheduling based on rings (ring) and on virtual machines comprises the following steps:
Step S301: introduce scheduling by ring (engine): add ring support and change the minimum scheduling unit from the whole GPU to each ring, so that workloads running on different rings run simultaneously without interfering with each other. In the original design, virtual GPUs were scheduled in units of whole virtual machines; each virtual machine has its own workloads, which run on different rings. Typically a GPU has 3 or more rings, each handling a different type of task (such as image rendering or streaming-media encoding/decoding).
Step S302: modify the workload scheduler: change all related code from single variables to arrays and from per-virtual-machine units to per-ring units; this involves rescheduling and changing the current GPU and ring and the next GPU and ring.
Step S303: modify the workload scheduler: refactor the related code logic, replacing logic that supported only virtual-machine units with new logic in units of rings.
Step S304: modify the scheduling policy framework: change the scheduling data structure from a single instance to an array so that multiple rings can run simultaneously. This mainly involves the scheduling data structure, the scheduling strategy, and related parts. In the original design a single structure sufficed to complete scheduling.
Step S305: modify the scheduling policy framework: change timer or event triggering to per-ring support. The original design had only timer triggering for the GPU as a whole; the present invention must support both time and event triggering, and each ring must respond individually.
Step S306: scheduling by ring requires a consistent global state; the pointer CRC32 is used to judge whether the global states of the current virtual machine and the next virtual machine are consistent. The CRC32 pointer refers to the global-state information of the virtual GPU; because the system design can retain only one copy of this content, the pointer is used for the judgment.
Step S307: each time MMIO state is saved or restored, the CRC32 value is computed. If the values are consistent, the per-ring scheduling of steps S301–S305 is used; if inconsistent, the original per-virtual-machine scheduling is used. The CRC32 value locates the corresponding global state; when the operating systems currently executing are consistent, that state can be reused; if inconsistent, per-ring scheduling cannot be used and the scheduler must switch back to the original per-virtual-machine scheduling.
Step S308: for quality-of-service maintenance, each ring is computed separately and the quality-of-service parameters are redefined, ensuring that per-ring scheduling also maintains correct quality of service.
Step S309: modify the switching between per-virtual-machine and per-ring scheduling, guaranteeing correct program execution. The present invention completes the switching between the two scheduling strategies.
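The hybrid policy of steps S301–S309 can be caricatured as follows: hash the global state each virtual machine expects; if the hashes match, schedule each ring independently so workloads on different rings overlap, otherwise fall back to per-virtual-machine time slicing. In this sketch `zlib.crc32` over a byte string stands in for the CRC32 computed over the saved/restored MMIO state; all names and the cost model are illustrative, not the driver's code.

```python
import zlib

def state_crc(mmio_bytes):
    """Stand-in for the CRC32 computed when MMIO state is saved or restored."""
    return zlib.crc32(mmio_bytes)

def makespan(vms, per_ring_ok):
    """vms: list of {ring_name: duration} workload maps, durations in ms.

    Per-VM scheduling serializes everything; per-ring scheduling serializes
    only within a ring, so work on different rings overlaps fully."""
    if not per_ring_ok:
        return sum(sum(v.values()) for v in vms)      # whole-GPU time slices
    rings = {}
    for v in vms:
        for ring, dur in v.items():
            rings[ring] = rings.get(ring, 0.0) + dur  # serialize per ring
    return max(rings.values())                        # rings run in parallel

vm_a = {"render": 4.0}   # 3D rendering on the render ring
vm_b = {"vcs": 3.0}      # media encode/decode on the video ring
states = [b"gen9-config-A", b"gen9-config-A"]         # both VMs' global state
consistent = len({state_crc(s) for s in states}) == 1
print(makespan([vm_a, vm_b], per_ring_ok=consistent))
```

When the two virtual machines drive disjoint rings, per-ring scheduling collapses the makespan from the sum of both workloads to the longest single ring; when the state hashes differ, the fallback reproduces the original serialized behaviour, matching step S307.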
The present invention combines these three scheduling methods. The three methods do not conflict and can be applied to the same GPU at the same time, so the overall performance of the GPU can be greatly improved. In the specific experiments, Intel GPUs were used as the test object, but the method of the present invention is general and can also be applied to GPUs made by other manufacturers.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the above particular implementations; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the present invention. In the absence of conflict, the features of the embodiments of the present application may be combined with one another arbitrarily.
Claims (4)
1. A multi-level fine-grained virtualized GPU scheduling optimization method, characterized by comprising the following steps:
Step S1: adding scheduling based on time and events, reducing the overhead of switching between two virtual GPUs;
Step S2: adding seamless scheduling based on a pipeline, allowing part of the virtual GPU work to run in parallel and improving efficiency when virtual GPUs work together;
Step S3: adding hybrid scheduling based on rings and on virtual machines, allowing different virtual machines to use the physical GPU fully concurrently and raising overall utilization.
2. The multi-level fine-grained virtualized GPU scheduling optimization method according to claim 1, characterized in that step S1 comprises the following steps:
Step S101: decoupling the scheduling policy framework from the workload scheduler: the scheduling policy framework implements the modified scheduling strategy, while the workload scheduler carries out the scheduling;
Step S102: adding a context-complete event, which is triggered when a context completes and is passed to the scheduling policy framework to further trigger the corresponding task scheduling;
Step S103: adding a context-submit event, which fires when the workload scheduler receives a context; if the virtual GPU is idle at that moment, the event is handled immediately and the task is executed;
Step S104: modifying the scheduling policy framework to support the added events; upon receiving a timer tick or an event, the framework responds, processes it, and submits work to the workload scheduler for execution;
Step S105: modifying the quality-of-service calculation in the scheduling policy framework.
3. The multi-level fine-grained virtualized GPU scheduling optimization method according to claim 1, characterized in that step S2 comprises the following steps:
Step S201: decomposing the scheduler workflow into an audit-and-shadow stage and a scheduling-and-execution stage, where the former is the preparation stage and the latter is the execution stage;
Step S202: splitting the work submission path so that multiple workloads can be submitted simultaneously, making full use of the advantage of pipelined scheduling;
Step S203: moving the original shadow-related code out of task dispatch, separating the code of the different stages, whereby the whole workflow code is divided into two relatively independent parts, audit-and-shadow and scheduling-and-execution, so that different stages can run simultaneously without interfering with each other, improving efficiency;
Step S204: since each virtual GPU has only one shadow context, if a virtual GPU is not the current GPU, shadowing only its first workload task.
4. a kind of multi-level fine-grained virtualization GPU method for optimizing scheduling according to claim 1, which is characterized in that
Step S3 includes the following steps:
Step S301:Ring or engine scheduling are pressed in introducing, and the support of ring is added, and minimum thread is changed to each from whole
Ring is worked at the same time and is independent of each other so if work operates on different rings;
Step S302:Modification to task dispatcher will be originally with virtual machine by all correlative codes from individually array is changed to
Unit is changed to as unit of ring, is related to rescheduling, is changed current GPU and ring, next GPU and ring;
Step S303:Modification to task dispatcher reconstructs correlative code logic, will a support be with virtual machine in original logic
The logic of unit is changed to the new logic as unit of ring;
Step S304:To dispatching the modification of policy framework, by scheduling data structure from being individually changed to array, while running multiple
Scheduling data structure is changed to array and supports to run while multiple rings by ring;
Step S305:To dispatching the modification of policy framework, time or event triggering are changed to the support to each ring respectively;
Step S306: Scheduling by ring requires a consistent global state; use a CRC32 over the pointers to judge whether the global states of the current virtual machine and the next virtual machine are consistent;
Step S307: Compute the CRC32 value each time MMIO is stored or restored; if the values are consistent, use the per-ring scheduling of steps S301-S305; if they are inconsistent, use the original per-virtual-machine scheduling;
Step S308: For quality-of-service maintenance, account for each ring separately and redefine the quality-of-service parameters, ensuring that per-ring scheduling also maintains correct quality of service;
Step S309: Modify the switching between per-virtual-machine scheduling and per-ring scheduling to ensure that the program runs correctly.
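As an illustration of steps S301-S307, the following hypothetical sketch (all class and function names are assumptions, not the patented implementation) picks a scheduling mode per ring: it computes a CRC32 over a saved MMIO snapshot and uses per-ring scheduling only when the global state of the current and next virtual machine is consistent, otherwise falling back to per-virtual-machine scheduling.

```python
import zlib

class VGpu:
    """Toy stand-in for a virtual GPU holding a snapshot of its MMIO state."""
    def __init__(self, name, mmio):
        self.name = name
        self.mmio = mmio  # bytes: saved/restored MMIO register state

    def mmio_crc32(self):
        # Steps S306/S307: CRC32 recomputed whenever MMIO is stored or restored
        return zlib.crc32(self.mmio)

def pick_next(current, candidate, ring):
    """Use per-ring scheduling only when global state is consistent;
    otherwise fall back to the original per-VM scheduling (step S307)."""
    if current.mmio_crc32() == candidate.mmio_crc32():
        return ("per-ring", ring, candidate.name)
    return ("per-vm", None, candidate.name)

vg_a = VGpu("vm-a", b"\x00" * 16)
vg_b = VGpu("vm-b", b"\x00" * 16)  # same MMIO state -> consistent
vg_c = VGpu("vm-c", b"\xff" * 16)  # different state -> fall back

print(pick_next(vg_a, vg_b, "render"))  # → ('per-ring', 'render', 'vm-b')
print(pick_next(vg_a, vg_c, "render"))  # → ('per-vm', None, 'vm-c')
```

The per-ring branch is what steps S301-S305 enable (one scheduler slot per ring instead of one per GPU); the CRC32 comparison is the consistency gate of steps S306-S307 that makes the fallback safe.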
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810285080.8A CN108710536B (en) | 2018-04-02 | 2018-04-02 | Multilevel fine-grained virtualized GPU (graphics processing Unit) scheduling optimization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810285080.8A CN108710536B (en) | 2018-04-02 | 2018-04-02 | Multilevel fine-grained virtualized GPU (graphics processing Unit) scheduling optimization method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108710536A true CN108710536A (en) | 2018-10-26 |
CN108710536B CN108710536B (en) | 2021-08-06 |
Family
ID=63867079
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810285080.8A Active CN108710536B (en) | 2018-04-02 | 2018-04-02 | Multilevel fine-grained virtualized GPU (graphics processing Unit) scheduling optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108710536B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102662725A (en) * | 2012-03-15 | 2012-09-12 | 中国科学院软件研究所 | Event-driven high concurrent process virtual machine realization method |
CN103336718A (en) * | 2013-07-04 | 2013-10-02 | 北京航空航天大学 | GPU thread scheduling optimization method |
CN106663021A (en) * | 2014-06-26 | 2017-05-10 | 英特尔公司 | Intelligent gpu scheduling in a virtualization environment |
CN104714850A (en) * | 2015-03-02 | 2015-06-17 | 心医国际数字医疗系统(大连)有限公司 | Heterogeneous joint account balance method based on OPENCL |
US20170221173A1 (en) * | 2016-01-28 | 2017-08-03 | Qualcomm Incorporated | Adaptive context switching |
CN107357661A (en) * | 2017-07-12 | 2017-11-17 | 北京航空航天大学 | A kind of fine granularity GPU resource management method for mixed load |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109656714A (en) * | 2018-12-04 | 2019-04-19 | 成都雨云科技有限公司 | A kind of GPU resource dispatching method virtualizing video card |
CN109753134A (en) * | 2018-12-24 | 2019-05-14 | 四川大学 | A kind of GPU inside energy consumption control system and method based on overall situation decoupling |
CN109753134B (en) * | 2018-12-24 | 2022-04-15 | 四川大学 | Global decoupling-based GPU internal energy consumption control system and method |
CN109766189A (en) * | 2019-01-15 | 2019-05-17 | 北京地平线机器人技术研发有限公司 | Colony dispatching method and apparatus |
CN110442389A (en) * | 2019-08-07 | 2019-11-12 | 北京技德系统技术有限公司 | A kind of shared method using GPU of more desktop environments |
CN110442389B (en) * | 2019-08-07 | 2024-01-09 | 北京技德系统技术有限公司 | Method for sharing GPU (graphics processing Unit) in multi-desktop environment |
US11321126B2 (en) | 2020-08-27 | 2022-05-03 | Ricardo Luis Cayssials | Multiprocessor system for facilitating real-time multitasking processing |
CN113274736A (en) * | 2021-07-22 | 2021-08-20 | 北京蔚领时代科技有限公司 | Cloud game resource scheduling method, device, equipment and storage medium |
CN113742085A (en) * | 2021-09-16 | 2021-12-03 | 中国科学院上海高等研究院 | Execution port time channel safety protection system and method based on branch filtering |
CN113742085B (en) * | 2021-09-16 | 2023-09-08 | 中国科学院上海高等研究院 | Execution port time channel safety protection system and method based on branch filtering |
Also Published As
Publication number | Publication date |
---|---|
CN108710536B (en) | 2021-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108710536A (en) | A kind of multi-level fine-grained virtualization GPU method for optimizing scheduling | |
EP3425502B1 (en) | Task scheduling method and device | |
CN106802826A (en) | A kind of method for processing business and device based on thread pool | |
CN103069389B (en) | High-throughput computing method and system in a hybrid computing environment | |
CN101894047B (en) | Kernel virtual machine scheduling policy-based implementation method | |
CN101751289B (en) | Mixed scheduling method of embedded real-time operating system | |
CN103069390B (en) | Method and system for re-scheduling workload in a hybrid computing environment | |
US20040199927A1 (en) | Enhanced runtime hosting | |
US20200012507A1 (en) | Control system for microkernel architecture of industrial server and industrial server comprising the same | |
CN105183698B (en) | A kind of control processing system and method based on multi-core DSP | |
CN104866374A (en) | Multi-task-based discrete event parallel simulation and time synchronization method | |
CN112463709A (en) | Configurable heterogeneous artificial intelligence processor | |
CN103365718A (en) | Thread scheduling method, thread scheduling device and multi-core processor system | |
CN101694633A (en) | Equipment, method and system for dispatching of computer operation | |
Hirales-Carbajal et al. | A grid simulation framework to study advance scheduling strategies for complex workflow applications | |
CN110795254A (en) | Method for processing high-concurrency IO based on PHP | |
CN105550040A (en) | KVM platform based virtual machine CPU resource reservation algorithm | |
CN114637536A (en) | Task processing method, computing coprocessor, chip and computer equipment | |
CN100440153C (en) | Processor | |
CN111258655A (en) | Fusion calculation method and readable storage medium | |
CN103810041A (en) | Parallel computing method capable of supporting dynamic compand | |
CN115794355B (en) | Task processing method, device, terminal equipment and storage medium | |
CN110928659A (en) | Numerical value pool system remote multi-platform access method with self-adaptive function | |
Aggarwal et al. | On the optimality of scheduling dependent mapreduce tasks on heterogeneous machines | |
Lam et al. | Performance guarantee for online deadline scheduling in the presence of overload |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||