CN108733490A - GPU virtualization QoS control system and method based on resource-sharing adaptive configuration - Google Patents

GPU virtualization QoS control system and method based on resource-sharing adaptive configuration Download PDF

Info

Publication number
CN108733490A
Authority
CN
China
Prior art keywords
resource
gpu
qos
module
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810454727.5A
Other languages
Chinese (zh)
Inventor
管海兵 (Guan Haibing)
卢秋旻 (Lu Qiumin)
姚建国 (Yao Jianguo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201810454727.5A priority Critical patent/CN108733490A/en
Publication of CN108733490A publication Critical patent/CN108733490A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention provides a GPU virtualization QoS control system and method based on resource-sharing adaptive configuration. Through a monitoring module, a control module and a scheduling module, the invention shares and allocates GPU resource capacity among all virtualized cloud-computing workloads under QoS constraints. The invention markedly improves the QoS stability of graphics workloads running concurrently on a virtualized cloud platform built on virtualized GPU resources; compared with other conventional QoS control methods, frame-rate stability is clearly improved. At the same time, while keeping all concurrent workloads running stably and smoothly, GPU computing-resource occupancy drops noticeably, so the platform can support significantly more workloads at an acceptable quality-of-service level.

Description

GPU virtualization QoS control system and method based on resource-sharing adaptive configuration
Technical field
The present invention relates to the technical field of virtualized cloud system architecture, and in particular to a GPU virtualization QoS control system and method based on resource-sharing adaptive configuration.
Background technology
At present, cloud system architectures based on virtualization technology are widely used in business, scientific research, education and other fields. This broad applicability rests on the inherent characteristics of the cloud platform architecture, including high concurrency in task processing and flexibility in resource sharing. All of these characteristics in turn rest on the resource scheduling, sharing and isolation features of virtualization technology. However, the types of computing resource to which these features can be maturely applied without performance degradation or hardware limitations are still very limited, and many workloads targeting specific applications or purposes often have additional demands for particular resource types beyond them. For example, workloads involving graphics computation or large-scale floating-point computation need GPU resources to meet their requirements, because a GPU can greatly accelerate such workloads with its highly parallel floating-point computing capability.
However, sharing virtualized GPU resources raises two challenges. First, most of the difficulties cannot be solved without modifying the device driver, yet most practical GPU products provide no open-source driver, hardware specification or communication protocol, which makes adding or changing functions in the GPU operation module practically impossible. The second challenge is how to guarantee quality of service (QoS) while GPU resources are shared. When multiple concurrent workloads share computing resources, performance oscillation usually occurs; a naive scheduling strategy leaves different workloads at different QoS levels while producing unnecessarily high resource occupancy. The present invention therefore needs a strategy that can allocate an accurate amount of resources to each workload according to demand, and schedule all resources smoothly so that every workload reaches the same predefined QoS level.
Several practical GPU virtualization solutions already exist that achieve a fully virtualized architecture with acceptable performance by emulating commands while passing data accesses through directly. But such architectures still have significant problems, especially in resource sharing and isolation. For example, in its native configuration, Intel's GPU virtualization solution uses a timer-based scheduler to handle context-switch scheduling. In this mechanism, all virtual GPU contexts are linked together as a circular queue, and a scheduling signal is triggered at a fixed time interval, invoking the corresponding function. During scheduling, the pending contexts in the circular queue are traversed and selected in round-robin fashion; that is, the scheduling process always selects the context object immediately after the current one. A context-switch signal is then triggered, and the kernel driver receives it and switches the current context to the selected target.
A problem common to the design of most native schedulers is that all virtual machines, whether busy or idle, are switched into the active state in turn with equal opportunity, with no stable resource allocation or isolation. That is, it is impossible to statically configure a virtual machine's resource share or to transfer surplus resources to a heavyweight workload, let alone to dynamically adjust resource occupancy according to differing resource demands so as to keep QoS levels stable. As a result, when multiple workloads run simultaneously, lightweight workloads always waste resources while heavyweight workloads perform poorly. A further problem of native systems is the lack of adaptivity: even if the scheduler could be configured to suit different workloads' resource demands, it is impossible to continuously monitor runtime state and keep modifying the configuration by hand. Such systems therefore cannot cope with QoS fluctuations caused by changing resource demands, and current system designs include no adaptive mechanism for this class of problem.
Invention content
The object of the present invention is to address the above deficiencies and defects of the prior art; the present invention provides a GPU virtualization QoS control system and method based on resource-sharing adaptive configuration.
The present invention is realized according to following technical scheme:
A GPU virtualization QoS control system based on resource-sharing adaptive configuration, built on a virtualized cloud architecture based on the Xen platform, characterized by comprising: a monitoring module, a control module and a scheduling module, the three modules being implemented in and inserted into different layers and components of the Xen platform. The monitoring module collects runtime data recorded through open interfaces in the guest virtual machines and in the physical machine's kernel; the control module receives the aggregated data and, according to runtime state, computes adjustments to resource allocation that approach the QoS target; the scheduling module obtains the adjustment instructions submitted by the control module and then, under the adjusted resource-capacity limits, schedules context switching among the guest virtual machines, so that GPU resource capacity is shared and allocated among all virtualized cloud-computing workloads under QoS constraints.
In the above technical scheme, the scheduling module is contained within the kernel driver module and directly manipulates all virtual GPU devices to implement time-slot allocation and context switching.
In the above technical scheme, an interface is added between the kernel driver module and user space to receive GPU capacity configurations. Through this interface, the scheduling process in the kernel responds to requests from the control module to change the GPU capacity limits. Upon receiving a request, the scheduling process need not respond to the configuration change immediately; the capacity parameters are updated only after each fixed time interval, balancing response speed against runtime overhead.
In the above technical scheme, the monitoring module is responsible for monitoring all running virtual machines, recording all status data and aggregating it as the control module's data input, which includes workload FPS and GPU utilization.
In the above technical scheme, the control module contains a closed-loop control structure, which establishes the link between the QoS feedback data obtained from the monitoring module and the resource-capacity configuration adjustments submitted to the scheduling module.
A GPU virtualization QoS guarantee method based on resource-sharing adaptive configuration of the present invention, implemented with the above control system, is characterized by comprising the following steps:
Step S1: On the basis of the GPU virtualization framework, dynamically configure the GPU resource capacity of each virtual machine through the scheduling module, then adjust operation according to the context-switch scheduling among all virtual machines;
Step S2: Configure a closed-loop control structure that detects and guarantees the QoS level of all workloads on the cloud computing platform providing virtualized GPU resources;
Step S3: According to the QoS feedback received through the monitoring function, the control structure computes the gap between the current QoS state and the preconfigured QoS target, and uses it as a reference to attempt to dynamically adjust the resource-capacity configuration of each virtual machine containing a workload, thereby achieving QoS regulation.
Compared with prior art, the present invention has following advantageous effect:
The present invention markedly improves the QoS stability of graphics workloads running concurrently on a virtualized cloud platform built on virtualized GPU resources; compared with other conventional QoS control methods, frame-rate stability is clearly improved. At the same time, while keeping all concurrent workloads running stably and smoothly, GPU computing-resource occupancy drops noticeably, so the platform can support significantly more workloads at an acceptable quality-of-service level.
Description of the drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of the non-limiting embodiments with reference to the following drawings:
Fig. 1 is a schematic diagram of the overall internal structure of the system platform of the present invention;
Fig. 2 is a schematic diagram of how the per-second frame rate of the workload glmark2 changes over time under different strategies in the specific embodiment;
Fig. 3 is a schematic diagram of how the per-second frame rate of the workload plot3d changes over time under different strategies in the specific embodiment;
Fig. 4 is a schematic diagram of the average GPU occupancy of each workload and of the system overall under different strategies in the specific embodiment.
Specific implementation mode
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be pointed out that, for those of ordinary skill in the art, several changes and improvements can also be made without departing from the inventive concept; these all belong to the protection scope of the present invention.
Fig. 1 is a schematic diagram of the overall internal structure of the system platform of the present invention. As shown in Fig. 1, a GPU virtualization QoS control system based on resource-sharing adaptive configuration of the present invention is built on a virtualized cloud architecture based on the Xen platform and comprises: a monitoring module, a control module and a scheduling module, the three modules being implemented in and inserted into different layers and components of the Xen platform. The monitoring module collects runtime data recorded through open interfaces in the guest virtual machines and in the physical machine's kernel; the control module receives the aggregated data and, according to runtime state, computes adjustments to resource allocation that approach the QoS target; the scheduling module obtains the adjustment instructions submitted by the control module and then, under the adjusted resource-capacity limits, schedules context switching among the guest virtual machines, so that GPU resource capacity is shared and allocated among all virtualized cloud-computing workloads under QoS constraints.
The scheduling module is contained within the kernel driver module; it directly manipulates all virtual GPU devices to implement time-slot allocation and context switching. Since these are all kernel behaviors, they can only be completed inside kernel processes; the GPU driver module loaded into the Linux kernel is modified to realize the desired functions.
An interface is added between the kernel driver module and user space to receive GPU capacity configurations. Through this interface, the scheduling process in the kernel responds to requests from the control module to change the GPU capacity limits. Upon receiving a request, the scheduling process need not respond to the configuration change immediately; the capacity parameters are updated only after each fixed time interval, so as to balance response speed against runtime overhead.
The task of the resource-scheduling mechanism is then to convert the GPU capacity allocation into GPU context-switch scheduling among all virtual machines. As proposed in the algorithm modified by the present invention, the SchedulingTimer process triggers a context-switch event, activating the corresponding NextContextSelect once per ScheduleInterval. At the start of each RestoreInterval, the "countdown value" of each virtual machine is restored according to its allocated capacity; this value represents the length of active time allocated to that virtual machine in the next RestoreInterval. In NextContextSelect, which chooses the context-switch target, the active time of the current virtual machine is subtracted from its countdown value; the pending queue formed by the contexts is then traversed, and the first object with a non-zero countdown value is selected as the context-switch target.
The pseudocode flow of the scheduling algorithm is as follows:
1. Dispatch timer (SchedulingTimer): { input: resource-capacity vector, countdown-value vector, all virtual machine objects }
1) initialize count to 0;
2) loop start:
3) if: count mod RestoreInterval equals 0:
4) for each n among all virtual machine objects: set countdown(n) = RestoreInterval * resource-capacity(n);
5) end if;
6) trigger a context-switch event;
7) increase count by 1;
8) wait for a ScheduleInterval;
9) loop end;
2. Context-switch procedure (NextContextSelect): { input: countdown-value vector, current context, active time; output: target context }
1) set countdown(current context) = countdown(current context) - active time;
2) set the current traversal target to the item after the current context;
3) loop while the current traversal target is not equal to the current context:
4) if: countdown(current traversal target) is greater than 0:
5) return the current traversal target as the target context;
6) end if;
7) set the current traversal target to the item after the current traversal target;
8) loop end;
9) return the current traversal target as the target context;
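The scheduling flow above can be sketched in user-space Python. The patent's implementation lives in the kernel driver; the function and variable names here are illustrative, and capacities are taken as fractional shares of a RestoreInterval.

```python
RESTORE_INTERVAL = 10  # scheduling ticks per replenish period (illustrative)

def replenish(countdown, capacities):
    """Step 1.4): refill each VM's countdown value from its capacity share."""
    for vm, cap in capacities.items():
        countdown[vm] = RESTORE_INTERVAL * cap

def next_context(countdown, vms, current, active_time):
    """NextContextSelect: charge the current VM for its active time, then
    pick the first VM after it (circular order) with a positive countdown;
    fall back to the current VM if every other countdown is exhausted."""
    countdown[current] -= active_time
    start = vms.index(current)
    for i in range(1, len(vms)):
        candidate = vms[(start + i) % len(vms)]
        if countdown[candidate] > 0:
            return candidate
    return current
```

Replenishing once per RestoreInterval while switching once per ScheduleInterval is what turns plain round-robin into a capacity-weighted rotation: a VM that has burned through its share simply stops being selectable until the next replenish.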
In the implementation, the present invention uses per-mille (parts per thousand) values as the format of the resource-capacity interface: since the Linux kernel forbids floating-point arithmetic, capacity parameters can only be transferred as integers, and per-mille capacity values give better precision. For the same reason, the remainder produced when computing the usable time length from the capacity share within a time interval is stored and applied in the next time interval. This preserves the sensitivity of the resource-capacity configuration; without this method, small changes in the capacity value might be ignored by the scheduling process.
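A minimal sketch of the integer-only slot computation this paragraph describes, assuming capacity in per-mille and interval lengths in microseconds (both units, and the function name, are assumptions for illustration):

```python
def slot_length(interval_us, capacity_permille, carry):
    """Integer-only arithmetic: return the usable active time for this
    interval plus the division remainder, which is carried into the next
    interval so that tiny capacity changes are never rounded away."""
    total = interval_us * capacity_permille + carry
    return total // 1000, total % 1000
```

Over many intervals the carried remainder makes the granted time converge on average to exactly interval * capacity / 1000, which is the sensitivity guarantee the text refers to.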
The monitoring module of the present invention is responsible for monitoring all running virtual machines, recording all status data and aggregating it as the control module's data input, which includes workload FPS and GPU utilization. Within the whole system, this module serves as the platform's interface to the workloads. To record FPS, the present invention modifies the guest virtual machines. Since submitting and rendering graphics images are both functions of the graphics library, the moment a submission call is triggered determines precisely when a new frame is output. The present invention uses the Linux LD_PRELOAD technique to inject monitoring stubs that override the canonical functions of the library; all workloads that depend on the graphics library then call the versions modified by this invention. When a graphics workload runs in a guest virtual machine, the modified submission functions (glFlush, SwapBuffers) record the time points at which they are triggered and transfer the data to the monitoring module in the management domain Domain0. In addition, the present invention can also measure GPU utilization, a device-specific metric that standard virtual machine monitors cannot provide; by modifying the kernel driver, idle and busy time is counted for each virtual GPU to support this function.
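The frame-rate computation the monitor performs on those recorded time points can be sketched as a sliding-window count. The one-second window, millisecond units and function name are illustrative assumptions, not details from the patent:

```python
def fps_from_timestamps(stamps_ms, window_ms=1000):
    """FPS over the most recent window, given the integer-millisecond time
    points (ascending) at which frame submissions were intercepted."""
    if not stamps_ms:
        return 0.0
    newest = stamps_ms[-1]
    frames = sum(1 for t in stamps_ms if newest - t < window_ms)
    return frames * 1000 / window_ms
```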
The control module contains a closed-loop control structure, which establishes the link between the QoS feedback data obtained from the monitoring module and the resource-capacity configuration adjustments submitted to the scheduling module. Although this module contains no function that manipulates GPU devices or operates guest virtual machines, it can still be regarded as the top-level manager of all resource scheduling and sharing behavior.
Since the QoS target level of the present invention is a predefined constant, the input signal of the control loop is a constant value, denoted FPSref in the expressions below, with FPSref = 30. As feedback on workload operation, the present invention collects and computes the current FPS output, denoted FPSout. The error distance Efps between the reference value and the current output value is then:
Efps = FPSref - FPSout (1)
Following the structure shown in Fig. 1, the running system inside the control loop can be treated as a black box, with FPSout as its output and the GPU resource capacity Cap as its input. Inside this black box, the control interface receives the capacity configuration that affects workload operation, and the FPS output is fed back as the runtime result. Intuitively, the present invention concludes that higher resource capacity leads to higher FPS, although various disturbances are far from negligible.
The present invention can thus build this control loop, in which the basis of the control strategy is the conventional PID controller widely used in the field of system control. The controller consists of three independent parts, namely the proportional, integral and derivative parts, with coefficients denoted Kp, Ki and Kd respectively. The relationship between the capacity configuration value Cap and the FPS error distance Efps can then be derived:
Cap = Kp * Efps + Ki * Σt Efps(t) + Kd * ΔEfps (2)
Obviously, formula (2) represents the classical PID method for computing the GPU resource capacity value from the error in the recorded FPS data. This control method performs well enough in the dynamic resource sharing of a computing system. The problem, however, is that the correctness of a classical PID controller rests on the assumptions of linearity and time invariance, and its convergence is guaranteed only under these two properties. The linearity property means there should be a linear relationship between the capacity input and the FPS output; the time-invariance property means that whenever the same capacity value is input, the system's FPS value should also be the same, independent of time. But these two assumptions hold, roughly, only within a limited region. The present invention therefore describes its modifications to the classical method, which overcome these limitations to obtain better performance.
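A discrete form of the controller in equation (2) can be sketched as follows. The gain values and class layout are illustrative assumptions; only the 30 FPS target comes from the text.

```python
class CapacityPID:
    """Maps the FPS error Efps = FPSref - FPSout to a GPU capacity
    setting via proportional, integral and derivative terms."""

    def __init__(self, kp, ki, kd, fps_ref=30.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.fps_ref = fps_ref
        self.integral = 0.0    # running sum of Efps(t)
        self.prev_error = 0.0

    def update(self, fps_out):
        error = self.fps_ref - fps_out       # Efps
        self.integral += error               # Σt Efps(t)
        derivative = error - self.prev_error # ΔEfps
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```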
The first property can be secured in an ordinary way. According to evaluation experiments, if the capacity value is restricted so that the corresponding FPS does not leave an appropriate region around the preset QoS target of the present invention, the relationship between the resource-capacity input and the workload QoS output remains essentially linear. Operating states that would produce nonlinearity are therefore abnormal and can be ignored in normal operation.
Unlike the first property, the other property, time invariance, is harder to guarantee. After all, the relationship between the capacity variable and the FPS value never stays exactly constant, because the time consumed to render a single frame is always changing. The present invention can, however, approximately treat the system as time-invariant unless, for example, a scene switch causes a significant change in computational complexity. Time invariance can thus be assumed approximately over short periods, and the present invention handles abrupt changes in the input-output relationship by dynamically and adaptively adjusting the aforementioned coefficients of the classical PID controller.
For example, if a graphics workload starts rendering a more complex scene in which each frame needs more GPU computing resources, then only a more significant change in the resource-capacity input can produce the same FPS output variation as before. Given the linearity that the black-box system still possesses, it follows that if the variation range of the FPS output is to be kept stable, the variation range of the resource-capacity input should be inversely proportional to the slope of the black-box system's input-output linear relationship. That is, the coefficients of the PID controller should be multiplied by the reciprocal of that ratio, to compensate for the change in the black box's internal characteristics.
That is, assume the following proportional relationship exists in the black-box system:
k = Capavg / FPSavg (3)
In this expression, Capavg and FPSavg denote the average capacity value and the average FPS value over a longer period. Because Capavg and FPSavg are two averages over the same period, the parameter k can be regarded as the workload's average per-frame resource demand over that period. The present invention then modifies the PID coefficients as:
Kp = k * Kp0, Ki = k * Ki0, Kd = k * Kd0 (4)
According to the above equations, the present invention turns the coefficients of the PID controller into dynamic values. The original base coefficients Kp0, Ki0 and Kd0 configure the control loop as reference quantities, and the coefficient k, i.e. the ratio of the average capacity value to the average FPS value, is then used to adjust the variation range of the resource-capacity configuration output by the controller. This modification effectively cancels out the variation of the input-output relationship inside the black-box system: in the control loop, when the workload becomes heavier, the adaptive dynamic configuration of the control coefficients widens the resource-capacity variation to balance the contraction in the FPS variation.
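The adaptive rescaling can be sketched as below, assuming (as the surrounding text indicates) that the factor k is the ratio of average capacity to average FPS over a recent observation window; the window handling and names are illustrative:

```python
def adapt_gains(kp0, ki0, kd0, cap_history, fps_history):
    """Scale the base PID gains by k = Capavg / FPSavg, the workload's
    average per-frame resource demand over the observation window."""
    cap_avg = sum(cap_history) / len(cap_history)
    fps_avg = sum(fps_history) / len(fps_history)
    k = cap_avg / fps_avg
    return kp0 * k, ki0 * k, kd0 * k
```

When a heavier scene doubles the capacity needed per frame, k doubles as well, so the controller's output swings widen by the same factor and the FPS response stays in the same range.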
A GPU virtualization QoS guarantee method based on resource-sharing adaptive configuration of the present invention, implemented with the above control system, comprises the following steps:
Step S1: On the basis of the GPU virtualization framework, dynamically configure the GPU resource capacity of each virtual machine through the scheduling module, then adjust operation according to the context-switch scheduling among all virtual machines. In view of the specific demands of graphics workloads, the scheduling strategy designed by the present invention guarantees their smooth execution, avoiding the throughput-first style of time-sharing and batch-processing systems.
Step S2: Configure a closed-loop control structure that detects and guarantees the QoS level of all workloads on the cloud computing platform providing virtualized GPU resources. In view of the characteristics of graphics workloads, the present invention uses frames per second (FPS) as the QoS metric. To this end, the control structure uses the QoS feedback received through the monitoring function to compute the gap between the current QoS state and the preconfigured QoS target, and uses it as a reference to attempt to dynamically adjust the resource-capacity configuration of each virtual machine containing a workload, thereby achieving QoS regulation.
Step S3: According to the QoS feedback received through the monitoring function, the control structure computes the gap between the current QoS state and the preconfigured QoS target, and uses it as a reference to attempt to dynamically adjust the resource-capacity configuration of each virtual machine containing a workload, thereby achieving QoS regulation.
The control method of the present invention can, based on the monitored runtime data of the controlled target, dynamically and adaptively adjust the controller parameters, improving the stability of the controlled target and the convergence speed of runtime fluctuations. In addition, the control method of the present invention takes account of the controlled object's demand for smooth operation, reducing fluctuations in the controlled target's operation by raising the time-slot allocation frequency.
Meanwhile in order to ensure the Stability and veracity of system control loop, in control strategy, the present invention changes extension Traditional PID control method.Coefficient in controller is dynamic setting, adapts to the virtual machine run time behaviour number collected According to.By this method, the present invention can mitigate since workload controlled characteristic changes caused QoS guarantee difficulty.
Through the combined effect of the above three improvements, the effect obtained by the present invention is: the QoS stability of graphics workloads running concurrently on a virtualized cloud platform built on virtualized GPU resources is markedly improved, and frame-rate stability is clearly better than under other conventional QoS control methods. Meanwhile, while keeping all concurrent workloads running stably and smoothly, GPU computing-resource occupancy drops noticeably, by up to 25.85%, so the platform can support significantly more workloads at an acceptable quality-of-service level.
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the accompanying drawings and a specific embodiment.
A specific embodiment of the invention is given hereinafter, wherein:
In the present embodiment, the configuration of the operating platform is determined as follows. The system hardware is:
(1) platform:Intel NUC Kit NUC5i5MYHE
(2)CPU:Intel Core i5-5300U 2.30GHz
(3) memory RAM:16GB
(4) video card:Intel HD Graphics 5500
The software environment is set as:
(1) host platform:Xen 4.3.0
(2) kernel:Linux 4.3.0
(3) operating system: Ubuntu 14.04 LTS
(4) guest virtual machine vCPU quantity:2
(5) guest virtual machine memory:2GB
The present invention uses three virtual machines, each running a different benchmark graphics load: glxgears (a lightweight load with stable resource demand), glmark2 (a load whose resource demand varies over time) and plot3d (a heavyweight load with cyclically varying resource demand). The QoS-guarantee-oriented resource capacity allocation strategy of the present invention is compared against conventional strategies as a control group. These strategies include equal resource division, proportional allocation, and threshold control. In the control group, apart from the strategy applied, the rest of the system configuration is identical. The target QoS level is 30 FPS.
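The three baseline strategies of the control group can be summarized in code. This is a rough sketch under the assumption of a normalized total GPU capacity of 1.0; the exact parameters of the comparison strategies (demand weights, threshold step) are not given in the text and are illustrative.

```python
def equal_share(vms):
    """Equal resource division: every VM gets the same slice."""
    return {vm: 1.0 / len(vms) for vm in vms}

def proportional(demands):
    """Proportional allocation: slices follow declared demand weights."""
    total = sum(demands.values())
    return {vm: d / total for vm, d in demands.items()}

def threshold_control(fps, caps, target=30.0, step=0.05):
    """Threshold control: bump a VM's capacity only when its FPS
    falls below the target; otherwise leave it unchanged."""
    return {vm: caps[vm] + step if fps[vm] < target else caps[vm]
            for vm in caps}

# Illustrative demand weights for the three benchmark loads.
shares = proportional({"glxgears": 1, "glmark2": 3, "plot3d": 4})
```

None of these baselines close the loop on measured QoS the way the adaptive strategy does, which is what the comparison below is designed to expose.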
As can be seen from Fig. 2(a), Fig. 2(b), Fig. 3(a) and Fig. 3(b), the allocation strategy of the present invention (gQoS) achieves the highest service quality stability for every type of workload compared with the control strategies, and the aggregate resource occupancy of the system is clearly optimized. The average FPS of glmark2 and plot3d is 29.55 and 29.85 respectively, with the lowest standard deviations among all strategies, 2.80 and 1.51. Under the other three strategies, the FPS of glmark2 is 73.74, 38.96 and 38.19, with standard deviations of 5.57, 7.43 and 10.25; the FPS of plot3d is 19.74, 36.24 and 30.93, with standard deviations of 2.61, 4.79 and 6.37. According to Fig. 4, the aggregate resource occupancy declines by 9.05%, 25.85% and 13.04% respectively compared with the control strategies.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the above particular implementations; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substantive content of the present invention. In the absence of conflict, the embodiments of this application and the features in the embodiments may be combined with one another arbitrarily.

Claims (7)

1. A control system for GPU virtualization QoS based on resource-sharing adaptive configuration, the control system being built on a virtualized cloud architecture based on the Xen platform, characterized by comprising: a monitoring module, a control module and a scheduler module, the three modules being implemented in and inserted into different levels and components of the Xen platform respectively; the monitoring module collects runtime data recorded by open interfaces on the guest virtual machines and the physical machine kernel; the control module receives the aggregated data and, according to the runtime state, computes an adjustment to the resource allocation that approaches the QoS target; the scheduler module obtains the adjustment instruction submitted by the control module and then, under the adjusted resource capacity limit, performs context-switching scheduling of the guest virtual machines, so that GPU resource capacity is shared and distributed among all virtualized cloud computing workloads under the QoS constraints.
2. The control system for GPU virtualization QoS based on resource-sharing adaptive configuration according to claim 1, characterized in that the scheduler module is contained in the kernel driver module, directly manipulates all virtual GPU devices, and implements time-slot allocation and context switching.
3. The control system for GPU virtualization QoS based on resource-sharing adaptive configuration according to claim 2, characterized in that an interface is added between the kernel driver module and user space to receive GPU capacity configurations; through this interface, the scheduling process in the kernel responds to requests from the control module to change the GPU capacity limits; upon receiving a request, the scheduling process need not respond to the configuration change immediately, and the capacity parameters are instead updated after a fixed time interval each time, so as to balance response speed against system overhead.
4. The control system for GPU virtualization QoS based on resource-sharing adaptive configuration according to claim 1, characterized in that the monitoring module is responsible for monitoring all running virtual machines, recording all status data and aggregating it as the data input of the control module, the data input including workload FPS and GPU utilization.
5. The control system for GPU virtualization QoS based on resource-sharing adaptive configuration according to claim 1, characterized in that the control module comprises a closed-loop control structure, which establishes the link between the QoS feedback data obtained from the monitoring module and the resource capacity configuration adjustments submitted to the scheduler module.
6. A control method for GPU virtualization QoS based on resource-sharing adaptive configuration, implemented on the control system according to claim 1, characterized by comprising the following steps:
Step S1: on the basis of the GPU virtualization architecture, a scheduler module dynamically configures the GPU resource capacity of each virtual machine, and then performs operational adjustment through context-switching scheduling among all virtual machines;
Step S2: a closed-loop control structure is configured, which detects and guarantees the QoS level of all workloads on the cloud computing platform that provides virtualized GPU resources;
Step S3: the control structure computes, from the QoS feedback received through the monitoring function of the monitoring module, the gap between the current QoS state and the preconfigured QoS target, and uses that gap as a reference to dynamically adjust the resource capacity configuration of each virtual machine containing a workload, thereby achieving the purpose of adjusting the QoS.
7. The control method for GPU virtualization QoS based on resource-sharing adaptive configuration according to claim 6, characterized in that the monitoring module is responsible for monitoring all running virtual machines, recording all status data and aggregating it as the data input of the control module, the data input including workload FPS and GPU utilization.
CN201810454727.5A 2018-05-14 2018-05-14 A kind of GPU vitualization QoS control system and method based on resource-sharing adaptive configuration Pending CN108733490A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810454727.5A CN108733490A (en) 2018-05-14 2018-05-14 A kind of GPU vitualization QoS control system and method based on resource-sharing adaptive configuration


Publications (1)

Publication Number Publication Date
CN108733490A true CN108733490A (en) 2018-11-02

Family

ID=63937368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810454727.5A Pending CN108733490A (en) 2018-05-14 2018-05-14 A kind of GPU vitualization QoS control system and method based on resource-sharing adaptive configuration

Country Status (1)

Country Link
CN (1) CN108733490A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309748A (en) * 2013-06-19 2013-09-18 上海交通大学 Adaptive scheduling host system and scheduling method of GPU virtual resources in cloud game
CN104216783A (en) * 2014-08-20 2014-12-17 上海交通大学 Method for automatically managing and controlling virtual GPU (Graphics Processing Unit) resource in cloud gaming
CN104598292A (en) * 2014-12-15 2015-05-06 中山大学 Adaptive streaming adaptation and resource optimization method applied to cloud-game system
US20150124608A1 (en) * 2013-11-05 2015-05-07 International Business Machines Corporation Adaptive Scheduling of Data Flows in Data Center Networks for Efficient Resource Utilization
CN107404523A (en) * 2017-07-21 2017-11-28 中国石油大学(华东) Cloud platform adaptive resource dispatches system and method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHAO ZHANG ET AL.: "vGASAP:Adaptive Scheduling Algorithm of Virtualized GPU Resource in Cloud Gaming", 《IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109445565A (en) * 2018-11-08 2019-03-08 北京航空航天大学 A kind of GPU QoS guarantee method exclusive and reserved based on stream multiple processor cores
CN110413391A (en) * 2019-07-24 2019-11-05 上海交通大学 Deep learning task service method for ensuring quality and system based on container cluster
CN110442389A (en) * 2019-08-07 2019-11-12 北京技德系统技术有限公司 A kind of shared method using GPU of more desktop environments
CN110442389B (en) * 2019-08-07 2024-01-09 北京技德系统技术有限公司 Method for sharing GPU (graphics processing Unit) in multi-desktop environment
CN112667356A (en) * 2020-12-30 2021-04-16 上海交通大学 NVMe storage virtualization method and system with predictable time delay
CN114138423A (en) * 2021-12-13 2022-03-04 上海交通大学 Virtualization construction system and method based on domestic GPU (graphics processing Unit) display card
CN114138423B (en) * 2021-12-13 2024-05-28 上海交通大学 Virtualized construction system and method based on domestic GPU graphics card
CN118349366A (en) * 2024-06-18 2024-07-16 北京象帝先计算技术有限公司 Resource scheduler, method, graphics processing system, electronic component, and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181102