CN105577834A - Cloud data center two-level bandwidth allocation method and system with predictable performance - Google Patents


Publication number
CN105577834A
CN105577834A (application CN201610083948.7A, filed 2016)
Authority
CN
China
Prior art keywords
bandwidth
virtual machine
tenant
request
guaranteed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610083948.7A
Other languages
Chinese (zh)
Other versions
CN105577834B (en)
Inventor
杨家海
俞荟
王会
翁建平
梁子
孙晓晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201610083948.7A priority Critical patent/CN105577834B/en
Publication of CN105577834A publication Critical patent/CN105577834A/en
Application granted granted Critical
Publication of CN105577834B publication Critical patent/CN105577834B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 — Network arrangements or protocols for supporting network services or applications
    • H04L67/01 — Protocols
    • H04L67/10 — Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 — Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 — Arrangements for program control, e.g. control units
    • G06F9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 — Multiprogramming arrangements
    • G06F9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 — Partitioning or combining of resources
    • G06F9/5077 — Logical partitioning of resources; Management or configuration of virtualized resources
    • G06F9/5083 — Techniques for rebalancing the load in a distributed system

Abstract

The invention provides a cloud data center two-level bandwidth allocation method and system with predictable performance. The method comprises the following steps: at the cloud tenant level, performing bandwidth-guarantee optimization on virtual machine requests containing network demands submitted by tenants, according to the fine-grained virtual cluster (FGVC) network abstraction model; allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm; and at the application level, adopting the E-F runtime mechanism to fairly distribute the corresponding bandwidth resources between guaranteed and non-guaranteed tenants. The method and system solve the problems of existing bandwidth allocation methods: the network requests of tenants cannot be expressed completely, bandwidth resources are wasted at the demand level, the requirements of tenants are not fully considered, and unused bandwidth resources cannot be fairly distributed between guaranteed and non-guaranteed tenants.

Description

Cloud data center two-level bandwidth allocation method and system with predictable performance
Technical field
The present invention relates to the field of computer technology, and in particular to a cloud data center two-level bandwidth allocation method and system with predictable performance.
Background technology
With the development of cloud computing technology, its simple pay-as-you-go model and low deployment and maintenance costs have led more and more enterprises to migrate their applications onto cloud platforms. Cost and performance are the issues cloud tenants care about most. However, the network resources of existing commercial cloud platforms are shared between tenants in a best-effort manner. For example, Amazon EC2 guarantees a tenant's demands for CPU, memory and disk, but cannot provide a guarantee of network bandwidth. This gives the performance and cost of the services offered by cloud providers a high degree of uncertainty, so the lack of bandwidth guarantees produces a series of problems and hinders enterprises from migrating applications to public clouds. At the same time, it makes the applications running on the virtual machines that tenants apply for unpredictable: because the available network bandwidth is inconsistent across runs, the running time of the same cloud application differs, and accordingly, under existing pricing models, the cost of using the virtual machines also differs. These effects are mainly caused by the inconsistency between the performance cloud tenants expect and the performance they actually obtain.
Although realizing network resource management with bandwidth guarantees on shared network infrastructure is an inherently hard problem, a great deal of work in academia and industry has been proposed to solve it. Current schemes, however, have significant limitations: 1) existing coarse-grained network abstraction models cannot completely express tenants' network demands, and waste part of the bandwidth resources at the demand level; 2) virtual machine placement algorithms do not fully consider demands from the tenant's perspective; 3) current runtime mechanisms cannot achieve fair allocation of unused bandwidth between guaranteed tenants and non-guaranteed tenants.
Summary of the invention
To address the defects in the prior art, the invention provides a cloud data center two-level bandwidth allocation method and system with predictable performance, solving the problems of existing bandwidth allocation methods: the network demands of tenants cannot be expressed completely, bandwidth resources are wasted at the demand level, the demands of tenants are not fully considered, and unused bandwidth resources cannot be fairly allocated between guaranteed tenants and non-guaranteed tenants.
To solve the above problems, in a first aspect, the invention provides a cloud data center two-level bandwidth allocation method with predictable performance, comprising:
at the cloud tenant level, performing bandwidth-guarantee optimization, according to the fine-grained virtual cluster (FGVC) network abstraction model, on the virtual machine requests containing network demands that tenants submit;
and allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm;
at the application level, adopting the E-F runtime mechanism to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants.
Further, performing bandwidth-guarantee optimization on the virtual machine requests containing network demands that tenants submit, according to the FGVC network abstraction model, comprises: determining the remaining bandwidth of each virtual machine from the virtual machine request, dividing the remaining bandwidth of the virtual machine equally among the unassigned links connected to it, and taking the minimum of the bandwidth guarantees offered by its two endpoints as the bandwidth guarantee of each link between virtual machines.
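As a rough illustration, the optimization step above can be sketched as follows. This is a hypothetical sketch, not the patent's implementation; all function and variable names are assumed.

```python
# Hypothetical sketch of the FGVC bandwidth-guarantee optimization: each VM
# splits its remaining (uncommitted) bandwidth equally among the links whose
# guarantees the tenant left unspecified, and each such link receives the
# minimum of the two endpoint offers.

def optimize_guarantees(vm_bandwidth, fixed_links, open_links):
    """vm_bandwidth: {vm: total guaranteed bandwidth (Mb/s)}
    fixed_links: {(u, v): bandwidth} links the tenant already set
    open_links: set of (u, v) links with no bandwidth specified."""
    # Remaining bandwidth of each VM = total guarantee minus what its
    # already-fixed links consume.
    remaining = dict(vm_bandwidth)
    for (u, v), bw in fixed_links.items():
        remaining[u] -= bw
        remaining[v] -= bw

    # Count how many unspecified links touch each VM.
    open_degree = {}
    for (u, v) in open_links:
        open_degree[u] = open_degree.get(u, 0) + 1
        open_degree[v] = open_degree.get(v, 0) + 1

    # Each endpoint offers an equal share; the link gets the smaller offer.
    result = dict(fixed_links)
    for (u, v) in open_links:
        offer_u = remaining[u] / open_degree[u]
        offer_v = remaining[v] / open_degree[v]
        result[(u, v)] = min(offer_u, offer_v)
    return result
```

This matches the worked example later in the description: a VM D with 30 Mb/s remaining and two undetermined links offers 15 Mb/s on D → E; if E offers 40 Mb/s, the link guarantee becomes min(15, 40) = 15 Mb/s.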
Further, allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm comprises:
allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to any one of the following three preset allocation algorithms:
First preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch or a core-layer switch. The first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines the tenant requests; the second condition is that the bandwidth under the subtree satisfies the bandwidth demand of the virtual machine request;
Second preset allocation algorithm: search from the physical machine layer to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch or a core-layer switch. The first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines the tenant requests; the second condition is that the remaining bandwidth under the subtree is the minimum while still satisfying the tenant's bandwidth demand;
Third preset allocation algorithm: search from the physical machine layer to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch or a core-layer switch. The first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines the tenant requests; the second condition is that the total remaining bandwidth in the subtree is the maximum while satisfying the bandwidth demand in the virtual machine request.
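The three preset algorithms above mirror the classical first-fit, best-fit and worst-fit strategies, applied to subtrees of the data center tree. A minimal sketch, with assumed names and data structures (not from the patent):

```python
# Illustrative sketch of the three preset allocation algorithms: scan
# candidate subtrees bottom-up (physical machines first, then ToR,
# aggregation, core) and pick one whose free VM slots and remaining
# bandwidth cover the request.

from dataclasses import dataclass

@dataclass
class Subtree:
    name: str
    level: int          # 0 = physical machine, 1 = ToR, 2 = aggregation, ...
    free_slots: int     # idle virtual machine slots in the subtree
    remaining_bw: float # unreserved bandwidth under this subtree (Mb/s)

def feasible(s, n_vms, bw_demand):
    return s.free_slots >= n_vms and s.remaining_bw >= bw_demand

def place(subtrees, n_vms, bw_demand, policy="first"):
    """Return the chosen subtree, or None if no subtree fits."""
    candidates = [s for s in sorted(subtrees, key=lambda s: s.level)
                  if feasible(s, n_vms, bw_demand)]
    if not candidates:
        return None
    if policy == "first":   # first subtree, bottom-up, that fits
        return candidates[0]
    if policy == "best":    # fits, with the least remaining bandwidth
        return min(candidates, key=lambda s: s.remaining_bw)
    if policy == "worst":   # fits, with the most remaining bandwidth
        return max(candidates, key=lambda s: s.remaining_bw)
    raise ValueError(policy)
```

The best-fit variant packs requests tightly (leaving minimal bandwidth fragments), while the worst-fit variant keeps large bandwidth reserves per subtree for later requests.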
Further, before allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm, the method also comprises: sorting the optimized virtual machine requests according to a predetermined ordering algorithm.
Further, sorting the optimized virtual machine requests according to a predetermined ordering algorithm comprises:
sorting the optimized virtual machine requests according to any one of the following five predetermined ordering algorithms:
First-come-first-served (FCFS): requests are executed in arrival order;
Smallest-revenue-rate-first (SRRF): the task with the smallest profit rate executes first;
Largest-revenue-rate-first (LRRF): the task with the largest profit rate executes first;
Maximum-weight-first (two variants): a weight is introduced into the queue scheduling mechanism; the weight of each virtual machine request is its profit rate either multiplied by its waiting time or divided by its waiting time.
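The five ordering policies reduce to five sort keys over the request queue. A hypothetical sketch (function and field names assumed, not from the patent):

```python
# Illustrative sort keys for the five ordering policies: FCFS,
# smallest/largest profit-rate first, and the two maximum-weight-first
# variants (profit rate * waiting time, or profit rate / waiting time).
# Negated keys express "largest first" with an ascending sort.

def order_requests(requests, policy, now=0.0):
    """requests: list of dicts with 'arrival' time and 'profit_rate'."""
    wait = lambda r: max(now - r["arrival"], 1e-9)  # avoid division by zero
    keys = {
        "fcfs": lambda r: r["arrival"],             # earliest arrival first
        "srrf": lambda r: r["profit_rate"],         # smallest rate first
        "lrrf": lambda r: -r["profit_rate"],        # largest rate first
        "w_mul": lambda r: -r["profit_rate"] * wait(r),
        "w_div": lambda r: -r["profit_rate"] / wait(r),
    }
    return sorted(requests, key=keys[policy])
```

The multiplied variant ages requests upward as they wait, so long-waiting requests eventually overtake higher-rate newcomers; the divided variant does the opposite, favoring fresh high-rate requests.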
Further, fairly distributing physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants with the E-F runtime mechanism comprises: when unallocated bandwidth resources exist, setting bandwidth guarantees for non-guaranteed tenants at server granularity.
In a second aspect, the invention also provides a cloud data center two-level bandwidth allocation system with predictable performance, comprising: a cloud tenant level processing module and an application level processing module;
the cloud tenant level processing module is used for performing bandwidth-guarantee optimization, according to the fine-grained virtual cluster (FGVC) network abstraction model, on the virtual machine requests containing network demands that tenants submit;
and for allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm;
the application level processing module is used for adopting the E-F runtime mechanism to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants.
Further, the cloud tenant level processing module is specifically used for: determining the remaining bandwidth of each virtual machine from the virtual machine request, dividing the remaining bandwidth of the virtual machine equally among the unassigned links connected to it, and taking the minimum of the bandwidth guarantees offered by its two endpoints as the bandwidth guarantee of each link between virtual machines.
Further, the cloud tenant level processing module is specifically used for allocating physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to any one of the three preset allocation algorithms described above.
Further, the application level processing module is specifically used for: when unallocated bandwidth resources exist, setting bandwidth guarantees for non-guaranteed tenants at server granularity.
As can be seen from the above technical solution, the cloud data center two-level bandwidth allocation method and system with predictable performance of the present invention adopt the fine-grained virtual cluster (FGVC) network abstraction model at the cloud tenant level to perform bandwidth-guarantee optimization on the virtual machine requests containing network demands that tenants submit, and allocate physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm; at the application level, they adopt the E-F runtime mechanism to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants. They thus solve the problems of existing bandwidth allocation methods: the network demands of tenants cannot be expressed completely, bandwidth resources are wasted at the demand level, the demands of tenants are not fully considered, and unused bandwidth resources cannot be fairly allocated between guaranteed tenants and non-guaranteed tenants.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative work.
Fig. 1 is a flow chart of the cloud data center two-level bandwidth allocation method with predictable performance provided by the first embodiment of the invention;
Fig. 2 is a flow chart of the cloud data center two-level bandwidth allocation method with predictable performance provided by the second embodiment of the invention;
Fig. 3 is a schematic diagram of the execution steps of the SpongeNet system;
Fig. 4a is a schematic diagram of the hose model;
Fig. 4b is a schematic diagram of a virtual machine connection graph (VCG);
Fig. 5a is the VCG of a MapReduce virtual machine cluster;
Fig. 5b is the VCG of a typical three-tier network application;
Fig. 6a is a schematic diagram of an FGVC request;
Fig. 6b is a schematic diagram of the FGVC model optimization process;
Fig. 7 is a schematic diagram of two-stage virtual machine placement;
Fig. 8a is a schematic diagram of one E-F runtime mechanism on a physical link;
Fig. 8b is a schematic diagram of another E-F runtime mechanism on a physical link;
Fig. 9 is a schematic diagram of the E-F runtime mechanism in a tree topology;
Fig. 10 is a schematic diagram of the bandwidth savings of different applications;
Fig. 11 is a schematic diagram of the completion times of different combinations;
Fig. 12a is a schematic diagram of profit rate;
Fig. 12b is a schematic diagram of virtual machine utilization (per second);
Fig. 13a is a schematic diagram of completion time for different average bandwidth sizes;
Fig. 13b is a schematic diagram of completion time under different platform loads;
Fig. 14a and Fig. 14b are schematic diagrams of waiting time and queue length for different combinations;
Fig. 15a and Fig. 15b are schematic diagrams of the waiting time and queue length of each request over time;
Fig. 16a and Fig. 16b are schematic diagrams of average waiting time under different average bandwidth sizes and platform loads;
Fig. 17 is a schematic diagram of the experimental platform topology;
Fig. 18a and Fig. 18b are schematic diagrams of many-to-one experiment scenarios;
Fig. 19a and Fig. 19b are two many-to-one scenarios;
Fig. 20 is a schematic diagram of the CDF of the MapReduce experiment;
Fig. 21 is a structural diagram of the cloud data center two-level bandwidth allocation system with predictable performance provided by the third embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
Fig. 1 shows the flow chart of the cloud data center two-level bandwidth allocation method with predictable performance provided by the first embodiment of the invention. As shown in Fig. 1, the method of this embodiment comprises the following steps:
Step 101: at the cloud tenant level, perform bandwidth-guarantee optimization, according to the fine-grained virtual cluster (FGVC) network abstraction model, on the virtual machine requests containing network demands that tenants submit;
and allocate physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm.
Step 102: at the application level, adopt the E-F runtime mechanism to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants.
In this embodiment, the FGVC network abstraction model is adopted at the cloud tenant level to perform bandwidth-guarantee optimization on the virtual machine requests containing network demands that tenants submit, and physical machines and the corresponding bandwidth resources are allocated to the optimized requests according to a preset allocation algorithm; at the application level, the E-F runtime mechanism is adopted to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants, thereby solving the problems of existing bandwidth allocation methods described above.
An abstraction model is the most important link in realizing bandwidth guarantees in a cloud data center. It must achieve four goals: 1) tenants can specify their network demands simply, intuitively and accurately, and the model can scale elastically for tenants, such as Netflix, that have a large number of virtual machines; 2) the abstraction model can help tenants optimize their network demands and improve resource utilization; 3) the model can be applied to many scenarios; 4) the demands it expresses can be converted simply into deployment schemes on a real cloud platform. Many related studies have built network abstraction models from different perspectives. Oktopus and SecondNet provide static reservation of network resources, but the former only realizes bandwidth guarantees at the virtual machine level and therefore cannot describe tenants' bandwidth requests precisely and flexibly, while the latter realizes bandwidth guarantees between pairs of virtual machines but requires the tenant to specify the bandwidth of every inter-VM link, which makes the model hard to deploy. Proteus, CloudMirror and ElasticSwitch propose different runtime mechanisms in which bandwidth guarantees can be adjusted dynamically according to the real network demands of the upper-layer applications. However, the adaptability of Proteus and CloudMirror is confined to particular application backgrounds: Proteus only applies to MapReduce, and CloudMirror only suits three-tier network applications. ElasticSwitch achieves bandwidth savings and applies to most applications, but it does not address the virtual machine allocation algorithm.
To make bandwidth guarantees concrete, the network demands specified with the network abstraction model need to be mapped onto the real low-layer network topology by a virtual machine placement algorithm. Much prior work considers the allocation of CPU, memory and disk but ignores the allocation of network resources. In addition, most existing research focuses on optimizing the allocation phase of virtual machine placement without considering the queueing phase, and these studies mainly consider system throughput from the cloud provider's perspective while ignoring the response time that tenants care about.
Finally, a runtime mechanism must not only guarantee the tenants' bandwidth demands expressed in the network abstraction model, but also adjust the bandwidth allocation among tenants efficiently and fairly according to the real network demands of the upper-layer applications at runtime. Previous solutions fail to realize all of: 1) bandwidth guarantees; 2) high utilization; 3) fairness between tenants; 4) practicality. None of the earlier runtime mechanisms achieves all of these goals at once: Oktopus, SecondNet and Silo do not realize efficient utilization of bandwidth, and Seawall cannot provide bandwidth guarantees. FairCloud requires special switch support. ElasticSwitch ignores the fair allocation of bandwidth between tenants with bandwidth guarantees and tenants without them.
To solve the above problems, a second embodiment gives a concrete implementation of the cloud data center two-level bandwidth allocation method with predictable performance provided by this invention. Referring to Fig. 2, it comprises the following steps:
Step 201: at the cloud tenant level, perform bandwidth-guarantee optimization, according to the fine-grained virtual cluster (FGVC) network abstraction model, on the virtual machine requests containing network demands that tenants submit.
Step 202: sort the optimized virtual machine requests according to a predetermined ordering algorithm.
Step 203: allocate physical machines and the corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm.
Step 204: at the application level, adopt the E-F runtime mechanism to fairly distribute physical machines and the corresponding bandwidth resources to guaranteed tenants and non-guaranteed tenants.
This embodiment proposes SpongeNet, a two-level bandwidth allocation method for cloud data centers that gives cloud-based applications predictable performance. At the tenant level, SpongeNet combines the fine-grained virtual cluster (FGVC) network abstraction model with an effective two-stage virtual machine placement algorithm to allocate bandwidth according to tenants' demands. The FGVC abstraction model can express tenants' demands precisely and flexibly, effectively saves bandwidth, is widely applicable and is easy to deploy. The two-stage virtual machine placement algorithm, through its two stages of optimized queueing and placement, effectively maps the FGVC model onto the real topology and can satisfy different optimization goals. At the application level, SpongeNet's E-F (Efficiency-Fairness) runtime mechanism uses Open vSwitch to realize the bandwidth guarantees for tenants and to adjust the bandwidth allocation according to the application demands of tenants. The E-F runtime mechanism realizes the bandwidth guarantees of the FGVC model and fairly distributes unused bandwidth resources to guaranteed and non-guaranteed tenants.
This embodiment uses a large amount of real data and real data center topologies for simulation experiments, demonstrating the superiority of the FGVC model and the two-stage virtual machine placement algorithm. This embodiment also builds a simple SpongeNet prototype on a cloud data center testbed composed of 7 servers, showing that the E-F runtime mechanism performs better than other existing schemes.
To prove the validity of the FGVC model, the two-stage virtual machine placement algorithm and the E-F runtime mechanism, this embodiment adds a flexible interface that lets tenants express network demands simply and clearly. Meanwhile, this embodiment develops a new scheduler on the cloud platform, based on the open-source API of OpenStack Havana, to realize the two-stage virtual machine placement on the cloud platform. Finally, by rewriting part of the original code of Open vSwitch, this embodiment realizes the functions of the E-F runtime mechanism. SpongeNet adjusts bandwidth guarantees according to tenants' network demands, thereby realizing predictable performance; at the same time, it lets virtual machines use the unused bandwidth resources in the cloud platform, thereby improving the utilization of cloud platform network resources.
The goal of the SpongeNet system is to make the network performance of cloud tenants' virtual machines stable while letting the cloud service provider obtain as much economic benefit as possible. The whole process is realized in three steps (Fig. 3):
First step: a cloud tenant uses the FGVC model to specify a virtual machine request containing network demands. By optimizing the bandwidth guarantees from the virtual machine level down to the inter-VM level, the tenant's request becomes more reasonable, because this saves more bandwidth for requests that arrive later.
Second step: the preprocessed requests obtained in the first step are queued by five different algorithms according to different goals. SpongeNet then schedules the ordered virtual machine request queue according to a dispatching algorithm; based on the ideas of three classical memory allocation algorithms, this patent realizes three different dispatching algorithms.
Third step: the E-F runtime mechanism realizes the inter-VM bandwidth guarantees set in the FGVC model by controlling rate limiters, and then uses weighted TCP-like bandwidth probing to fairly allocate the unused bandwidth resources, achieving efficient utilization. The following sections elaborate these three parts.
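The efficiency-fairness idea of the third step can be pictured with a much-simplified static model. The actual mechanism probes available bandwidth dynamically, TCP-like, inside Open vSwitch; the sketch below (all names assumed) only shows the target allocation it converges toward: each flow keeps its guarantee, and spare link capacity is split among all flows in proportion to weights, so non-guaranteed tenants also receive a fair share.

```python
# Minimal static sketch of the E-F sharing target: guaranteed rate plus a
# weighted share of the link's spare capacity. This is an illustration of
# the idea, not the patent's Open vSwitch implementation.

def ef_share(link_capacity, guarantees, weights):
    """guarantees: {flow: guaranteed rate (Mb/s)}, absent means 0.
    weights: {flow: weight} over all active flows (guaranteed or not).
    Returns the rate-limiter setting for each flow."""
    spare = link_capacity - sum(guarantees.values())
    total_w = sum(weights.values())
    return {f: guarantees.get(f, 0.0) + spare * weights[f] / total_w
            for f in weights}
```

For a 100 Mb/s link with guaranteed flows of 40 and 20 Mb/s and one non-guaranteed flow, equal weights give each flow an extra 40/3 Mb/s of the spare capacity, so the link stays fully utilized while guarantees are preserved.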
The FGVC model is built on the basis of the traditional hose model. The hose model is inspired by the fact that most enterprises today run distributed applications in dedicated physical clusters composed of one physical router and multiple compute nodes. The hose model therefore uses the following abstraction (Fig. 4a): multiple virtual machines of the same tenant are connected to one virtual router by bidirectional links of limited capacity. In the hose model, a tenant's request <N, B> means that he or she needs N virtual machines each with bandwidth B, so the bandwidth of the virtual router is N*B.
The hose model can be applied to most applications, but a tenant can only express bandwidth demands at the virtual machine level: whatever the virtual topology, each virtual machine can only send and receive at rate B. This embodiment therefore designs the FGVC model: on the basis of the hose model it adds one parameter, the virtual machine connection graph (VCG), which lets the tenant express the bandwidth of the link between any pair of virtual machines (Fig. 4b). That is, tenants can express their network demands completely. The VCG is a graph in which each node represents a virtual machine with its bandwidth guarantee, and each edge represents the communication between two virtual machines and its bandwidth guarantee.
In today's cloud data centers, large numbers of batch applications and interactive applications (such as Hadoop and three-tier network applications) run in virtualized environments. These applications have fixed communication patterns, so tenants can usually know clearly how to set the bandwidth demands between virtual machines.
A Hadoop MapReduce job comprises two stages. In the map stage, the input data is cut into multiple blocks and distributed to the mappers for processing. In the reduce stage, the MapReduce framework shuffles the intermediate data produced by the map stage and delivers it to the reducers. The FGVC model can express the communication details between virtual machines clearly. For example, a tenant applies for a MapReduce virtual machine cluster of 8 nodes, comprising 1 master node and 7 slave nodes; among the 7 slave nodes there are 7 mappers and 2 reducers (usually the reducers are a subset of the mapper nodes). With FGVC the tenant can state this request. In Fig. 5a, all 8 nodes with bandwidth guarantees are connected to a virtual router, as in the hose model. Through the VCG, the communication pattern and bandwidth guarantees between virtual machines can be learned. As described above, in the map stage the master node delivers the split data to the mappers (as shown in Fig. 5a, virtual machine A is connected to virtual machines B to H). In the reduce stage, the reducers collect the data processed by the mappers: virtual machines B, C and D deliver data to virtual machine F, and virtual machines G and H deliver data to virtual machine E.
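The MapReduce request above can be pictured as a concrete VCG data structure. In the sketch below all bandwidth values are assumed for illustration (the patent does not give them); it also checks the natural consistency condition that the edge guarantees touching a VM do not exceed that VM's hose-level guarantee.

```python
# Hypothetical VCG encoding of the 8-node MapReduce cluster: master A feeds
# mappers B..H in the map stage; mappers B,C,D send to reducer F and G,H
# send to reducer E in the reduce stage. All Mb/s values are assumed.

vcg = {
    "nodes": {vm: 60 for vm in "ABCDEFGH"},    # per-VM hose guarantee
    "edges": {                                  # (src, dst): edge guarantee
        **{("A", vm): 8 for vm in "BCDEFGH"},   # map stage: master -> mappers
        ("B", "F"): 15, ("C", "F"): 15, ("D", "F"): 15,  # reduce stage
        ("G", "E"): 15, ("H", "E"): 15,
    },
}

def vm_total_demand(vcg, vm):
    """Sum of guarantees on edges touching vm; for a consistent request it
    must not exceed the VM's hose guarantee."""
    return sum(bw for (u, v), bw in vcg["edges"].items() if vm in (u, v))
```

With these numbers, master A commits 7 × 8 = 56 Mb/s and reducer F commits 8 + 3 × 15 = 53 Mb/s, both within the 60 Mb/s hose guarantee.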
Equally, the present embodiment all FGVC is also applicable to interactive application, as 3 layer network application.Fig. 5 b illustrates the VCG of the three-layer network application typically having front end layer (A-C), Business Logic (D-F) and database layer (G-I).
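A VCG of the kind described above can be encoded, for instance, as a symmetric adjacency matrix. The following is a minimal sketch, not the patent's implementation; the bandwidth values for the three-tier example are assumptions chosen for illustration:

```python
def make_vcg(n):
    """Create an n-VM VCG with no communication edges.
    G[i][j] > 0  : guaranteed bandwidth (Mbps) between VM i and VM j
    G[i][j] == 0 : the VMs communicate, but the demand is left unspecified
    G[i][j] == -1: the VMs never communicate"""
    return [[-1] * n for _ in range(n)]

def connect(G, i, j, bw=0):
    """Mark VMs i and j as communicating; bw=0 leaves the demand unset."""
    G[i][j] = G[j][i] = bw

# Three-tier application of Fig. 5b: front end A-C (0-2), business logic
# D-F (3-5), database G-I (6-8); the 20/10 Mbps guarantees are assumed.
G = make_vcg(9)
for fe in (0, 1, 2):
    for lg in (3, 4, 5):
        connect(G, fe, lg, 20)   # front-end tier talks to logic tier
for lg in (3, 4, 5):
    for db in (6, 7, 8):
        connect(G, lg, db, 10)   # logic tier talks to database tier
```

Front-end and database VMs share no edge, which is exactly the bandwidth the hose model cannot avoid reserving.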
The optimization of bandwidth demands is explained below.
The FGVC model gives tenants the ability to express inter-VM communication patterns and bandwidth guarantees. In some cases, a tenant setting the network demands does not know the concrete value for the link between virtual machine X and virtual machine Y, and so cannot set that link at first. As shown in Fig. 6a, a tenant requests 5 virtual machines of 60 Mb/s each, and sets the communication pattern and the bandwidth guarantees of some of its links. Before mapping this model onto the underlying infrastructure, the present embodiment must first derive initial bandwidth guarantees for the remaining inter-VM links (such as A to B, C to D, etc.), because rate enforcement operates at VM-to-VM granularity; guaranteeing bandwidth only at the VM level is not sufficient.
A reasonable and simple solution is to divide a virtual machine's remaining bandwidth equally among its unassigned links; this allocation method has proved very effective in ElasticSwitch. In the example above, the remaining bandwidth of virtual machine D is 30 Mb/s, and two of its links are still undetermined. On the virtual link D → E, virtual machine D therefore offers a guarantee of 15 Mb/s by equal division; similarly, virtual machine E can offer a guarantee of 40 Mb/s. In the simplified FGVC method, the present embodiment takes the minimum of the guarantees offered by the two endpoints as the inter-VM bandwidth guarantee B_{D→E}:
B_{D→E} = min(B_D^{D→E}, B_E^{E→D})
The final value of B_{D→E} is thus 15 Mbps. For virtual machine E, let Q_E be the set of all virtual machines directly connected to E; the real network demand of E is then 35 Mbps, so E saves 25 Mbps of bandwidth, while the bandwidth guarantee of virtual machine D is fully used. The inter-VM bandwidth guarantees and link guarantees finally obtained by the optimization model are shown in Fig. 6b. In total, the optimization saves the tenant's request 120 Mbps of bandwidth.
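The min-of-endpoints rule above can be checked with the Fig. 6a numbers (a one-line sketch; the function name is illustrative):

```python
def link_guarantee(offer_a, offer_b):
    """FGVC simplification: a link's guarantee is the minimum of the
    guarantees its two endpoint VMs can offer on that link."""
    return min(offer_a, offer_b)

# Fig. 6a example: VM D splits its 30 Mb/s residual equally over its
# 2 unset links -> 15 Mb/s each; VM E offers 40 Mb/s on the D-E link.
offer_d = 30 / 2
offer_e = 40
assert link_guarantee(offer_d, offer_e) == 15
```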
The bandwidth-guarantee optimization algorithm is as follows:
For a tenant's virtual machine request r:<N, B, G>, perform the following operations:
Step (1). The method requires the following parameters:
G_ij: the link state between virtual machine i and virtual machine j. A positive value is the guaranteed bandwidth of the link between i and j; 0 means the two communicate but the bandwidth demand is unspecified; -1 means they do not communicate.
B_i: the bandwidth demand of virtual machine i specified by the tenant.
R_i: the remaining bandwidth of virtual machine i.
U_i: the number of links of virtual machine i with still-unassigned bandwidth.
Step (2). Initialize a counter i = 1; as the loop advances, increment i = i + 1 on each pass. While i is not greater than N, perform the following steps for each i:
Step (2.1). For each virtual machine i, initialize its remaining bandwidth R_i according to matrix G as the tenant-specified bandwidth B_i minus the bandwidth already assigned, that is:
R_i = B_i − Σ_{j | G_ij > 0} G_ij
Step (2.2). According to matrix G, initialize the number of links with unassigned bandwidth, that is:
U_i = Σ_{j | G_ij = 0} 1
Step (3). While links with unassigned bandwidth remain in matrix G, i.e. some G_ij = 0, perform the following steps:
Step (3.1). Among all virtual machines, find a virtual machine i with U_i > 0 that has the minimum value of R_i / U_i.
Step (3.2). Find the first j satisfying G_ij = 0, where i is the machine obtained in step (3.1).
Step (3.3). Update G_ji = G_ij = R_i / U_i.
Step (3.4). Update R_i and R_j: R_i −= R_i / U_i, R_j −= R_i / U_i (both decremented by the value assigned in step (3.3)).
Step (3.5). Update U_i and U_j: U_i = U_i − 1, U_j = U_j − 1.
Step (4). Initialize a counter i = 1; increment i = i + 1 on each pass. While i is not greater than N, perform:
Step (4.1). Update B_i = B_i − R_i.
Step (5). Return r:<N, B, G>.
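The steps above can be sketched in Python (a minimal sketch: the matrix conventions follow the parameter list in step (1), and the function name and the 3-VM test request are illustrative, not from the patent):

```python
def optimize_request(B, G):
    """Bandwidth-guarantee optimization over a VCG adjacency matrix G
    (bw>0 fixed, 0 connected-but-unset, -1 not connected) and per-VM
    guarantees B. Mutates G, returns the optimized (B, G)."""
    n = len(B)
    # Step (2): residual bandwidth and count of unset links per VM
    R = [B[i] - sum(g for g in G[i] if g > 0) for i in range(n)]
    U = [G[i].count(0) for i in range(n)]

    # Step (3): repeatedly settle the VM with the smallest fair share
    while any(U[i] > 0 for i in range(n)):
        i = min((k for k in range(n) if U[k] > 0), key=lambda k: R[k] / U[k])
        j = G[i].index(0)        # first neighbour with an unset link
        v = R[i] / U[i]          # equal split of i's residual bandwidth
        G[i][j] = G[j][i] = v
        R[i] -= v
        R[j] -= v
        U[i] -= 1
        U[j] -= 1

    # Step (4): the optimized guarantee is what the VM's links actually use
    B = [B[i] - R[i] for i in range(n)]
    return B, G

# 3 VMs of 60 Mb/s each; the 0-1 link is fixed at 45, the other two unset
B2, G2 = optimize_request([60, 60, 60],
                          [[-1, 45, 0], [45, -1, 0], [0, 0, -1]])
```

In this small example both unset links settle at 15 Mb/s, and VM 2's guarantee shrinks from 60 to 30 Mb/s, saving 30 Mb/s.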
The implementation of the two-phase virtual machine placement method is given below.
After the optimization process above, SpongeNet next maps the optimized virtual machine demand onto the actual physical platform through an effective and efficient virtual machine placement method. As shown in Fig. 7, the present embodiment divides this step into two phases: a sorting phase, in which tenant requests are ordered by a specific algorithm, and an allocation phase, in which the sorted request queue is allocated in turn by a specific allocation algorithm.
Optimal virtual machine placement has been proved NP-hard, and First-Fit is currently the most common heuristic for allocating virtual machines in the allocation phase. On the basis of the FCFS (first-come-first-served) policy, the present embodiment proposes 4 additional sorting policies, and on the basis of the First-Fit policy it proposes two additional allocation policies. By combining the policies of the two phases, one can find which combination is the best solution for the cloud service provider's demands and which is the best for the tenants' demands.
A. Queuing algorithms of the first phase
When virtual machine requests arrive faster than the platform can respond, unprocessed requests pile up in the scheduling queue and may overload the platform. A suitable queue-ordering policy is then needed to speed up the execution of virtual machine requests and reduce the growth rate of the request queue.
The present embodiment provides 5 queuing policies based on request properties such as task size and waiting time. Since a tenant request contains the number of virtual machines and their bandwidth demands, the present embodiment introduces an existing pricing model to compute the price p of each tenant's virtual machine request, and uses the profit rate r = p/t as a unified measure of task size. The 5 policies are as follows:
First-come-first-served (FCFS): a simple and common ordering in which virtual machine requests execute in the order they arrive.
Smallest profit rate first (SRRF): the task with the smallest profit rate executes first. Under heavy load, this algorithm improves the response speed of the system.
Largest profit rate first (LRRF): the task with the largest profit rate executes first. This algorithm improves system throughput.
Largest weight first (comprising two policies): improvements built on SRRF and LRRF. Both SRRF and LRRF can starve tasks: under high load, the task with the largest profit rate in SRRF, or the smallest profit rate in LRRF, may wait forever. To solve this, a weight is introduced into the queue scheduling mechanism: the weight of each task is its profit rate multiplied by its waiting time (for LRRF) or divided by its waiting time (for SRRF), which gives long-waiting tasks a chance to be processed.
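The five policies above can be expressed as sort keys over the request queue. The sketch below is illustrative: the request fields (p, t, arrival) and the "W-" names for the weighted variants are assumptions, not the patent's identifiers:

```python
def profit_rate(req):
    """r = p / t: price of the request divided by its size."""
    return req["p"] / req["t"]

def sort_queue(queue, policy, now=0.0):
    """Return the queue ordered according to one of the five policies."""
    if policy == "FCFS":
        return sorted(queue, key=lambda q: q["arrival"])
    if policy == "SRRF":        # smallest profit rate first
        return sorted(queue, key=profit_rate)
    if policy == "LRRF":        # largest profit rate first
        return sorted(queue, key=profit_rate, reverse=True)
    if policy == "W-SRRF":      # weight = r / waiting time (anti-starvation)
        return sorted(queue, key=lambda q: profit_rate(q) / (now - q["arrival"]))
    if policy == "W-LRRF":      # weight = r * waiting time (anti-starvation)
        return sorted(queue, key=lambda q: profit_rate(q) * (now - q["arrival"]),
                      reverse=True)
    raise ValueError(policy)
```

In the weighted variants, a task that has waited a long time gets a smaller key under W-SRRF and a larger key under W-LRRF, so in both cases it moves toward the head of the queue.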
B. Allocation algorithms of the second phase
The allocation algorithm determines which physical machine a virtual machine should be assigned to. A suitable allocation policy further optimizes the resource utilization of the cloud platform and improves the scheduler's performance.
Mapping an FGVC model onto the physical topology is the same as mapping a hose model, because only the VM-level bandwidth bounds of the optimized FGVC model need to be considered. When choosing an allocation algorithm, fragmentation is the main consideration: after serving one FGVC request, a subtree of the physical topology may hold only a few virtual machines yet retain too little bandwidth to accept further FGVC requests. This fragmentation problem closely resembles the classical memory-allocation problem in operating systems, which has three common solutions: first-fit, best-fit and worst-fit. Earlier work mainly used the first-fit algorithm to allocate virtual machine requests; the present embodiment, following the best-fit and worst-fit algorithms for memory, implements best-fit and worst-fit algorithms for virtual machine allocation. The 3 allocation policies used by the present embodiment are described below.
First-fit: this policy searches from the bottom (the physical machine layer) to the top (the root switch layer) for the first subtree (a physical machine, a ToR switch, an aggregation switch or a core switch) that meets two conditions. The first condition is that the free virtual slots in the subtree are no fewer than the number of virtual machines the tenant requests. The second is that the bandwidth under the subtree meets the bandwidth demand of the virtual machine request. In short, this policy finds the first, lowest subtree that can satisfy the tenant's demand.
Best-fit: like first-fit, this policy traverses nodes from bottom to top looking for a subtree that meets two conditions. The first condition is the same as first-fit's. The second is that the remaining bandwidth under the subtree is the minimum that can still meet the tenant's bandwidth demand. This policy finds the smallest subtree that satisfies the request while minimizing bandwidth fragmentation.
Worst-fit: the opposite of best-fit; the only difference lies in the second condition: the total remaining bandwidth in the subtree is the maximum that can meet the bandwidth demand of the virtual machine request. That is, this policy finds a subtree that satisfies the tenant's request while retaining the most bandwidth resources for later tenants.
Algorithm 2: the Best-Fit allocation algorithm
The framework of the Best-Fit algorithm is similar to First-Fit, except that Best-Fit selects an optimal subtree and then recursively allocates virtual machines and bandwidth to the tenant within it.
N_v is the number of a tenant's virtual machines that the subtree rooted at node v can hold; it is limited by the remaining bandwidth and the remaining virtual slots. N_v is mathematically defined as:
N_v = { n ∈ [0, min{k_v, R}] s.t. min( Σ_{i=0}^{n} B_i, A − Σ_{i=0}^{n} B_i ) < R_l }
where B_i is the bandwidth demand of the tenant's virtual machine i; A is the sum of all virtual machine bandwidth demands; k_v is the number of currently available virtual slots in the subtree rooted at v, with k_v ∈ [0, K], where K is the total number of virtual slots under the subtree; R is the number of virtual machines still to be allocated; and R_l is the remaining bandwidth of link l.
For a tenant request r:<N, B>, perform the following operations:
The method may require one-time initialization of the following parameters:
T: the 3-layer tree topology.
l: the layer currently being traversed. 0-3 denote the physical machine layer, the ToR switch layer, the Pod switch layer and the root switch layer respectively.
H_v: the number of free virtual slots in the subtree rooted at node v.
Step (1). Initialize the current traversal layer l = 0;
Step (2). Repeat the following operations:
Step (2.1). For each node v of layer l, traverse the subtree rooted at v:
Step (2.1.1). Count the number of free virtual slots H_v in the subtree rooted at v.
Step (2.1.2). If H_v is greater than or equal to N, perform the following:
Step (2.1.2.1). Add node v to the candidate subtree queue C.
Step (2.2). If the current traversal layer is a switch layer, i.e. l > 0, perform the following:
Step (2.2.1). Sort the candidate subtree queue C in ascending order of the subtree roots' uplink bandwidth.
Step (2.3). Otherwise, the current layer is the physical machine layer, i.e. l = 0; perform the following:
Step (2.3.1). Sort the candidate subtree queue C in ascending order of the number of free virtual slots H_v.
Step (2.4). Allocate: assigned = BestFitAlloc(r, v, N), recursively allocating virtual machines and bandwidth, where v is the head of the sorted candidate queue C.
Step (2.5). Check whether assigned = N:
Step (2.5.1). If true, return True.
Step (2.5.2). Otherwise, roll the parameters back to the state before step (2.4), and update l = l + 1.
Step (2.6). Check whether l equals 3.
Step (2.6.1). If true, return False.
BestFitAlloc(r, v, m)
Step (1). Determine the layer of node v.
Step (1.1). If level(v) = 0, perform the following:
Step (1.1.1). Allocate m virtual machines to the tenant and mark the slots as used.
Step (1.1.2). Return m.
Step (1.2). Otherwise, perform the following:
Step (1.2.1). Initialize the number of allocated virtual machines count = 0.
Step (1.2.2). Find all child nodes with free virtual slots and put them into the candidate node queue C.
Step (1.2.3). While count < m and length(C) ≠ 0, perform the following:
Step (1.2.3.1). Check whether level(C) > 0.
Step (1.2.3.1.1). If true, sort the candidate subtree queue C in ascending order of the subtree roots' uplink bandwidth.
Step (1.2.3.1.2). Otherwise, sort the candidate subtree queue C in ascending order of the number of free virtual slots H_v.
Step (1.2.3.2). Recurse: count += BestFitAlloc(r, C[0], N_{C[0]}).
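The three allocation policies can be contrasted with a simplified, flat-host sketch that only compares physical machines (the patent's version walks the tree topology recursively; the host fields and function name here are illustrative assumptions):

```python
def pick_host(hosts, slots_needed, bw_needed, policy):
    """Pick a physical machine for a request under one of three policies.
    hosts: list of dicts with free 'slots' and residual uplink 'bw' (Mbps)."""
    feasible = [h for h in hosts
                if h["slots"] >= slots_needed and h["bw"] >= bw_needed]
    if not feasible:
        return None
    if policy == "first-fit":   # first host that fits
        return feasible[0]
    if policy == "best-fit":    # least leftover bandwidth -> least fragmentation
        return min(feasible, key=lambda h: h["bw"])
    if policy == "worst-fit":   # most leftover bandwidth for future requests
        return max(feasible, key=lambda h: h["bw"])
    raise ValueError(policy)

hosts = [{"slots": 4, "bw": 100}, {"slots": 2, "bw": 40}, {"slots": 4, "bw": 60}]
```

For a request needing 2 slots and 30 Mbps, first-fit picks the 100 Mbps host, best-fit the 40 Mbps host (tightest fit), and worst-fit the 100 Mbps host (most headroom left).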
Further, after the tenant-level steps above are executed, the application-level E-F runtime mechanism is presented below.
In this section, the present embodiment proposes a runtime mechanism (the E-F runtime mechanism) that improves on the rate-allocation solution of ElasticSwitch. The E-F runtime mechanism remedies ElasticSwitch's inability to share unused bandwidth fairly between guaranteed and unguaranteed tenants, while simultaneously achieving the four goals mentioned above.
In SpongeNet, the present embodiment proposes the E-F runtime mechanism, which takes both efficiency and fairness into account, in order to distribute unused bandwidth fairly between guaranteed tenants and unguaranteed tenants. It can still guarantee best-effort network performance under extreme network conditions. The evaluation (Section 5) demonstrates that SpongeNet distributes idle bandwidth more fairly and achieves better performance isolation between tenants.
The E-F runtime mechanism distributes unallocated bandwidth to tenants in proportion to their payments. The existing pricing model mentioned above, Cost = N·k_v·T + N·k_b·B, consists of two parts: N·k_v·T is the cost of the tenant using N virtual machines for running time T, and N·k_b·B is the cost of guaranteeing bandwidth B for the N virtual machines. The E-F runtime mechanism uses the N·k_v part as the tenant's weight. An unguaranteed tenant then uses its allocated share of bandwidth as the bandwidth-guarantee input B_{X→Y} of the rate function R_{X→Y} = max(B_{X→Y}, R_{W-TCP}(B_{X→Y}, F_{X→Y})). In this way, unguaranteed tenants can use unused bandwidth as fairly as guaranteed tenants.
In SpongeNet, if the cost of the three tenant requests is each N·k_v = 10/hour, the E-F runtime mechanism distributes the unallocated bandwidth (L − (B_{X→Y} + B_{Z→T} + B_{A→B}) = 700 Mbps) to the tenants according to the weights N·k_v. Each tenant then uses its new bandwidth guarantee as the input of the function R. The final rate-limit results of the three tenants are R_{X→Y} = 333 Mbps, R_{A→B} = 233 Mbps and R_{Z→T} = 666 Mbps respectively.
The E-F runtime mechanism works well on a single physical link (see Figs. 8a and 8b), but deploying it on a tree topology is a challenge. Since every server hosts virtual machines of tenant A, the E-F runtime mechanism treats the unguaranteed traffic of all of tenant A's virtual machines as a single guaranteed link. This special guaranteed link takes the minimum bandwidth over the physical links connecting the servers that host tenant A's virtual machines. Specifically, the bandwidth guarantee of server i for tenant t is set to
S-bandwidth_t^i = min(S-bandwidth_t^{i→j}), ∀j ∈ V_t, i ≠ j
S-bandwidth_t^{i→j} = min(S-bandwidth_t^{i→j,p}), ∀p ∈ Path(i, j), i, j ∈ V_t, i ≠ j
where V_t is the set of all servers hosting virtual machines of tenant t, and Path(i, j) is the set of physical links between server i and server j. In Fig. 9, four servers host virtual machines of tenant A. Server a has 3 physical links connecting it to the other servers hosting A's virtual machines, with allocatable bandwidths of 90 Mbps, 60 Mbps and 60 Mbps respectively. The final bandwidth guarantee of server a's special guaranteed link is therefore 60 Mbps. For each of the remaining servers, all the physical links to the other servers are the same, so the guarantees of servers b through d are also 60 Mbps.
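The min-over-links, min-over-peers rule above can be checked against the Fig. 9 example (a minimal sketch; the paths here are collapsed to their per-link bandwidths, and the function name is illustrative):

```python
def server_guarantee(paths_from_i):
    """Per-server guarantee of the special guaranteed link: for each peer
    server j hosting the tenant's VMs, take the bottleneck of Path(i, j),
    then take the minimum over all peers."""
    return min(min(path) for path in paths_from_i)

# Server a in Fig. 9: three paths to the other three servers, with
# allocatable link bandwidths of 90, 60 and 60 Mbps respectively.
assert server_guarantee([[90], [60], [60]]) == 60
```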
In particular, setting bandwidth guarantees for unguaranteed tenants at server granularity is the core of the E-F runtime mechanism. When unallocated bandwidth resources exist, setting server-granularity guarantees for unguaranteed tenants makes the allocation between unguaranteed and guaranteed tenants fairer. The algorithm below sets server-granularity bandwidth guarantees for unguaranteed tenants by distributing unallocated bandwidth in proportion to payment: it computes how much bandwidth server s_0 can obtain on each link from s_0 to server s_j, takes the minimum as the guarantee of that path, and assigns the minimum guarantee value from s_0 to all other servers as tenant t's guarantee at server s_0. By the nature of a tree topology (commonly used for data center architectures), visiting all leaf nodes from any one leaf node means traversing the whole tree, passing through every link. Therefore every server in S_t has the same guaranteed bandwidth B_t; choosing S_t[0] as server s_0 for computing the guarantee value is not mandatory, and any server in S_t may be chosen.
The server-granularity bandwidth-guarantee algorithm for unguaranteed tenants is given below:
For an unguaranteed tenant t, perform the following operations:
The method requires the following parameters:
t: an unguaranteed tenant;
T: the set of tenants competing with tenant t;
M_i: the payment of tenant i in T;
S_t: the set of servers hosting virtual machines of tenant t;
B_unused^p: the unused bandwidth of link p.
Step (1). server s_0 = S_t[0].
Step (2). Initialize a counter j = 1, incrementing j = j + 1 on each pass; while server s_j is in S_t, perform the following steps:
Step (2.1). Obtain the guarantee value from server s_0 to server s_j:
B_t^{s_0→s_j} = min( (M_t / Σ_{∀i∈T} M_i) · B_unused^p ), ∀p ∈ Path(s_0, s_j)
Step (3). Set the minimum guarantee value from s_0 to the other servers as tenant t's guarantee at server s_0:
B_t = min(B_t^{s_0→s_j}), ∀s_j ∈ S_t, j ≠ 0
Step (4). Return B_t.
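The steps above can be sketched as follows (a minimal illustration of the payment-weighted split; the function name and test values are assumptions):

```python
def tenant_guarantee(M_t, M_all, paths):
    """Server-granularity guarantee for unguaranteed tenant t.
    M_t: tenant t's payment; M_all: payments of all competing tenants
    (including t); paths: for each other server s_j in S_t, the list of
    unused link bandwidths on Path(s_0, s_j)."""
    share = M_t / sum(M_all)                 # payment-proportional weight
    # Step (2): bottleneck of the weighted unused bandwidth on each path
    per_peer = [min(share * bw for bw in path) for path in paths]
    # Step (3): the minimum over all peers is the server's guarantee B_t
    return min(per_peer)

# Two equal-paying tenants (10 each); one single-link path with 200 Mbps
# unused, one two-link path with 100 and 300 Mbps unused.
assert tenant_guarantee(10, [10, 10], [[200], [100, 300]]) == 50
```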
The experimental evaluation results of SpongeNet are given below.
In this part, the present embodiment assesses 1) the FGVC abstraction model, 2) the optimal combination of first-phase queuing policy and second-phase allocation policy, and 3) the goals achieved by the E-F runtime mechanism. The present embodiment evaluates the bandwidth savings of the FGVC model at the virtual machine level. Furthermore, the experiments evaluate the effectiveness and efficiency of the FGVC model and the two-phase virtual machine placement respectively. In addition, the present embodiment evaluates the throughput and response time for each tenant. Finally, it demonstrates that realizing VM-to-VM bandwidth guarantees on an actual cloud platform is feasible.
The tenant-layer simulation is explained below.
To assess the first two aspects, the present embodiment implements in Python the FGVC model, the 5 queuing policies, the 3 allocation policies, and the VC model of Oktopus (a hose model using the equal bandwidth-division method). Table 1 below shows the distribution of application types.
Table 1
Simulation setup: the present embodiment's simulator uses a 3-layer tree topology consisting of 1 core switch, 4 aggregation switches, 80 ToR switches and 1600 servers to simulate a realistic cloud data center. To keep the experiment simple, all servers have identical CPU and memory configurations, and each server has 4 virtual slots. The uplink bandwidths of the servers, ToR switches and aggregation switches are 100 Mbps, 1 Gbps and 10 Gbps respectively.
The experimental data set comes from bing.com and comprises many groups of service data. Service requests consist of multiple task types (interactive applications or batch tasks) and communication patterns (linear, star, mesh, etc.). The present embodiment selects 20000 groups of service data from Bing workflows, broadly divided into four classes, as shown in Table 1. These 20000 virtual machine requests follow a Poisson process with a mean inter-arrival time (λ) of 0.02, and the task execution times follow an exponential distribution with a departure rate (μ) of 0.00002.
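A workload of this shape can be generated as follows (a sketch under the assumption that the quoted λ = 0.02 is the mean inter-arrival time, so the arrival rate is 1/0.02; the function name and request fields are illustrative):

```python
import random

def make_workload(n, mean_interarrival=0.02, departure_rate=0.00002, seed=42):
    """Generate n requests: Poisson arrivals (exponential inter-arrival
    times) and exponentially distributed task durations."""
    rng = random.Random(seed)
    t, reqs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_interarrival)   # next arrival
        reqs.append({"arrival": t,
                     "duration": rng.expovariate(departure_rate)})
    return reqs
```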
First, the present embodiment evaluates the FGVC model, then quantitatively analyses the bandwidth saved by the optimization model and compares it with the VC model.
Bandwidth savings: as mentioned above, the virtual machine requests fall into four classes. The first class is batch tasks (such as MapReduce); their topology is a star, and the FGVC model optimizes away about 77% of their bandwidth (Fig. 10). The second class is interactive applications (such as 3-tier applications); the topology of 3-tier applications is linear, and FGVC saves 18% of their bandwidth. The other two topologies are a fully connected communication pattern and an isolated pattern with no interaction, so they save no bandwidth compared with the VC model. Over all 20000 tasks, the bandwidth-optimization rate of the FGVC model reaches 48%. In the VC model every node has the same bandwidth guarantee, which easily makes some nodes bottlenecks. The analysis above shows that the FGVC model saves bandwidth best in relatively concentrated network topologies.
Throughput: the present embodiment compares throughput by the time at which all tasks complete, since a shorter completion time under a sustained workload means higher throughput. Fig. 11 shows the completion times of the 3 allocation policies combined with the different queuing policies. Best-fit combined with LRRF is optimal for tenants, because this combination minimizes platform fragmentation so that more later-arriving tasks can be accepted. SRRF combined with worst-fit is also a strong combination, because it retains the most space per unit time for large tasks that arrive later. The experiments complete 20000 virtual machine requests, and the total cost of the 20000 requests under the pricing model is held constant; the larger the platform's profit rate and the shorter the task completion time, the better. The LRRF plus first-fit combination yields the maximum profit rate.
The present embodiment also compares the best-performing combination, FGVC+LRRF+first-fit, with the traditional VC+FCFS+first-fit combination in terms of profit rate and virtual machine utilization. In Fig. 12a, the per-second profit rate of the two schemes is plotted. Before reaching steady state the two are essentially identical, because platform resources are still plentiful. At steady state, however, the average profit of the present embodiment's solution is 17.60 versus 15.34 for the previous solution, an improvement of 12.84% over the traditional scheme. The virtual machine utilization results are consistent: as shown in Fig. 12b, the proposed scheme reaches 99.50% utilization at steady state, an improvement of 32.20% over the traditional scheme.
To assess the completeness and robustness of the new scheme, the present embodiment varies the bandwidth resources and the platform load, and compares the performance of the VC+FCFS+first-fit, FGVC+FCFS+first-fit and FGVC+LRRF+first-fit schemes; the results are shown in Fig. 13a. As the average bandwidth request of a task grows, the task completion time also lengthens, but the present embodiment's scheme is always optimal. Varying the platform load gives similar results, as shown in Fig. 13b. The figures show that FGVC plays a very important role in the combination.
Response time: the present embodiment's scheme also considers response time from the tenant's perspective. Figs. 14a and 14b show the average waiting time and the average queue length of the 15 combined policies respectively. The present embodiment concludes that FCFS gives the platform the best response time, because the scheduling mechanism greedily processes the longest-waiting task first each time, which is effective for reducing average waiting time and queue length. From the perspective of cloud platform response time, FGVC+FCFS+best-fit is the optimal tenant policy.
The present embodiment also analyses the response time at finer granularity. Fig. 15a shows the waiting time of each task; Fig. 15b shows the queue length at each task. In schemes based on the VC model, the waiting time of virtual machine requests grows continuously, while in schemes based on the FGVC model the waiting time remains stable.
In Figs. 16a and 16b, the present embodiment varies the average bandwidth request size and the platform load, and compares the average waiting time and average queue length of the three solutions. The results of varying the average bandwidth request size are shown in Fig. 16a: as the average bandwidth request grows, the average waiting time also grows, but FGVC+FCFS+best-fit is always optimal. Varying the platform load shows the same trend, as in Fig. 16b. In summary, the FGVC model is the most important factor in optimizing response time.
In addition, the application-level experimental results are given.
The present embodiment evaluates the E-F runtime mechanism through the deployment of a small-scale prototype. The goals of the assessment are: 1) to show that the E-F runtime mechanism provides bandwidth guarantees even in the worst case; 2) to show that it achieves fair sharing between guaranteed and unguaranteed tenants; and 3) to show that it utilizes bandwidth efficiently. The present embodiment demonstrates these goals in two scenarios, a many-to-one scenario and a MapReduce scenario, just as ElasticSwitch does.
Testbed setup: the present embodiment implements SpongeNet on a small data-center testbed consisting of 7 physical machines forming a 2-layer tree topology, as in Fig. 17. OpenStack is deployed in this data center, with 1 controller node and 6 compute nodes. Each server has 2 GHz Intel Xeon E5-2620 CPUs and 16 GB of memory. The bandwidths of layer 1 (between servers and ToR switches) and layer 2 are 230 and 700 Mbps respectively.
Many-to-one scenario: two virtual machines X and Z are on the same server and belong to two different tenants; the other virtual machines of these two tenants are on other server nodes, as shown in Fig. 18a. Virtual machine Z receives data from multiple virtual machines, while virtual machine X receives data from a single virtual machine. They contend for the green physical link, as in Fig. 18b.
In ElasticSwitch, X and Z each have the same 100 Mbps bandwidth guarantee, the maximum guaranteeable bandwidth on the 230 Mbps link. Fig. 19a compares the throughput of X under four solutions: no protection, static reservation (Oktopus), ElasticSwitch and SpongeNet. In this scenario of two guaranteed tenants, SpongeNet performs similarly to ElasticSwitch. Even as Z's senders keep increasing, SpongeNet still provides X with its bandwidth guarantee. Meanwhile, when no sender communicates with Z, SpongeNet can give the whole physical link to X, matching Oktopus's static reservation. Fig. 19b shows a guaranteed tenant and an unguaranteed tenant with the same setup as Fig. 19a and the same cost (5/hour). In ElasticSwitch, X occupies almost all the physical bandwidth, because guaranteed tenants have higher priority than unguaranteed tenants. The E-F runtime mechanism in SpongeNet, however, distributes the unused bandwidth between them fairly: since their payments are identical, each of the two tenants obtains half of the unused bandwidth (100 Mbps). Thus, no matter how many virtual machines communicate with Z, Z obtains 50 Mbps and X obtains 150 Mbps.
MapReduce scenario: the present embodiment uses real MapReduce data to simulate the full course of a MapReduce job, and measures the job completion time under 4 different solutions. All 6 compute nodes are used, each hosting 5 virtual machines, 30 in total. Multiple tenants are created, with requests randomly set between 2 and 10 virtual machines; half of each tenant's virtual machines act as mappers and half as reducers, with each mapper connected to one reducer. All virtual machines of each tenant are given an FGVC model, and the flow between each pair of virtual machines has a 30 Mbps guarantee. The present embodiment tests an unbalanced virtual machine placement strategy (mappers placed on the right side of the tree, reducers on the left), which puts the greatest pressure on the bandwidth resources of the data center network.
Fig. 20 shows the CDF under the 4 solutions: no protection, static reservation (Oktopus), ElasticSwitch and SpongeNet. The completion times of SpongeNet and ElasticSwitch are always shorter than static reservation's, because static reservation cannot fully utilize physical link resources. SpongeNet achieves higher throughput than ElasticSwitch because, in this simple experiment, unguaranteed and guaranteed tenants coexist, and in ElasticSwitch guaranteed flows crowd out unguaranteed flows. With the no-protection scheme, the worst-case job completion time is 1.75× longer than SpongeNet's.
In summary, deploying enterprise applications on a cloud platform and keeping their performance stable through bandwidth guarantees is very important. To this end, this embodiment proposes the SpongeNet system, which comprises two layers to achieve fair and efficient bandwidth guarantees in a cloud data center. The tenant layer consists of two parts: the FGVC model and the two-phase virtual machine placement. The FGVC model lets tenants express network demands more precisely and flexibly while saving bandwidth on the demand side. The two-phase virtual machine placement considers ordering strategies and allocation strategies simultaneously, and can provide different optimal strategy combinations for different objectives. The application layer provides the E-F runtime mechanism, which not only enforces the inter-VM bandwidth guarantees of the FGVC model but also distributes unused bandwidth fairly between guaranteed and unguaranteed tenants according to the real-time demands of upper-layer applications. Complete simulations and testbed experiments with real data demonstrate that SpongeNet enables better network resource management and predictable application performance for cloud data centers.
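The two-phase placement described above (a pluggable ordering strategy followed by a pluggable allocation strategy) can be sketched as the pipeline below. The request fields, the strategy lambdas, and the toy single-host allocator are illustrative assumptions, not the patent's code:

```python
def two_phase_place(requests, order_key, allocate):
    """Phase 1: sort pending requests with a pluggable ordering strategy.
    Phase 2: hand each request to a pluggable allocation strategy.
    Returns the accepted placements and the list of rejected requests."""
    placed, rejected = {}, []
    for req in sorted(requests, key=order_key):
        target = allocate(req)
        if target is None:
            rejected.append(req)
        else:
            placed[req["id"]] = target
    return placed, rejected

# Illustrative ordering strategies (field names are our assumptions):
fcfs = lambda r: r["arrival"]        # first-come-first-served
srrf = lambda r: r["profit_rate"]    # smallest revenue rate first
lrrf = lambda r: -r["profit_rate"]   # largest revenue rate first

# Toy allocation strategy: accept while free slots remain on one host.
def make_allocator(slots):
    state = {"free": slots}
    def allocate(req):
        if req["vms"] <= state["free"]:
            state["free"] -= req["vms"]
            return "host-0"
        return None
    return allocate

reqs = [{"id": 1, "arrival": 0, "profit_rate": 2.0, "vms": 3},
        {"id": 2, "arrival": 1, "profit_rate": 5.0, "vms": 3}]
# Under LRRF the more profitable request 2 is placed first and takes
# the slots, so request 1 is rejected; FCFS would pick request 1 instead.
placed, rejected = two_phase_place(reqs, lrrf, make_allocator(4))
```

Swapping `order_key` and `allocate` independently is what gives "different optimal strategy combinations for different objectives" in the text above.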
A third embodiment of the invention provides a two-layer bandwidth allocation system with predictable performance for a cloud data center, see Figure 21, comprising: a cloud tenant layer processing module 211 and an application layer processing module 212;
The cloud tenant layer processing module 211 is configured to perform bandwidth-guarantee optimization, according to the fine-grained virtual cluster FGVC network abstraction model, on virtual machine requests containing network demands submitted by tenants;
And is configured to allocate physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm;
The application layer processing module 212 is configured to fairly distribute physical machines and corresponding bandwidth resources between guaranteed tenants and unguaranteed tenants by means of the E-F runtime mechanism.
Further, the cloud tenant layer processing module 211 is specifically configured to: determine the remaining bandwidth of each virtual machine from the virtual machine request, divide the remaining bandwidth of the virtual machine evenly among the unallocated links connected to the virtual machine, and take the minimum of the bandwidth guarantees at the two endpoints as the bandwidth guarantee between the two virtual machines.
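A minimal sketch of this guarantee computation, with assumed VM names and bandwidth values (the patent gives no code): each VM splits its remaining bandwidth evenly over its unallocated links, and the inter-VM guarantee is the minimum of the two per-end shares.

```python
def pairwise_guarantees(residual, peers):
    """residual: VM -> remaining bandwidth (Mbps).
    peers: VM -> peers whose connecting links are still unallocated.
    Each VM divides its residual evenly across its unallocated links;
    the guarantee of link (u, v) is the smaller of the two end shares."""
    share = {v: residual[v] / len(peers[v]) for v in residual}
    return {tuple(sorted((u, v))): min(share[u], share[v])
            for u in peers for v in peers[u]}

# VM A talks to B and C; its 90 Mbps residual splits into 45 per link,
# but the A-B link is capped by B's smaller 40 Mbps share.
g = pairwise_guarantees({"A": 90, "B": 40, "C": 60},
                        {"A": ["B", "C"], "B": ["A"], "C": ["A"]})
print(g)  # {('A', 'B'): 40.0, ('A', 'C'): 45.0}
```

Taking the endpoint minimum is what makes the guarantee conservative: neither end is ever promised more than it can actually source or sink.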
Further, the cloud tenant layer processing module 211 is specifically configured to:
Allocate physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to any one of the following three preset allocation algorithms:
First preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition is that the bandwidth under the subtree satisfies the bandwidth demand of the virtual machine request;
Second preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition is that the remaining bandwidth under the subtree is the minimum among the candidate subtrees while still satisfying the tenant's bandwidth demand;
Third preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition is that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition is that the total remaining bandwidth in the subtree is the maximum among the candidate subtrees while satisfying the bandwidth demand of the virtual machine request.
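The three preset allocation algorithms differ only in which feasible subtree they select at the lowest feasible layer. A hedged sketch with toy subtree summaries (the per-layer list of `(free_slots, residual_bw)` tuples is our assumed representation, not the patent's data structure):

```python
def find_subtree(levels, n_vms, bw_demand, policy="first"):
    """levels: subtree summaries per layer, ordered from the physical
    machine layer up to the root routing layer; each entry is a tuple
    (free_vm_slots, residual_bw).  All three preset algorithms scan
    upward and differ only in which feasible subtree they pick:
      "first" -> first feasible subtree (algorithm 1),
      "min"   -> feasible subtree with least residual bandwidth (2),
      "max"   -> feasible subtree with most residual bandwidth (3)."""
    for li, level in enumerate(levels):
        feasible = [(bw, ni) for ni, (slots, bw) in enumerate(level)
                    if slots >= n_vms and bw >= bw_demand]
        if feasible:
            if policy == "first":
                return (li, feasible[0][1])
            pick = min(feasible) if policy == "min" else max(feasible)
            return (li, pick[1])
    return None  # no subtree at any layer can host the request

# Three ToR-level subtrees and one aggregation-level subtree (toy numbers):
levels = [[(2, 50), (5, 100), (6, 300)], [(13, 450)]]
print(find_subtree(levels, 4, 80, "first"))  # (0, 1)
print(find_subtree(levels, 4, 80, "max"))    # (0, 2)
```

The "min" policy packs requests into nearly-full subtrees (preserving large contiguous bandwidth for later tenants), while "max" spreads them out; both honor the same two feasibility conditions as the first-fit variant.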
Further, the application layer processing module 212 is specifically configured to: when unallocated bandwidth resources exist, set bandwidth guarantees for unguaranteed tenants at server granularity.
The system described in this embodiment may be used to perform the method described in the above embodiments; its principle and technical effects are similar and are not detailed again here.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A two-layer bandwidth allocation method with predictable performance for a cloud data center, characterized in that it comprises:
At the cloud tenant layer, performing bandwidth-guarantee optimization, according to the fine-grained virtual cluster FGVC network abstraction model, on virtual machine requests containing network demands submitted by tenants;
And allocating physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm;
At the application layer, fairly distributing physical machines and corresponding bandwidth resources between guaranteed tenants and unguaranteed tenants by means of the E-F runtime mechanism.
2. The method according to claim 1, characterized in that performing bandwidth-guarantee optimization on virtual machine requests containing network demands submitted by tenants according to the fine-grained virtual cluster FGVC network abstraction model comprises: determining the remaining bandwidth of each virtual machine from the virtual machine request, dividing the remaining bandwidth of the virtual machine evenly among the unallocated links connected to the virtual machine, and taking the minimum of the bandwidth guarantees at the two endpoints as the bandwidth guarantee between the two virtual machines.
3. The method according to claim 1, characterized in that allocating physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm comprises:
Allocating physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to any one of the following three preset allocation algorithms:
First preset allocation algorithm: searching from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the bandwidth under the subtree satisfies the bandwidth demand of the virtual machine request;
Second preset allocation algorithm: searching from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the remaining bandwidth under the subtree is the minimum among the candidate subtrees while still satisfying the tenant's bandwidth demand;
Third preset allocation algorithm: searching from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the total remaining bandwidth in the subtree is the maximum among the candidate subtrees while satisfying the bandwidth demand of the virtual machine request.
4. The method according to claim 1, characterized in that, before allocating physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to the preset allocation algorithm, the method further comprises: sorting the optimized virtual machine requests according to a preset ordering algorithm.
5. The method according to claim 4, characterized in that sorting the optimized virtual machine requests according to the preset ordering algorithm comprises:
Sorting the optimized virtual machine requests according to any one of the following five preset ordering algorithms:
First-come-first-served (FCFS): requests are executed in order of arrival;
Smallest-revenue-rate-first (SRRF): the task with the smallest profit rate is executed first;
Largest-revenue-rate-first (LRRF): the task with the largest profit rate is executed first;
Maximum-weight-first: a weight is introduced into the queue scheduling mechanism, the weight of each virtual machine request being the profit rate of the request either multiplied by its waiting time or divided by its waiting time, which yields the two remaining variants.
6. The method according to claim 1, characterized in that fairly distributing physical machines and corresponding bandwidth resources between guaranteed tenants and unguaranteed tenants by means of the E-F runtime mechanism comprises: when unallocated bandwidth resources exist, setting bandwidth guarantees for unguaranteed tenants at server granularity.
7. A two-layer bandwidth allocation system with predictable performance for a cloud data center, characterized in that it comprises: a cloud tenant layer processing module and an application layer processing module;
The cloud tenant layer processing module being configured to perform bandwidth-guarantee optimization, according to the fine-grained virtual cluster FGVC network abstraction model, on virtual machine requests containing network demands submitted by tenants;
And being configured to allocate physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to a preset allocation algorithm;
The application layer processing module being configured to fairly distribute physical machines and corresponding bandwidth resources between guaranteed tenants and unguaranteed tenants by means of the E-F runtime mechanism.
8. The system according to claim 7, characterized in that the cloud tenant layer processing module is specifically configured to: determine the remaining bandwidth of each virtual machine from the virtual machine request, divide the remaining bandwidth of the virtual machine evenly among the unallocated links connected to the virtual machine, and take the minimum of the bandwidth guarantees at the two endpoints as the bandwidth guarantee between the two virtual machines.
9. The system according to claim 7, characterized in that the cloud tenant layer processing module is specifically configured to:
Allocate physical machines and corresponding bandwidth resources to the optimized virtual machine requests according to any one of the following three preset allocation algorithms:
First preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the bandwidth under the subtree satisfies the bandwidth demand of the virtual machine request;
Second preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the remaining bandwidth under the subtree is the minimum among the candidate subtrees while still satisfying the tenant's bandwidth demand;
Third preset allocation algorithm: search from the physical machine layer up to the root routing layer for the first subtree that satisfies two conditions, a subtree being rooted at a physical machine, a ToR switch, an aggregation-layer switch, or a core-layer switch; the first condition being that the number of idle virtual machine slots in the subtree is greater than or equal to the number of virtual machines requested by the tenant; the second condition being that the total remaining bandwidth in the subtree is the maximum among the candidate subtrees while satisfying the bandwidth demand of the virtual machine request.
10. The system according to claim 7, characterized in that the application layer processing module is specifically configured to: when unallocated bandwidth resources exist, set bandwidth guarantees for unguaranteed tenants at server granularity.
CN201610083948.7A 2016-02-06 2016-02-06 Two layers of bandwidth allocation methods of cloud data center with Predicable performance and system Active CN105577834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610083948.7A CN105577834B (en) 2016-02-06 2016-02-06 Two layers of bandwidth allocation methods of cloud data center with Predicable performance and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610083948.7A CN105577834B (en) 2016-02-06 2016-02-06 Two layers of bandwidth allocation methods of cloud data center with Predicable performance and system

Publications (2)

Publication Number Publication Date
CN105577834A true CN105577834A (en) 2016-05-11
CN105577834B CN105577834B (en) 2018-10-16

Family

ID=55887478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610083948.7A Active CN105577834B (en) 2016-02-06 2016-02-06 Two layers of bandwidth allocation methods of cloud data center with Predicable performance and system

Country Status (1)

Country Link
CN (1) CN105577834B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105827523A (en) * 2016-06-03 2016-08-03 无锡华云数据技术服务有限公司 Virtual gateway capable of dynamically adjusting bandwidths of multiple tenants in cloud storage environment
CN106301930A (en) * 2016-08-22 2017-01-04 清华大学 A kind of cloud computing virtual machine deployment method meeting general bandwidth request and system
CN106411678A (en) * 2016-09-08 2017-02-15 清华大学 Bandwidth guarantee type virtual network function (VNF) deployment method

Citations (2)

Publication number Priority date Publication date Assignee Title
CN104009904A (en) * 2014-05-23 2014-08-27 清华大学 Method and system for establishing virtual network for big data processing of cloud platform
CN104270421A (en) * 2014-09-12 2015-01-07 北京理工大学 Multi-user cloud platform task scheduling method supporting bandwidth guarantee

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN104009904A (en) * 2014-05-23 2014-08-27 清华大学 Method and system for establishing virtual network for big data processing of cloud platform
CN104270421A (en) * 2014-09-12 2015-01-07 北京理工大学 Multi-user cloud platform task scheduling method supporting bandwidth guarantee

Non-Patent Citations (2)

Title
Alvin AuYoung et al.: "Democratic Resolution of Resource Conflicts Between SDN Control Programs", Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies *
Qi Kaiyuan et al.: "Resource-optimized allocation method for multi-tenant Database-as-a-Service systems", High Technology Letters *

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN105827523A (en) * 2016-06-03 2016-08-03 无锡华云数据技术服务有限公司 Virtual gateway capable of dynamically adjusting bandwidths of multiple tenants in cloud storage environment
CN105827523B (en) * 2016-06-03 2019-04-30 无锡华云数据技术服务有限公司 A kind of virtual gateway for realizing dynamic adjustment to the bandwidth of multi-tenant in cloud storage environment
CN106301930A (en) * 2016-08-22 2017-01-04 清华大学 A kind of cloud computing virtual machine deployment method meeting general bandwidth request and system
CN106411678A (en) * 2016-09-08 2017-02-15 清华大学 Bandwidth guarantee type virtual network function (VNF) deployment method

Also Published As

Publication number Publication date
CN105577834B (en) 2018-10-16

Similar Documents

Publication Publication Date Title
Chowdhury et al. Efficient coflow scheduling without prior knowledge
CN108566659B (en) 5G network slice online mapping method based on reliability
CN104396187B (en) The method and device that Bandwidth guaranteed and work are kept
CN105610715B (en) A kind of cloud data center multi-dummy machine migration scheduling method of planning based on SDN
Tsai et al. Two-tier multi-tenancy scaling and load balancing
CN106844051A (en) The loading commissions migration algorithm of optimised power consumption in a kind of edge calculations environment
CN107688492B (en) Resource control method and device and cluster resource management system
CN104270421B (en) A kind of multi-tenant cloud platform method for scheduling task for supporting Bandwidth guaranteed
CN104521198A (en) System and method for virtual ethernet interface binding
US20140149493A1 (en) Method for joint service placement and service routing in a distributed cloud
US11816509B2 (en) Workload placement for virtual GPU enabled systems
CN104881322A (en) Method and device for dispatching cluster resource based on packing model
US20060031444A1 (en) Method for assigning network resources to applications for optimizing performance goals
CN104104621A (en) Dynamic adaptive adjustment method of virtual network resources based on nonlinear dimensionality reduction
CN109379281A (en) A kind of traffic scheduling method and system based on time window
CN105577834A (en) Cloud data center two-level bandwidth allocation method and system with predictable performance
CN111159859B (en) Cloud container cluster deployment method and system
Ke et al. Aggregation on the fly: Reducing traffic for big data in the cloud
Hsu et al. Virtual network mapping algorithm in the cloud infrastructure
CN104298539B (en) Scheduling virtual machine and dispatching method again based on network aware
CN113886034A (en) Task scheduling method, system, electronic device and storage medium
CN113535393B (en) Computing resource allocation method for unloading DAG task in heterogeneous edge computing
CN110048966B (en) Coflow scheduling method for minimizing system overhead based on deadline
CN110138830A (en) Across data center task schedule and bandwidth allocation methods based on hypergraph partitioning
CN110958192B (en) Virtual data center resource allocation system and method based on virtual switch

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant