CN107291539B - Cluster program scheduler method based on resource significance level

Info

Publication number
CN107291539B
Authority
CN
China
Prior art keywords
resource
program
significance level
server node
capacity
Prior art date
Legal status
Active
Application number
CN201710462836.7A
Other languages
Chinese (zh)
Other versions
CN107291539A (en)
Inventor
耿世超
赵雪
王琳
Current Assignee
Shandong Normal University
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Shandong Normal University
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date
2017-06-19
Filing date
2017-06-19
Publication date
2019-11-01
Application filed by Shandong Normal University and Shandong Computer Science Center National Super Computing Center in Jinan
Priority to CN201710462836.7A
Publication of CN107291539A
Application granted
Publication of CN107291539B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/485 Resource constraint
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/504 Resource capping

Abstract

The invention discloses a cluster program scheduling method based on resource significance level. For each program, the scheduling method sorts the resources according to their significance level to that program, and when searching for nodes it checks the resources in the order given by the resource significance level sequence, so as to guarantee the performance of multiple types of programs. The invention avoids the drawback of scheduling blindly on processor and memory resources alone, and guarantees the performance of disk-intensive and network-intensive programs. Because an execution environment is searched for each program according to its own resource ordering, the utility of the resources in the data center can be maximized.

Description

Cluster program scheduler method based on resource significance level
Technical field
The present invention relates to parallel and distributed computing, in particular to the scheduling of programs in a cluster, and specifically to a cluster program scheduling method based on resource significance level.
Background
As the infrastructure of cloud computing, the data center is a carrier that benefits both users and cloud service operators, and it has moved from concept to maturity as network access to non-local computing services has grown. However, data center resource utilization is generally below 30%. Low resource utilization leads to low energy efficiency: a 2012 investigation by The New York Times showed that data centers waste a large amount of energy, with only 6% to 12% of total energy consumption used for effective computation. How to improve resource utilization has therefore become a key problem for cloud computing operators and has drawn attention from both business and academia.
Workload consolidation is an important means of improving data center resource utilization. It assigns multiple programs to one compute node so that the server node achieves higher utilization of resources such as computation, storage, and disk I/O, and so that more idle nodes can be shut down to reduce energy consumption. A recent survey shows that, as computing demand grows, more than 60% of data center operators will use workload consolidation. Workload consolidation can achieve a trade-off between program performance and system resource utilization.
Current data centers run a wide variety of programs, for example processor-intensive and disk-intensive programs. However, the workload consolidation methods commonly used in data centers today consider only processor and memory resources when making scheduling decisions: programs are scheduled according to how much of these two resources they occupy, so as to balance the processor and memory load across the cluster. This approach ignores how much disk and network bandwidth a program occupies, and often leads to poor performance for disk-intensive or network-bandwidth-intensive programs. A scheduling strategy better suited to data centers is therefore needed, one that takes multiple resources such as processor, memory, disk bandwidth, and network bandwidth into account and guarantees the performance of programs with different characteristics.
Summary of the invention
The purpose of the present invention is to propose a cluster program scheduling method based on resource significance level. For each program, the scheduling method sorts the resources according to their significance level to that program; this ordering is called the resource significance level sequence. When searching for nodes, resources are checked in the order given by the resource significance level sequence, so as to guarantee the performance of multiple types of programs.
The cluster program scheduling method based on resource significance level considers five resources when scheduling: processor, disk read, disk write, memory, and network bandwidth. It comprises the following steps:
Step (1): resource collection: obtain the idle resource information of the five resources: processor, disk read, disk write, memory, and network bandwidth;
Step (2): resource sorting: first, calculate the significance level of each resource to the program; then, for each program in the task queue, sort the resources according to their significance level to the program, obtaining the resource significance level sequence of each program;
Step (3): scheduling: for each program in the task queue, choose the first resource from the resource significance level sequence and search for server nodes for that first resource; the criterion is that the idle capacity of a server node for the first resource is greater than the capacity of the first resource that the program occupies;
Then, from the server nodes found for the first resource, search for server nodes that satisfy the demand for the second resource; likewise, the criterion is that the idle capacity of a server node for the second resource is greater than the capacity of the second resource that the program occupies;
And so on, until the server nodes that satisfy the demand for the last resource in the sequence have been found; the server nodes found for the last resource are stored in the server list;
When the program is executed, a server node is selected directly from the server list to run the program.
Step (1) comprises:
Step (101): obtain the used capacity of the five resources (processor, disk read, disk write, memory, and network bandwidth) on the server node;
Step (102): subtract the used capacity of each of the five resources from its total capacity to obtain the idle resource information of each resource;
Step (103): periodically report the current idle resource information.
In step (101), the performance analysis tool collectl is used to obtain the used capacity of the five resources (processor, disk read, disk write, memory, and network bandwidth) on the server node.
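Steps (101) and (102) can be sketched as follows. This is a minimal illustration, not the patented implementation: the resource keys, the units, the sample numbers, and the floor at zero are all assumptions, and the used-capacity figures would in practice come from a tool such as collectl rather than from a hard-coded dictionary.

```python
# Hypothetical resource keys for the five resources named in the patent.
RESOURCES = ("cpu", "disk_read", "disk_write", "memory", "network")


def idle_resources(total, used):
    """Idle capacity per resource: total minus used (step 102).

    Flooring at zero is an assumption added here to guard against
    measurement noise; the patent only describes the subtraction.
    """
    return {r: max(total[r] - used[r], 0) for r in RESOURCES}


# Illustrative numbers only (cores, MB/s, MB/s, GB, Mbit/s).
node_total = {"cpu": 8, "disk_read": 200, "disk_write": 150, "memory": 64, "network": 1000}
node_used = {"cpu": 3, "disk_read": 50, "disk_write": 20, "memory": 16, "network": 400}

idle = idle_resources(node_total, node_used)
```

A node agent could call `idle_resources` on a timer and send the result to the scheduler, which corresponds to the periodic report in step (103).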
The significance level of a resource to a program is calculated as follows:
Step (201): for each resource, set the amount of the resource that the program uses when it runs without any resource restriction as the maximum resource limit point, and set a fixed proportion of the maximum resource limit point as the minimum resource limit point;
Step (202): obtain the performance of the program at the maximum resource limit point and at the minimum resource limit point;
Step (203): calculate the ratio of the program's performance at the minimum resource limit point to its performance at the maximum resource limit point. This performance ratio is the importance value of the resource to the program. The larger the importance value, the less sensitive program performance is to changes in the amount of the resource, and the less important the resource is to the program. The smaller the importance value, the more sensitive program performance is to changes in the amount of the resource, and the more important the resource is to the program. Sorting the resources in ascending order of importance value yields the program's resource significance level sequence.
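The calculation in steps (201) to (203) can be sketched as below. The function names and the performance figures are hypothetical, and the sketch assumes a "higher is better" performance metric (for example throughput), which is what keeps the ratio between 0 and 1.

```python
def importance_value(perf_at_min, perf_at_max):
    """Performance at the minimum limit point divided by performance at
    the maximum limit point. A smaller value means the resource matters more."""
    if perf_at_max <= 0:
        raise ValueError("perf_at_max must be positive")
    return perf_at_min / perf_at_max


def significance_sequence(measurements):
    """Map {resource: (perf_at_min, perf_at_max)} to a list of resources
    sorted in ascending order of importance value (most important first)."""
    values = {r: importance_value(lo, hi) for r, (lo, hi) in measurements.items()}
    return sorted(values, key=values.get)


# Hypothetical measurements for one program (requests per second).
measurements = {
    "cpu": (92.0, 100.0),
    "disk_read": (40.0, 100.0),
    "disk_write": (55.0, 100.0),
    "memory": (80.0, 100.0),
    "network": (97.0, 100.0),
}

sequence = significance_sequence(measurements)
```

With these made-up numbers the program is most sensitive to disk read bandwidth, so `disk_read` comes first in the sequence and would be the first resource checked during scheduling.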
In step (202), the resource limiting tool Cgroups is used to restrict the resource, and the performance of the program at the maximum resource limit point and at the minimum resource limit point is obtained.
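As a rough illustration of how the two limit points might be turned into Cgroups settings for the processor resource, the sketch below computes the limit points and formats values for the cgroup v2 `cpu.max` interface file, which takes a quota and a period in microseconds. The patent names Cgroups but does not say which cgroup version or control files are used, so the choice of `cpu.max`, the helper names, and the default period are assumptions. The default fraction of one quarter follows claim 1.

```python
def limit_points(unrestricted_amount, fraction=0.25):
    """Return (maximum limit point, minimum limit point).

    The minimum is a fixed fraction of the maximum; one quarter is the
    proportion stated in claim 1."""
    return unrestricted_amount, unrestricted_amount * fraction


def cpu_max_value(cores, period_us=100_000):
    """Format a 'quota period' string for cgroup v2 cpu.max.

    A quota of cores * period microseconds per period caps the group
    at that many cores' worth of CPU time."""
    quota_us = int(cores * period_us)
    return f"{quota_us} {period_us}"


max_cores, min_cores = limit_points(8)
settings = [cpu_max_value(max_cores), cpu_max_value(min_cores)]
```

Writing each string to the `cpu.max` file of the program's cgroup (which requires suitable privileges) and measuring performance under each setting would give the two performance figures needed in step (203).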
The ratio of the program's performance at the minimum resource limit point to its performance at the maximum resource limit point is a constant in the interval [0,1].
The prior information of each program comprises: the capacity of each resource that the program occupies, and the program's resource significance level sequence.
The resources comprise: processor, disk read, disk write, memory, and network bandwidth.
Step (3) comprises:
For each program in the task queue, choose the first resource from the resource significance level sequence and search for several first-class server nodes. The criterion for a first-class server node is that its idle capacity for the first resource is greater than the capacity of the first resource that the program occupies;
Then, choose the second resource from the resource significance level sequence and search among the first-class server nodes for several second-class server nodes. The criterion for a second-class server node is that its idle capacity for the second resource is greater than the capacity of the second resource that the program occupies;
Then, choose the third resource from the resource significance level sequence and search among the second-class server nodes for several third-class server nodes. The criterion for a third-class server node is that its idle capacity for the third resource is greater than the capacity of the third resource that the program occupies;
Then, choose the fourth resource from the resource significance level sequence and search among the third-class server nodes for several fourth-class server nodes. The criterion for a fourth-class server node is that its idle capacity for the fourth resource is greater than the capacity of the fourth resource that the program occupies;
Then, choose the fifth resource from the resource significance level sequence and search among the fourth-class server nodes for several fifth-class server nodes. The criterion for a fifth-class server node is that its idle capacity for the fifth resource is greater than the capacity of the fifth resource that the program occupies;
Finally, store the names of all fifth-class server nodes in the server list. The nodes in the server list are the nodes to which the program can be mapped.
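The five successive searches amount to narrowing one candidate set resource by resource, in the order of the program's resource significance level sequence. A minimal sketch follows; the node names, resource keys, and numbers are hypothetical, and the early exit when no candidates remain is a convenience added here rather than something the patent describes.

```python
def find_candidate_nodes(idle_by_node, demand, sequence):
    """Filter nodes one resource at a time, most important resource first.

    A node survives a round only if its idle capacity for that resource
    is strictly greater than what the program occupies."""
    candidates = list(idle_by_node)
    for resource in sequence:
        candidates = [
            node for node in candidates
            if idle_by_node[node][resource] > demand[resource]
        ]
        if not candidates:  # nothing left to filter
            break
    return candidates


idle_by_node = {
    "node-a": {"cpu": 4, "disk_read": 100, "disk_write": 80, "memory": 32, "network": 500},
    "node-b": {"cpu": 6, "disk_read": 30, "disk_write": 90, "memory": 48, "network": 800},
    "node-c": {"cpu": 2, "disk_read": 120, "disk_write": 60, "memory": 16, "network": 300},
}
demand = {"cpu": 2, "disk_read": 50, "disk_write": 40, "memory": 8, "network": 100}
sequence = ["disk_read", "disk_write", "memory", "cpu", "network"]

server_list = find_candidate_nodes(idle_by_node, demand, sequence)
```

In this example node-b drops out in the first round (disk read) and node-c in the fourth round (its 2 idle cores are not strictly greater than the 2 cores demanded), leaving node-a in the server list.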
The advantages of scheduling based on resource significance level in the present invention are:
1. The present invention analyzes the important resources of each program in a targeted way and schedules on the basis of those important resources. This avoids the drawback of scheduling blindly on processor and memory resources alone, and guarantees the performance of disk-intensive and network-intensive programs.
2. The present invention searches for an execution environment for each program according to that program's own resource ordering, which can maximize the utility of the resources in the data center.
Brief description of the drawings
Fig. 1 is a schematic diagram of obtaining the resource significance level.
Fig. 2 is a flowchart of the scheduling strategy.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and an embodiment.
Fig. 1 illustrates how the significance level of the processor resource to the program Data Caching is obtained. The processor has 8 cores; 8 cores is set as the maximum resource limit point and 2 cores as the minimum resource limit point. The ratio of the program's performance when it runs on 2 cores to its performance when it runs on 8 cores is the required significance level of the processor resource to the program. In this example, the importance value of the processor resource is 0.92. The larger the importance value of a resource to the program, the less sensitive program performance is to changes in the amount of that resource, and the less important the resource is to the program. The smaller the importance value, the more sensitive program performance is to changes in the amount of that resource, and the more important the resource is to the program. In the same way, the importance values of the processor, disk read, disk write, memory, and network bandwidth resources to the program can be calculated; sorting them in ascending order gives the resource significance level sequence of the program Data Caching.
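Written as a formula, the example of Fig. 1 reads as follows. The symbols are notation introduced here for illustration, not taken from the patent: I_r is the importance value of resource r, and P(x) is the measured performance of the program when the resource is limited to x.

```latex
I_r = \frac{P(r_{\min})}{P(r_{\max})}, \qquad
I_{\text{cpu}} = \frac{P(2\ \text{cores})}{P(8\ \text{cores})} = 0.92
```

A value of 0.92 means the program keeps 92% of its unrestricted performance when the processor allocation is cut to a quarter, so the processor is a comparatively unimportant resource for this program.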
Fig. 2 describes the scheduling strategy.
One: idle resource information collection
On each node, the performance analysis tool collectl is used to obtain the resource usage of the current node in the five resource dimensions: processor, disk read, disk write, memory, and network bandwidth. The idle resources equal the total resources of the server node minus the resources already in use. The idle resources on the current node are periodically reported to the predictor.
Two: resource sorting
The resource significance level sequence is obtained as described for Fig. 1.
Three: scheduling
Each program in the task queue carries two groups of prior information: the program's resource usage in the five resource dimensions (processor, disk read, disk write, memory, and network bandwidth), and the program's resource significance level sequence.
The first resource is chosen from the resource significance level sequence, the capacity of that resource occupied by the program is looked up, and the server node list is filtered against that criterion. This step is repeated for each subsequent resource until every resource in the resource significance level sequence has been processed, at which point the search stops. The nodes remaining in the server list are the nodes to which the program can be mapped, and the mapping is then performed.
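The scheduling stage for a whole task queue, with each program carrying its two groups of prior information, might be sketched as follows. The `ProgramInfo` class, the node names, and the numbers are hypothetical. Picking the first node in the server list is an assumption, since the patent only says a node is selected from the list, and the sketch does not update idle capacities after a placement because the patent relies on periodic reports for that.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ProgramInfo:
    """Prior information carried by each queued program."""
    name: str
    demand: dict    # capacity occupied per resource
    sequence: list  # resource significance level sequence


def schedule_queue(queue, idle_by_node):
    """Map each queued program to a node, or to None if no node qualifies."""
    placement = {}
    while queue:
        prog = queue.popleft()
        server_list = list(idle_by_node)
        for resource in prog.sequence:  # most important resource first
            server_list = [
                n for n in server_list
                if idle_by_node[n][resource] > prog.demand[resource]
            ]
        # Assumption: take the first surviving node.
        placement[prog.name] = server_list[0] if server_list else None
    return placement


idle_by_node = {
    "node-a": {"cpu": 2, "disk_read": 100, "disk_write": 80, "memory": 32, "network": 500},
    "node-b": {"cpu": 6, "disk_read": 30, "disk_write": 20, "memory": 48, "network": 800},
}
queue = deque([
    ProgramInfo("io-heavy",
                {"cpu": 1, "disk_read": 60, "disk_write": 40, "memory": 8, "network": 100},
                ["disk_read", "disk_write", "memory", "cpu", "network"]),
    ProgramInfo("cpu-heavy",
                {"cpu": 4, "disk_read": 10, "disk_write": 5, "memory": 16, "network": 50},
                ["cpu", "memory", "network", "disk_read", "disk_write"]),
])

placement = schedule_queue(queue, idle_by_node)
```

Because each program is filtered in its own resource order, the disk-bound program is matched against disk bandwidth first and the processor-bound program against cores first.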
Although specific embodiments of the present invention have been described above with reference to the drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort still fall within the scope of protection of the invention.

Claims (8)

1. A cluster program scheduling method based on resource significance level, characterized in that five resources are considered when scheduling (processor, disk read, disk write, memory, and network bandwidth), the method comprising the following steps:
Step (1): resource collection: obtain the idle resource information of the five resources: processor, disk read, disk write, memory, and network bandwidth;
Step (2): resource sorting: first, calculate the significance level of each resource to the program; then, for each program in the task queue, sort the resources according to their significance level to the program, obtaining the resource significance level sequence of each program;
Step (3): scheduling: for each program in the task queue, choose the first resource from the resource significance level sequence and search for server nodes for that first resource; the criterion is that the idle capacity of a server node for the first resource is greater than the capacity of the first resource that the program occupies;
Then, from the server nodes found for the first resource, search for server nodes that satisfy the demand for the second resource; likewise, the criterion is that the idle capacity of a server node for the second resource is greater than the capacity of the second resource that the program occupies;
And so on, until the server nodes that satisfy the demand for the last resource in the sequence have been found; the server nodes found for the last resource are stored in the server list;
When the program is executed, a server node is selected directly from the server list to run the program;
The significance level of a resource to a program is calculated as follows:
Step (201): for each resource, set the amount of the resource that the program uses when it runs without any resource restriction as the maximum resource limit point, and set one quarter of the maximum resource limit point as the minimum resource limit point;
Step (202): obtain the performance of the program at the maximum resource limit point and at the minimum resource limit point;
Step (203): calculate the ratio of the program's performance at the minimum resource limit point to its performance at the maximum resource limit point; this performance ratio is the importance value of the resource to the program; the larger the importance value, the less sensitive program performance is to changes in the amount of the resource, and the less important the resource is to the program; the smaller the importance value, the more sensitive program performance is to changes in the amount of the resource, and the more important the resource is to the program; sorting the resources in ascending order of importance value yields the program's resource significance level sequence.
2. The cluster program scheduling method based on resource significance level as claimed in claim 1, characterized in that step (1) comprises:
Step (101): obtain the used capacity of the five resources (processor, disk read, disk write, memory, and network bandwidth) on the server node;
Step (102): subtract the used capacity of each of the five resources from its total capacity to obtain the idle resource information of each resource;
Step (103): periodically report the current idle resource information.
3. The cluster program scheduling method based on resource significance level as claimed in claim 2, characterized in that in step (101) the performance analysis tool collectl is used to obtain the used capacity of the five resources (processor, disk read, disk write, memory, and network bandwidth) on the server node.
4. The cluster program scheduling method based on resource significance level as claimed in claim 1, characterized in that in step (202) the resource limiting tool Cgroups is used to restrict the resource, and the performance of the program at the maximum resource limit point and at the minimum resource limit point is obtained.
5. The cluster program scheduling method based on resource significance level as claimed in claim 1, characterized in that the ratio of the program's performance at the minimum resource limit point to its performance at the maximum resource limit point is a constant in the interval [0,1].
6. The cluster program scheduling method based on resource significance level as claimed in claim 1, characterized in that the prior information of each program comprises: the capacity of each resource that the program occupies, and the program's resource significance level sequence.
7. The cluster program scheduling method based on resource significance level as claimed in claim 6, characterized in that the resources comprise: processor, disk read, disk write, memory, and network bandwidth.
8. The cluster program scheduling method based on resource significance level as claimed in claim 1, characterized in that step (3) comprises:
For each program in the task queue, choose the first resource from the resource significance level sequence and search for several first-class server nodes; the criterion for a first-class server node is that its idle capacity for the first resource is greater than the capacity of the first resource that the program occupies;
Then, choose the second resource from the resource significance level sequence and search among the first-class server nodes for several second-class server nodes; the criterion for a second-class server node is that its idle capacity for the second resource is greater than the capacity of the second resource that the program occupies;
Then, choose the third resource from the resource significance level sequence and search among the second-class server nodes for several third-class server nodes; the criterion for a third-class server node is that its idle capacity for the third resource is greater than the capacity of the third resource that the program occupies;
Then, choose the fourth resource from the resource significance level sequence and search among the third-class server nodes for several fourth-class server nodes; the criterion for a fourth-class server node is that its idle capacity for the fourth resource is greater than the capacity of the fourth resource that the program occupies;
Then, choose the fifth resource from the resource significance level sequence and search among the fourth-class server nodes for several fifth-class server nodes; the criterion for a fifth-class server node is that its idle capacity for the fifth resource is greater than the capacity of the fifth resource that the program occupies;
Finally, store the names of all fifth-class server nodes in the server list; the nodes in the server list are the nodes to which the program can be mapped.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710462836.7A CN107291539B (en) 2017-06-19 2017-06-19 Cluster program scheduler method based on resource significance level

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710462836.7A CN107291539B (en) 2017-06-19 2017-06-19 Cluster program scheduler method based on resource significance level

Publications (2)

Publication Number Publication Date
CN107291539A CN107291539A (en) 2017-10-24
CN107291539B true CN107291539B (en) 2019-11-01

Family

ID=60096816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710462836.7A Active CN107291539B (en) 2017-06-19 2017-06-19 Cluster program scheduler method based on resource significance level

Country Status (1)

Country Link
CN (1) CN107291539B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111475277A (en) * 2019-01-23 2020-07-31 阿里巴巴集团控股有限公司 Resource allocation method, system, equipment and machine readable storage medium
CN110457138A (en) * 2019-08-20 2019-11-15 网易(杭州)网络有限公司 Management method, device and the electronic equipment of game server cluster
CN110740070B (en) * 2019-11-22 2022-07-01 国网四川省电力公司经济技术研究院 Intelligent power grid station bandwidth estimation method based on multivariate nonlinear fitting
CN113377521B (en) * 2020-02-25 2024-01-30 先智云端数据股份有限公司 Method for establishing system resource prediction and management model through multi-level correlation
CN116048773A (en) * 2022-10-25 2023-05-02 北京京航计算通讯研究所 Distributed collaborative task assignment method and system based on wave function collapse

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019853A (en) * 2012-11-19 2013-04-03 北京亿赞普网络技术有限公司 Method and device for dispatching job task
CN105430027A (en) * 2014-09-04 2016-03-23 中国石油化工股份有限公司 Load balance dynamic pre-allocating method based on a plurality of resource scales
CN105630575A (en) * 2015-12-23 2016-06-01 一兰云联科技股份有限公司 Performance evaluation method aiming at KVM virtualization server

Also Published As

Publication number Publication date
CN107291539A (en) 2017-10-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant