CN106293933A - Cluster resource allocation and scheduling method supporting multiple big data computing frameworks - Google Patents

Cluster resource allocation and scheduling method supporting multiple big data computing frameworks

Info

Publication number
CN106293933A
Authority
CN
China
Prior art keywords
resource
computational frame
calculating
computational
accounting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201511000709.2A
Other languages
Chinese (zh)
Inventor
张京梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dian Zan Science And Technology Ltd
Original Assignee
Beijing Dian Zan Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dian Zan Science And Technology Ltd
Priority to CN201511000709.2A
Publication of CN106293933A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system

Abstract

The invention discloses a cluster resource allocation and scheduling method supporting multiple big data computing frameworks, comprising the following steps: a master control node collects all tradable computing resources of the computing nodes and pushes them to the framework schedulers, and each framework scheduler decides, by way of a contract transaction, whether to accept and use the resources; if a computing framework accepts the allocated resources, its own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master control node, and the corresponding framework executor is started to perform the computing tasks; if a computing framework refuses, resources are reallocated and resource transaction information continues to be sent to the computing frameworks; fine-grained fair allocation scheduling is performed over multiple computing resource types, the resource allocation for each computing framework is determined by that framework's dominant resource (the resource type it relies on most heavily), and the dominant-resource shares obtained by the frameworks should be as equal as possible. The method improves the overall resource utilization of the cluster and the reliability and scalability of the computing services.

Description

Cluster resource allocation and scheduling method supporting multiple big data computing frameworks
Technical field
The invention relates to the allocation and scheduling of computing resources in a computing cluster, and more particularly to a cluster resource allocation and scheduling method that supports multiple big data computing frameworks.
Background
With the development of industries such as cloud computing and big data, more and more data centers are being built, and their operators need more effective ways to cut data center costs. The data centers of companies such as Facebook, Google, and Amazon are expanding rapidly, and these companies are also looking for technologies that reduce the cost of building, maintaining, and upgrading their data centers.
A data center is a network of specialized equipment that cooperates globally to transmit, accelerate, display, compute, and store data over Internet infrastructure. Most electronic components in a data center are driven by low-voltage DC power supplies. The physical problems a data center faces concern the servers themselves and the cables used to connect these servers to other environments.
A cluster is a computer system in which a group of loosely integrated computers, connected through software and/or hardware, cooperate closely to carry out computing work. In a sense, they can be regarded as a single computer. The individual computers in a cluster system are usually called nodes and are usually connected by a local area network, although other connection methods are possible. Clusters are typically used to improve on the computing speed and/or reliability of a single computer. In general, a cluster offers a much better price/performance ratio than a single computer such as a workstation or a supercomputer.
A big data computing framework is the runtime and programming framework of a distributed computing system for processing big data. For example, Storm is a distributed real-time computation system for processing high-velocity, large data streams, and it adds reliable real-time data processing capability to Hadoop. Spark uses in-memory computing: starting from multi-iteration batch processing, it allows data to be loaded into memory and queried repeatedly, and it also combines several computing paradigms such as data warehousing, stream processing, and graph computation. Spark is built on HDFS and integrates well with Hadoop. Hadoop is used for batch processing of massive data and for offline data processing; it is one of the current standards for big data computing, is used in many commercial application systems, and can easily integrate structured, semi-structured, and even unstructured data sets.
Currently popular virtualization technology lets multiple applications or virtual machines share one machine to raise the utilization of server resources. However, this sharing brings resource contention, which interferes with application performance and affects the response time of online applications. Fast service response time is a key indicator of service quality and is key to satisfying and retaining users. This approach therefore inevitably hurts customer satisfaction and lowers service quality.
To guarantee service quality, current data centers over-provision resources at the expense of resource utilization. The waste takes two forms. One is that critical online applications monopolize a data center: for example, one data center is dedicated to running one or a few online applications while other jobs run in other data centers, to reduce interference with online services. The other is exaggerated resource demand.
Computing clusters in data centers have become the main computing platform on which big data relies. As distributed computing has developed, various big data computing frameworks have been created to solve different business problems, such as Hadoop for large-scale offline batch processing and Storm for real-time stream computation. These frameworks address developers' basic requirements for distributed computing, including scalability and fault tolerance. To keep up with business innovation, new computing frameworks continue to appear, and enterprises and organizations need to run multiple computing frameworks on the same computing cluster, meeting business needs through a combination of frameworks.
Existing solutions for sharing a computing cluster are mainly of two kinds:
1) Statically partition the computing resources of the data center, with the computing cluster in each partition designated to run one computing framework, for example a Hadoop cluster, a Spark cluster, or a Storm cluster;
2) Manage all computing resources through cloud Infrastructure as a Service and use virtualization technology, such as KVM, to provide each computing framework with a group of virtual machines.
These solutions have the following disadvantages:
1) Different computing frameworks emphasize different computing resources; static partitioning leads to low overall resource utilization, poor scalability and reliability, and high maintenance cost;
2) The specific scheduling requirements of enterprise users are not considered;
3) New computing frameworks that appear in the future cannot be supported;
4) The placement of computing capacity relative to data storage is not optimized: different computing frameworks cannot share the same local data source, network transfer load is high, and computation cannot be moved to the data to improve data access efficiency.
The root cause of these shortcomings is that static partitioning and virtual machine schemes allocate distributed computing resources at a different granularity from existing big data computing frameworks. The frameworks generally use a fine-grained resource sharing model in which a single computing node can run multiple computing tasks at the same time, improving resource utilization and data access efficiency. Because these frameworks are developed independently, existing schemes cannot achieve fine-grained resource sharing between different computing frameworks.
Summary of the invention
In view of the above technical problems, the invention aims to provide a cluster resource allocation and scheduling method supporting multiple big data computing frameworks. Through a unified set of interfaces, different big data computing frameworks can access the computing resources of the cluster; dynamic allocation and contract transactions achieve fine-grained sharing of computing resources between frameworks; and the extensible allocation method can be adapted to the business needs of different enterprises.
To achieve the above object, the technical solution of the invention is:
A cluster resource allocation and scheduling method supporting multiple big data computing frameworks, characterized by comprising the following steps:
S01: Add a corresponding framework scheduler for each computing framework and deploy it into the overall system; the master control node collects all tradable computing resources of the computing nodes and pushes them to the framework schedulers, and each framework scheduler decides, by way of a contract transaction, whether to accept and use the resources;
S02: If a computing framework accepts the allocated resources, scheduling proceeds to the second layer: the framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master control node, which then notifies the corresponding computing nodes to start the corresponding framework executors to perform the computing tasks;
S03: If a computing framework refuses the currently allocated computing resources, the master control node reallocates resources and continues to send resource transaction information to the computing frameworks;
S04: Perform fine-grained fair allocation scheduling over multiple computing resource types: the resource allocation for each computing framework is determined by that framework's dominant resource (the resource type it relies on most heavily); among the shares of the various computing resources that a framework obtains, the share of its dominant resource should be the largest, and the dominant-resource shares obtained by the frameworks should be as equal as possible.
Preferably, step S04 comprises:
S11: Query the registered framework schedulers for the computing resource vector required by a single computing task, and compute, for each resource in the vector, its share of the cluster's total amount of that resource;
S12: Sort the shares of all resources; the resource with the largest share is the dominant resource. If a newly registered computing framework exists, repeat step S11; otherwise continue;
S13: Compute the dominant-resource share already allocated to each computing framework, sort the frameworks by that share, and allocate resources to the framework with the smallest share; when all resources required by this framework have been satisfied, remove the framework and proceed to the next round of allocation;
S14: Repeat step S13 until all cluster computing resources have been allocated.
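Steps S11 to S14 can be written compactly in symbols. The notation below is an assumption introduced only for illustration and does not appear in the patent: $C_j$ is the cluster's total capacity of resource type $j$, $d_{i,j}$ is the amount of resource $j$ that one task of framework $i$ needs, and $n_i$ is the number of tasks currently granted to framework $i$.

```latex
% Share of resource j held by framework i (S11, scaled by the tasks granted so far)
s_{i,j} = \frac{n_i \, d_{i,j}}{C_j}

% Dominant resource of framework i: the resource with the largest per-task share (S12)
j_i^{*} = \arg\max_{j} \frac{d_{i,j}}{C_j}

% Allocated dominant-resource share of framework i (S13)
s_i = \max_{j} s_{i,j} = \frac{n_i \, d_{i,j_i^{*}}}{C_{j_i^{*}}}

% Fairness goal of S04: make the dominant shares as equal as possible,
% subject to the capacity of every resource type
\max_{n_1,\dots,n_N} \; \min_{i} \, s_i
\quad \text{subject to} \quad \sum_{i=1}^{N} n_i \, d_{i,j} \le C_j \;\; \text{for all } j
```

Read this way, S13 is a greedy step toward the max-min goal: each round grants resources to the framework whose $s_i$ is currently smallest.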
Compared with the prior art, the beneficial effects of the invention are:
1. Through the two-layer scheduling architecture and contract transactions, the invention lets multiple big data computing frameworks share the computing resources of a cluster and achieves dynamic allocation of cluster resources. Reusing each framework's existing distributed scheduler ensures that new computing frameworks can be supported. The fine-grained fair resource allocation scheduling method satisfies the computing demands of the different frameworks as far as possible and raises the overall resource utilization of the cluster, thereby improving the overall efficiency of the data center.
2. The method can dynamically and efficiently allocate the computing resources of the cluster to different computing frameworks, improving the overall resource utilization of the cluster and the reliability and scalability of the computing services.
Brief description of the drawings
Fig. 1 is the scheduling architecture diagram of the cluster resource allocation and scheduling method supporting multiple big data computing frameworks according to the invention;
Fig. 2 is the resource allocation sequence diagram of the method;
Fig. 3 is the scheduling flow chart of the method.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings. It should be understood that these descriptions are exemplary only and are not intended to limit the scope of the invention. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the invention.
Embodiment:
The technical solution mainly comprises two aspects:
1) A scheduling architecture based on a master/slave two-layer scheduling mechanism and contract transactions
As shown in Fig. 1, N existing distributed big data computing frameworks need to share the cluster's computing resources. A corresponding framework scheduler is added for each computing framework and deployed into the overall system. This scheduler conducts resource contract transactions with the resource allocation module of the master control node and decides, according to the framework's requirements, whether to accept or refuse the allocated computing resources;
There are K master control nodes, which provide service in a load-balanced manner. Each contains a resource allocation module that collects resource usage from every computing node and pushes the corresponding resources to each framework scheduler;
There are M computing nodes. Each reports its local resource usage and, as required, starts the corresponding framework executor to run the framework's computing tasks.
As shown in Fig. 2, the invention uses a two-layer scheduling mechanism. In the first layer, the master control node collects all tradable computing resources of the computing nodes and pushes them to the framework schedulers, and each framework scheduler decides, by way of a contract transaction, whether to accept and use the resources. If a computing framework accepts the allocated resources, scheduling proceeds to the second layer: the framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master control node, which then notifies the corresponding computing nodes to start the corresponding framework executors to perform the computing tasks. If a computing framework refuses the currently allocated computing resources, the master control node reallocates resources and continues to send resource transaction information to the computing frameworks.
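The offer, accept, and refuse exchange described above can be sketched in a few lines of Python. This is a minimal single-process illustration under stated assumptions, not the patented implementation: the names `FrameworkScheduler`, `consider`, and `offer_round` are hypothetical, a refusal is modelled simply as placing zero tasks, and "notify the computing node to start an executor" is reduced to recording the node name.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Resources = Dict[str, float]  # e.g. {"cpu": 4, "mem_gb": 8}


@dataclass
class FrameworkScheduler:
    """Per-framework scheduler (second layer). Hypothetical name and fields."""
    name: str
    task_demand: Resources          # resources needed by one task
    pending_tasks: int
    launched: List[str] = field(default_factory=list)

    def consider(self, offer: Resources) -> int:
        """Return how many tasks to place on the offer; 0 means the offer is refused."""
        fit = min(int(offer.get(r, 0) // need) for r, need in self.task_demand.items())
        return min(fit, self.pending_tasks)


def offer_round(free: Dict[str, Resources], schedulers: List[FrameworkScheduler]) -> None:
    """First layer: the master offers each node's free resources to the schedulers in turn."""
    for node, available in free.items():
        for sched in schedulers:
            n = sched.consider(available)
            if n == 0:
                continue  # refused: the same resources are offered to the next framework
            for r, need in sched.task_demand.items():
                available[r] -= n * need
            sched.pending_tasks -= n
            # Stands in for "notify the node to start the framework executor".
            sched.launched.extend([node] * n)


# Illustrative data only: one node, two frameworks with different demands.
free = {"node1": {"cpu": 4, "mem_gb": 8}}
storm = FrameworkScheduler("storm", {"cpu": 1, "mem_gb": 16}, pending_tasks=3)
spark = FrameworkScheduler("spark", {"cpu": 2, "mem_gb": 4}, pending_tasks=5)
offer_round(free, [storm, spark])
print(storm.launched, spark.launched)
```

In this run the first framework refuses the offer because one of its tasks needs more memory than the node has free, so the same resources go to the second framework, which accepts and places two tasks. The order in which frameworks receive offers is fixed here; in the method described above, that order would come from the fair allocation step.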
2) A fine-grained fair resource allocation scheduling method
In a shared cluster, different big data computing frameworks emphasize different resources: some need large amounts of disk and network capacity, some are memory-based and need large amounts of physical memory, and some are compute-intensive and need large amounts of CPU. Consider a fine-grained fair allocation schedule over multiple computing resource types: the resource allocation for each computing framework should be determined by that framework's dominant resource (the resource type it relies on most heavily), that is, among the shares of the various computing resources the framework obtains (relative to the cluster's total resources), the share of the dominant resource should be the largest. For fairness, the dominant-resource shares obtained by the frameworks should be as equal as possible.
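A small worked example may make the notion of a dominant resource concrete. The sketch below assumes that each framework reports the resource vector of one task and that the cluster totals are known; the function name `dominant_resource` and the numbers are illustrative and are not taken from the patent.

```python
from typing import Dict, Tuple


def dominant_resource(task_demand: Dict[str, float],
                      cluster_total: Dict[str, float]) -> Tuple[str, float]:
    """Return (resource name, share) for the resource with the largest per-task share."""
    shares = {r: task_demand[r] / cluster_total[r] for r in task_demand}
    name = max(shares, key=shares.get)
    return name, shares[name]


# Illustrative cluster: 9 CPU cores and 18 GB of memory in total.
cluster = {"cpu": 9, "mem_gb": 18}
memory_heavy = {"cpu": 1, "mem_gb": 4}   # shares: cpu 1/9, memory 4/18
cpu_heavy = {"cpu": 3, "mem_gb": 1}      # shares: cpu 3/9, memory 1/18

print(dominant_resource(memory_heavy, cluster))
print(dominant_resource(cpu_heavy, cluster))
```

For the first framework, memory is dominant because 4/18 exceeds 1/9; for the second, CPU is dominant because 3/9 exceeds 1/18. Equalizing these dominant shares, rather than any single resource, is what lets frameworks with different emphases be compared fairly.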
The flow of this algorithm is shown in Fig. 3:
S11: Query the registered framework schedulers for the computing resource vector required by a single computing task, and compute, for each resource in the vector, its share of the cluster's total amount of that resource;
S12: Sort the shares of all resources; the resource with the largest share is the dominant resource. If a newly registered computing framework exists, repeat step S11; otherwise continue;
S13: Compute the dominant-resource share already allocated to each computing framework, sort the frameworks by that share, and allocate resources to the framework with the smallest share; when all resources required by this framework have been satisfied, remove the framework and proceed to the next round of allocation;
S14: Repeat step S13 until all cluster computing resources have been allocated.
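The loop in S13 and S14 can be sketched as follows. This is one possible reading of the steps rather than the patented implementation. It assumes resources are granted one task at a time, that every framework declares how many tasks it wants, and that a framework is also dropped once its next task no longer fits (a stopping rule the text does not spell out). The function name `fair_allocate` and its parameters are hypothetical.

```python
from typing import Dict

Resources = Dict[str, float]


def fair_allocate(total: Resources,
                  demands: Dict[str, Resources],
                  wanted: Dict[str, int]) -> Dict[str, int]:
    """Grant tasks one at a time to the framework with the smallest dominant share."""
    free = dict(total)
    tasks = {fw: 0 for fw in demands}
    active = {fw for fw in demands if wanted[fw] > 0}

    def dominant_share(fw: str) -> float:
        # Largest share of any resource type already held by this framework (S13).
        return max(tasks[fw] * need / total[r] for r, need in demands[fw].items())

    while active:  # S14: repeat until nothing more can be allocated
        # Sorting first makes tie-breaking deterministic (alphabetical by name).
        fw = min(sorted(active), key=dominant_share)
        need = demands[fw]
        if any(free[r] < amount for r, amount in need.items()):
            active.discard(fw)  # assumption: drop a framework whose next task cannot fit
            continue
        for r, amount in need.items():
            free[r] -= amount
        tasks[fw] += 1
        if tasks[fw] == wanted[fw]:
            active.discard(fw)  # all of its demand is met, so remove it (S13)
    return tasks


# Illustrative data: 9 CPU cores and 18 GB shared by two frameworks.
result = fair_allocate(
    total={"cpu": 9, "mem_gb": 18},
    demands={"A": {"cpu": 1, "mem_gb": 4}, "B": {"cpu": 3, "mem_gb": 1}},
    wanted={"A": 100, "B": 100},
)
print(result)
```

With these numbers, framework A ends up with three tasks (12 of 18 GB) and framework B with two (6 of 9 cores), so both hold two thirds of their dominant resource. Granting whole tasks keeps the sketch simple; a real allocator would also have to respect per-node placement, which is ignored here.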
It should be understood that the above embodiments are only used to illustrate or explain the principles of the invention and do not limit it. Therefore, any modification, equivalent substitution, improvement, and the like made without departing from the spirit and scope of the invention shall fall within its scope of protection. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or the equivalents of such scope and boundaries.

Claims (2)

1. A cluster resource allocation and scheduling method supporting multiple big data computing frameworks, characterized by comprising the following steps:
S01: Add a corresponding framework scheduler for each computing framework and deploy it into the overall system; the master control node collects all tradable computing resources of the computing nodes and pushes them to the framework schedulers, and each framework scheduler decides, by way of a contract transaction, whether to accept and use the resources;
S02: If a computing framework accepts the allocated resources, scheduling proceeds to the second layer: the framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master control node, which then notifies the corresponding computing nodes to start the corresponding framework executors to perform the computing tasks;
S03: If a computing framework refuses the currently allocated computing resources, the master control node reallocates resources and continues to send resource transaction information to the computing frameworks;
S04: Perform fine-grained fair allocation scheduling over multiple computing resource types: the resource allocation for each computing framework is determined by that framework's dominant resource (the resource type it relies on most heavily); among the shares of the various computing resources that a framework obtains, the share of its dominant resource should be the largest, and the dominant-resource shares obtained by the frameworks should be as equal as possible.
2. The cluster resource allocation and scheduling method supporting multiple big data computing frameworks according to claim 1, characterized in that step S04 comprises:
S11: Query the registered framework schedulers for the computing resource vector required by a single computing task, and compute, for each resource in the vector, its share of the cluster's total amount of that resource;
S12: Sort the shares of all resources; the resource with the largest share is the dominant resource. If a newly registered computing framework exists, repeat step S11; otherwise continue;
S13: Compute the dominant-resource share already allocated to each computing framework, sort the frameworks by that share, and allocate resources to the framework with the smallest share; when all resources required by this framework have been satisfied, remove the framework and proceed to the next round of allocation;
S14: Repeat step S13 until all cluster computing resources have been allocated.
CN201511000709.2A 2015-12-29 2015-12-29 Cluster resource allocation and scheduling method supporting multiple big data computing frameworks Pending CN106293933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511000709.2A CN106293933A (en) 2015-12-29 2015-12-29 Cluster resource allocation and scheduling method supporting multiple big data computing frameworks

Publications (1)

Publication Number Publication Date
CN106293933A true CN106293933A (en) 2017-01-04

Family

ID=57650585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511000709.2A Pending CN106293933A (en) 2015-12-29 2015-12-29 Cluster resource allocation and scheduling method supporting multiple big data computing frameworks

Country Status (1)

Country Link
CN (1) CN106293933A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102866918A (en) * 2012-07-26 2013-01-09 中国科学院信息工程研究所 Resource management system for distributed programming framework
CN103699445A (en) * 2013-12-19 2014-04-02 北京奇艺世纪科技有限公司 Task scheduling method, device and system
CN104965762A (en) * 2015-07-21 2015-10-07 国家计算机网络与信息安全管理中心 Scheduling system oriented to hybrid tasks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
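毛小娃: "Overview of Mesos" (mesos概述), 《HTTPS://WWW.CNBLOGS.COM/XIAOMAOHAI/P/6158061.HTML》 *
胡俊: "Research and implementation of parallelized clustering algorithms in a cluster environment" (集群环境下聚类算法的并行化研究与实现), China Master's Theses Full-text Database, Information Science and Technology *
霍菁 et al.: "Optimization of BESIII cluster resource management with an improved DRF algorithm" (一种改进的DRF算法对BESIII集群资源管理的优化), Nuclear Electronics & Detection Technology *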

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704069A (en) * 2017-06-15 2018-02-16 重庆邮电大学 A kind of Spark energy-saving scheduling methods perceived based on energy consumption
CN107704069B (en) * 2017-06-15 2020-08-04 重庆邮电大学 Spark energy-saving scheduling method based on energy consumption perception
CN107705025A (en) * 2017-10-16 2018-02-16 曙光信息产业(北京)有限公司 Supercomputer and its operating method
CN109976894A (en) * 2019-04-03 2019-07-05 中国科学技术大学苏州研究院 A kind of platform-independent expansible distributed system task schedule braced frame
CN109976894B (en) * 2019-04-03 2023-07-25 中国科学技术大学苏州研究院 Platform-independent extensible distributed system task scheduling support frame
CN112416538A (en) * 2019-08-20 2021-02-26 中国科学院深圳先进技术研究院 Multilayer architecture and management method of distributed resource management framework
CN112150248A (en) * 2020-09-30 2020-12-29 欧冶云商股份有限公司 Method, system and device for counting hung goods amount based on batch flow fusion
CN112698944A (en) * 2020-12-29 2021-04-23 乐陵欧曼电子科技有限公司 Distributed cloud computing system and method based on human brain simulation
CN113326116A (en) * 2021-06-30 2021-08-31 北京九章云极科技有限公司 Data processing method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104
