CN106293933A - Cluster resource configuration and scheduling method supporting multiple big data computing frameworks
- Publication number
- CN106293933A (application CN201511000709.2A)
- Authority
- CN
- China
- Prior art keywords
- resource
- computational frame
- calculating
- computational
- accounting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Abstract
The invention discloses a cluster resource configuration and scheduling method supporting multiple big data computing frameworks, comprising the following steps: a master node collects all tradable computing resources of the compute nodes and pushes them to the computing framework schedulers, and each computing framework scheduler decides, by way of a contract transaction, whether to accept and use the resources; if a computing framework accepts the allocated resources, its own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master node, and the corresponding computing framework executor is started to run the tasks; if the computing framework declines, resources are reallocated and resource transaction information continues to be sent to the computing framework; fine-grained fair allocation scheduling is performed across multiple computing resource types, where the allocation to each computing framework is determined by that framework's dominant resource (the resource the framework relies on most), and the dominant-resource shares obtained by the computing frameworks should be as equal as possible. The method improves the overall resource utilization of the cluster and the reliability and scalability of computing services.
Description
Technical field
The invention relates to the configuration and scheduling of computing resources in a computing cluster, and more particularly to a cluster resource configuration and scheduling method supporting multiple big data computing frameworks.
Background
With the development of industries such as cloud computing and big data, more and more data centers are being built, and their operators need more effective ways to cut data center costs. The data centers of companies such as Facebook, Google and Amazon are expanding rapidly, and these companies are also looking for technologies that reduce the cost of building, maintaining and upgrading data centers.
A data center is a specific network of devices for global collaboration, used to transmit, accelerate, display, compute and store data over Internet infrastructure. Most electronic components in a data center are driven by low-voltage DC power supplies. The physical problems a data center faces are the servers themselves and the cables that connect these servers to other environments.
A cluster is a computer system in which a group of loosely integrated computer software and/or hardware components are connected so that they cooperate closely on computing work. In a sense, a cluster can be regarded as a single computer. Each computer in a cluster is usually called a node, and nodes are usually connected through a LAN, although other connection methods are possible. Clusters are commonly used to improve on the computing speed and/or reliability of a single computer. In general, a cluster offers a much better price/performance ratio than a single computer such as a workstation or a supercomputer.
A big data computing framework is a runtime and programming framework for distributed computing systems that process big data. For example, Storm is a distributed real-time computation system for processing high-speed, large data streams, and it adds reliable real-time data processing to Hadoop. Spark uses in-memory computing: starting from multi-iteration batch processing, it allows data to be loaded into memory and queried repeatedly, and it also integrates several computing paradigms such as data warehousing, stream processing and graph computation. Spark can run on top of HDFS and integrates well with Hadoop. Hadoop is used for batch processing of massive data and for offline data processing; it is one of the current big data computing standards, is used in many commercial application systems, and can easily integrate structured, semi-structured and even unstructured data sets.
Virtualization technology, which is currently popular, lets multiple applications or virtual machines share one machine to improve the utilization of server resources. However, this sharing causes resource contention, which interferes with application performance and lengthens the response time of online applications. Fast service response time is a key indicator of service quality and is key to satisfying and retaining users. This approach therefore inevitably affects customer satisfaction and reduces service quality.
To guarantee service quality, current data centers over-provision resources, at the expense of resource utilization. The waste of resources takes two forms. In the first, critical online applications monopolize a data center: for example, a data center is dedicated to running one or a few online applications while other jobs run in other data centers, to reduce interference with online services. In the second, resource requirements are exaggerated.
Computing clusters in data centers have become the main computing platform on which big data relies. As distributed computing has developed, various big data computing frameworks have been created to solve different business problems, such as Hadoop for large-scale offline batch processing and Storm for real-time stream computing. These frameworks meet developers' basic requirements for distributed computing, including scalability and fault tolerance. To adapt to business innovation, new computing frameworks keep appearing, and enterprises and organizations need to run several computing frameworks on the same computing cluster, combining them to meet business demands.
There are two main existing solutions for sharing a computing cluster:
1) statically partitioning the computing resources of the data center, with the computing cluster of each partition dedicated to one computing framework, for example a Hadoop cluster, a Spark cluster or a Storm cluster;
2) managing all computing resources through cloud Infrastructure as a Service (IaaS), and providing each computing framework with a set of virtual machines through a virtualization technology such as KVM.
The above solutions have the following disadvantages:
1) different computing frameworks place different emphasis on different computing resources, so static partitioning leads to low overall resource utilization, poor scalability and reliability, and high maintenance cost;
2) the specific scheduling requirements of enterprise users are not considered;
3) new computing frameworks that appear in the future cannot be supported;
4) co-location of computing capacity and data storage is not optimized: different computing frameworks cannot share the same local data source, network transmission load is high, and computation cannot be moved to the data to improve data access efficiency.
The root cause of these shortcomings is that static partitioning and virtual machine schemes allocate distributed computing resources at a different granularity from existing big data computing frameworks. The frameworks generally use a fine-grained resource sharing model, in which a single compute node runs several computing tasks at the same time to improve resource utilization and data access efficiency. These computing frameworks are all developed independently, and existing schemes cannot achieve fine-grained resource sharing between different computing frameworks.
Summary of the invention
To address the above technical problems, the invention aims to provide a cluster resource configuration and scheduling method supporting multiple big data computing frameworks. Through a unified set of interfaces, different big data computing frameworks can access the computing resources of the cluster; dynamic allocation and contract transactions give different computing frameworks fine-grained sharing of computing resources; and the extensible allocation scheme can adapt to the business needs of different enterprises.
To achieve the above object, the technical solution of the invention is as follows:
A cluster resource configuration and scheduling method supporting multiple big data computing frameworks, characterized by comprising the following steps:
S01: add a corresponding computing framework scheduler for each computing framework and deploy it into the overall system; the master node collects all tradable computing resources of the compute nodes and pushes them to the computing framework schedulers, and the corresponding computing framework scheduler decides, by way of a contract transaction, whether to accept and use the resources;
S02: if the computing framework accepts the allocated resources, scheduling proceeds to the second layer: the computing framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master node, and the master node then notifies the corresponding compute nodes to start the corresponding computing framework executors to run the computing tasks;
S03: if the computing framework declines the currently allocated computing resources, the master node reallocates resources and continues to send resource transaction information to the computing frameworks;
S04: fine-grained fair allocation scheduling is performed across multiple computing resource types; the resource allocation of each computing framework is determined by that framework's dominant resource; among the shares of the various computing resources obtained by each computing framework, the share of the dominant resource should be the largest, and the dominant-resource shares obtained by the computing frameworks should be as equal as possible.
Preferably, step S04 comprises:
S11: query the registered computing framework schedulers for the computing resource vector required by a single computing task, and compute the share of each resource in the vector relative to the total resources of the cluster;
S12: sort the shares of all resources; the resource with the largest share is the dominant resource; if a new computing framework has registered, repeat step S11; otherwise continue;
S13: compute the share of the dominant resource already allocated to each computing framework, sort the frameworks by dominant-resource share, and allocate resources to the framework with the smallest share; when all the resources needed by that framework are satisfied, remove the framework and carry out the next round of allocation;
S14: repeat step S13 until all cluster computing resources have been allocated.
Compared with the prior art, the invention has the following beneficial effects:
1. Through a two-layer scheduling architecture and contract transactions, the invention lets multiple big data computing frameworks share the computing resources of a cluster and achieves dynamic allocation of cluster resources. Reusing each computing framework's existing distributed scheduler ensures support for new computing frameworks. The fine-grained fair resource allocation scheduling method satisfies the computing demands of different computing frameworks as far as possible and improves the overall resource utilization of the cluster, and thus the overall efficiency of the data center.
2. The method dynamically and efficiently allocates the computing resources of the cluster to different computing frameworks, improving the overall resource utilization of the cluster and the reliability and scalability of computing services.
Brief description of the drawings
Fig. 1 is the scheduling architecture diagram of the cluster resource configuration and scheduling method supporting multiple big data computing frameworks according to the invention;
Fig. 2 is the resource allocation sequence diagram of the cluster resource configuration and scheduling method supporting multiple big data computing frameworks according to the invention;
Fig. 3 is the scheduling flowchart of the cluster resource configuration and scheduling method supporting multiple big data computing frameworks according to the invention.
Detailed description of the invention
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings. These descriptions are only exemplary and are not intended to limit the scope of the invention. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the invention.
Embodiment:
The technical solution mainly comprises two aspects:
1) A scheduling architecture based on a master/slave two-layer scheduling mechanism and contract transactions
As shown in Fig. 1, N existing distributed big data computing frameworks need to share the computing resources of the cluster. A corresponding computing framework scheduler must be added for each computing framework and deployed into the overall system. This scheduler conducts resource contract transactions with the resource allocation module of the master node and decides, according to the requirements of its computing framework, whether to accept or decline the allocated computing resources;
There are K master nodes, which provide service through load balancing. Each master node contains a resource allocation module that collects resource usage from every compute node and pushes the corresponding resources to each computing framework scheduler;
There are M compute nodes. Each compute node reports its local resource usage and, as required, starts the corresponding computing framework executor to run the computing tasks of that computing framework.
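The reporting path described above (compute nodes report usage, the master node's resource allocation module derives what can be offered) can be sketched as follows. This is only an illustrative sketch: the report layout, the resource names `cpu` and `mem_gb`, and the function names are assumptions, not part of the patent.

```python
def tradable_resources(reports):
    """Per compute node, return total minus used for every resource type.

    `reports` maps a node name to {"total": {...}, "used": {...}}.
    The layout is a hypothetical stand-in for the usage reports that
    compute nodes send to the master node.
    """
    free = {}
    for node, report in reports.items():
        free[node] = {
            res: report["total"][res] - report["used"].get(res, 0)
            for res in report["total"]
        }
    return free


def cluster_totals(reports):
    """Sum the total capacity of each resource type over all compute nodes."""
    totals = {}
    for report in reports.values():
        for res, amount in report["total"].items():
            totals[res] = totals.get(res, 0) + amount
    return totals
```

The per-node free amounts are what the master node would push to the computing framework schedulers, while the cluster-wide totals are the denominator needed later when resource shares are computed.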
As shown in Fig. 2, the invention uses a two-layer scheduling mechanism. In the first layer, the master node collects all tradable computing resources of the compute nodes and pushes them to the computing framework schedulers, and the corresponding computing framework scheduler decides, by way of a contract transaction, whether to accept and use the resources. If the computing framework accepts the allocated resources, scheduling proceeds to the second layer: the computing framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master node, and the master node then notifies the corresponding compute nodes to start the corresponding computing framework executors to run the computing tasks. If the computing framework declines the currently allocated computing resources, the master node reallocates resources and continues to send resource transaction information to the computing frameworks.
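A minimal sketch of one offer round under this two-layer mechanism is given below. All class, method and field names (`Offer`, `FrameworkScheduler`, `MasterNode`, `accepts`, `place_task`) are hypothetical, and the acceptance rule (accept when the offer fits one pending task) is an assumption chosen for illustration; the patent leaves the acceptance policy to each computing framework.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Tradable resources on one compute node, as pushed by the master node."""
    node: str
    cpus: int
    mem_gb: int


class FrameworkScheduler:
    """Hypothetical per-framework scheduler registered with the master node."""

    def __init__(self, name, task_cpus, task_mem_gb, pending_tasks):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_gb = task_mem_gb
        self.pending_tasks = pending_tasks

    def accepts(self, offer):
        # Layer 1: accept only if work is pending and one task fits the offer.
        return (self.pending_tasks > 0
                and offer.cpus >= self.task_cpus
                and offer.mem_gb >= self.task_mem_gb)

    def place_task(self, offer):
        # Layer 2: the framework's own scheduler binds a task to the resources.
        self.pending_tasks -= 1
        return {"framework": self.name, "node": offer.node,
                "cpus": self.task_cpus, "mem_gb": self.task_mem_gb}


class MasterNode:
    def __init__(self, free):
        # free: node name -> (free cpus, free memory in GB)
        self.free = dict(free)

    def offer_round(self, schedulers):
        """Offer each node's free resources to the schedulers in turn."""
        launched = []
        for node in list(self.free):
            cpus, mem_gb = self.free[node]
            offer = Offer(node, cpus, mem_gb)
            for scheduler in schedulers:
                if not scheduler.accepts(offer):
                    continue  # declined: re-offer the same resources to the next framework
                task = scheduler.place_task(offer)
                self.free[node] = (cpus - task["cpus"], mem_gb - task["mem_gb"])
                launched.append(task)  # the master would now ask the node to start the executor
                break
        return launched
```

In this sketch a declined offer is simply passed to the next scheduler, which is one possible reading of "the master node reallocates resources"; a real master would also order the schedulers by a fairness policy rather than by list position.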
2) A fine-grained fair resource allocation scheduling method
In a shared cluster, different big data computing frameworks place different emphasis on resources: some need large amounts of disk and network capacity, some are memory-based and need large amounts of physical memory, and some are compute-intensive and need large amounts of CPU. Consider fine-grained fair allocation scheduling across multiple computing resource types. The resource allocation of each computing framework should be determined by that framework's dominant resource (the resource the framework relies on most): among the shares of the various computing resources obtained by each computing framework (relative to the total resources of the cluster), the share of the dominant resource should be the largest. For fairness, the dominant-resource shares obtained by the computing frameworks should be as equal as possible.
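To make the notion of a per-resource share concrete, the sketch below computes the share of each resource that a single task demands and picks the largest one. The cluster size, the demand vectors and the function names are hypothetical values chosen for illustration.

```python
from fractions import Fraction


def resource_shares(task_demand, cluster_total):
    """Share of each resource in one task's demand, relative to the whole cluster."""
    return {res: Fraction(amount, cluster_total[res])
            for res, amount in task_demand.items()}


def dominant_resource(task_demand, cluster_total):
    """Return (resource name, share) for the resource with the largest share."""
    shares = resource_shares(task_demand, cluster_total)
    name = max(shares, key=shares.get)
    return name, shares[name]


# Hypothetical cluster: 9 CPUs and 18 GB of memory in total.
cluster = {"cpu": 9, "mem_gb": 18}
print(dominant_resource({"cpu": 1, "mem_gb": 4}, cluster))  # ('mem_gb', Fraction(2, 9))
print(dominant_resource({"cpu": 3, "mem_gb": 1}, cluster))  # ('cpu', Fraction(1, 3))
```

The first demand vector is memory-heavy (4/18 of memory versus 1/9 of CPU), the second is CPU-heavy (3/9 of CPU versus 1/18 of memory), so the two hypothetical frameworks would be compared on different resources when fairness is evaluated.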
The flowchart of this algorithm is shown in Fig. 3:
S11: query the registered computing framework schedulers for the computing resource vector required by a single computing task, and compute the share of each resource in the vector relative to the total resources of the cluster;
S12: sort the shares of all resources; the resource with the largest share is the dominant resource; if a new computing framework has registered, repeat step S11; otherwise continue;
S13: compute the share of the dominant resource already allocated to each computing framework, sort the frameworks by dominant-resource share, and allocate resources to the framework with the smallest share; when all the resources needed by that framework are satisfied, remove the framework and carry out the next round of allocation;
S14: repeat step S13 until all cluster computing resources have been allocated.
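Steps S11 to S14 can be sketched as a simple loop. This is one possible reading, not the patent's implementation: the function name, the `max_tasks` field and the stopping rule (a framework is removed once its task count is reached or its next task no longer fits, and the loop ends when no framework remains) are assumptions, and the registration check of S12 is omitted because the set of frameworks is fixed here.

```python
from fractions import Fraction


def fair_allocate(cluster_total, frameworks):
    """Allocate one task at a time to the framework with the smallest dominant share.

    frameworks: name -> {"demand": {resource: amount per task}, "max_tasks": int}
    Returns (tasks launched per framework, resources left over).
    """
    free = dict(cluster_total)
    allocated = {name: {res: 0 for res in cluster_total} for name in frameworks}
    tasks = {name: 0 for name in frameworks}

    # S11/S12: per-task share of each resource; the largest one is the dominant resource.
    dominant = {}
    for name, fw in frameworks.items():
        shares = {res: Fraction(amount, cluster_total[res])
                  for res, amount in fw["demand"].items()}
        dominant[name] = max(shares, key=shares.get)

    def dominant_share(name):
        res = dominant[name]
        return Fraction(allocated[name][res], cluster_total[res])

    # S13/S14: serve the framework whose allocated dominant share is smallest.
    active = set(frameworks)
    while active:
        name = min(sorted(active), key=dominant_share)  # ties broken by name
        demand = frameworks[name]["demand"]
        satisfied = tasks[name] >= frameworks[name]["max_tasks"]
        fits = all(free[res] >= amount for res, amount in demand.items())
        if satisfied or not fits:
            active.remove(name)  # assumed removal rule, see lead-in
            continue
        for res, amount in demand.items():
            free[res] -= amount
            allocated[name][res] += amount
        tasks[name] += 1
    return tasks, free
```

With a hypothetical cluster of 9 CPUs and 18 GB of memory, a framework A whose tasks need 1 CPU and 4 GB, and a framework B whose tasks need 3 CPUs and 1 GB, the loop launches three tasks for A and two for B, so both end with a dominant-resource share of 2/3.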
It should be understood that the above specific embodiments are only used to illustrate or explain the principles of the invention and do not limit the invention. Therefore, any modification, equivalent substitution or improvement made without departing from the spirit and scope of the invention shall fall within the scope of protection of the invention. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or within equivalents of such scope and boundaries.
Claims (2)
1. A cluster resource configuration and scheduling method supporting multiple big data computing frameworks, characterized by comprising the following steps:
S01: add a corresponding computing framework scheduler for each computing framework and deploy it into the overall system; the master node collects all tradable computing resources of the compute nodes and pushes them to the computing framework schedulers, and the corresponding computing framework scheduler decides, by way of a contract transaction, whether to accept and use the resources;
S02: if the computing framework accepts the allocated resources, scheduling proceeds to the second layer: the computing framework's own distributed scheduler assigns computing tasks to the corresponding computing resources and notifies the master node, and the master node then notifies the corresponding compute nodes to start the corresponding computing framework executors to run the computing tasks;
S03: if the computing framework declines the currently allocated computing resources, the master node reallocates resources and continues to send resource transaction information to the computing frameworks;
S04: fine-grained fair allocation scheduling is performed across multiple computing resource types; the resource allocation of each computing framework is determined by that framework's dominant resource; among the shares of the various computing resources obtained by each computing framework, the share of the dominant resource should be the largest, and the dominant-resource shares obtained by the computing frameworks should be as equal as possible.
2. The cluster resource configuration and scheduling method supporting multiple big data computing frameworks according to claim 1, characterized in that step S04 comprises:
S11: query the registered computing framework schedulers for the computing resource vector required by a single computing task, and compute the share of each resource in the vector relative to the total resources of the cluster;
S12: sort the shares of all resources; the resource with the largest share is the dominant resource; if a new computing framework has registered, repeat step S11; otherwise continue;
S13: compute the share of the dominant resource already allocated to each computing framework, sort the frameworks by dominant-resource share, and allocate resources to the framework with the smallest share; when all the resources needed by that framework are satisfied, remove the framework and carry out the next round of allocation;
S14: repeat step S13 until all cluster computing resources have been allocated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511000709.2A CN106293933A (en) | 2015-12-29 | 2015-12-29 | A kind of cluster resource configuration supporting much data Computational frames and dispatching method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511000709.2A CN106293933A (en) | 2015-12-29 | 2015-12-29 | A kind of cluster resource configuration supporting much data Computational frames and dispatching method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106293933A true CN106293933A (en) | 2017-01-04 |
Family
ID=57650585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201511000709.2A Pending CN106293933A (en) | 2015-12-29 | 2015-12-29 | A kind of cluster resource configuration supporting much data Computational frames and dispatching method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106293933A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170104 |