CN106484528B

CN106484528B - Method and device for implementing dynamic cluster scaling in a distributed framework

Info

Publication number: CN106484528B
Application number: CN201610809555.XA
Authority: CN
Inventors: 周恺; 王倩; 肖远昊; 王家兴; 张发恩
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2016-09-07
Filing date: 2016-09-07
Publication date: 2019-08-27
Anticipated expiration: 2036-09-07
Also published as: CN106484528A

Abstract

The present invention provides a method and a device for implementing dynamic cluster scaling in a distributed framework. The method comprises: determining that a dynamic cluster scaling demand exists, wherein the cluster includes multiple slave nodes, and the slave nodes are divided into a computing resource group and a storage resource group according to the resource services they provide; and adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the dynamic cluster scaling demand. The technical solution provided by the invention allows the cluster scale and processing capacity to better match the demand for storage resources and computing resources in practical application scenarios, thereby effectively avoiding both insufficient cluster resources and wasted cluster resources, and improving cluster performance while improving the flexibility of dynamic cluster scaling.

Description

Method and device for implementing dynamic cluster scaling in a distributed framework

Technical field

The present invention relates to network technologies, and more particularly to a method for implementing dynamic cluster scaling in a distributed framework and a device for implementing dynamic cluster scaling in a distributed framework.

Background technique

In the field of distributed technology, Hadoop has become a widely used distributed framework because its low-level system details are transparent to users. Scaling a Hadoop-based cluster (i.e. a Hadoop cluster) typically refers to adjusting the cluster scale and processing capacity.

Currently, Hadoop cluster scaling is usually implemented as follows. When a new slave node needs to be added to the Hadoop cluster, the server hosting the slave node is configured first; then the services on every node in the Hadoop cluster are interrupted, every node in the Hadoop cluster is notified of the new slave node, and the services on every node are restarted, so that the new slave node joins the Hadoop cluster. When slave nodes need to be removed from the Hadoop cluster, the services on every node in the Hadoop cluster are interrupted, every node is notified of the slave nodes being removed, and the services on every node are restarted, so that the removed slave nodes exit the Hadoop cluster.

In the course of realizing the present invention, the inventors found that the existing way of adjusting cluster scale and processing capacity requires interrupting the services on every node of the Hadoop cluster, and that operations such as configuring servers require manual handling, so the implementation cost of cluster scaling is high and its degree of automation is low. In addition, the adjusted Hadoop cluster is difficult to match to the demand for storage resources and computing resources in practical application scenarios. For example, the scale and processing capacity of a Hadoop cluster are usually set according to the demand for cluster resources during peak periods, so cluster resources are inevitably wasted during off-peak periods.

Summary of the invention

The object of the present invention is to provide a method and a device for implementing dynamic cluster scaling in a distributed framework.

According to one aspect of the present invention, a method for implementing dynamic cluster scaling in a distributed framework is provided. The method mainly comprises the following steps: determining that a dynamic cluster scaling demand exists, wherein the cluster includes multiple slave nodes, and the slave nodes are divided into a computing resource group and a storage resource group according to the resource services they provide; and adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the dynamic cluster scaling demand.

According to another aspect of the present invention, a device for implementing dynamic cluster scaling in a distributed framework is provided, comprising: a device for determining that a dynamic cluster scaling demand exists, wherein the cluster includes multiple slave nodes, and the slave nodes are divided into a computing resource group and a storage resource group according to the resource services they provide; and a device for adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the dynamic cluster scaling demand.

Compared with the prior art, the present invention has the following advantages. The invention divides the slave nodes of a cluster into a computing resource group and a storage resource group according to the resource service each provides (i.e. computing resource service or storage resource service), so that the resource services provided by the slave nodes in the two groups differ. The slave nodes in the computing resource group can therefore be compute-heavy, storage-light nodes, and the slave nodes in the storage resource group can be storage-heavy, compute-light nodes. When the cluster is scaled, the invention can scale the computing resource group and/or the storage resource group in a targeted manner, which makes the adjustment of cluster scale and processing capacity more targeted and lets them better match the demand for storage resources and computing resources in practical application scenarios. For example, during peak periods the cluster can provide sufficient computing resources in time, and during off-peak periods the cluster can release surplus computing resources in time. The technical solution provided by the invention can thus effectively avoid both insufficient cluster resources and wasted cluster resources, improving cluster performance while improving the flexibility of dynamic cluster scaling.

Brief description of the drawings

Other features, objects and advantages of the present invention will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:

Fig. 1 is a flow chart of the method for implementing dynamic cluster scaling in a distributed framework according to embodiment one of the present invention;

Fig. 2 is a flow chart of a specific example of building a Hadoop cluster from virtual machines in the method for implementing dynamic cluster scaling in a distributed framework according to embodiment two of the present invention;

Fig. 3 is a flow chart of a specific example of scaling out the Hadoop cluster in the method for implementing dynamic cluster scaling in a distributed framework according to embodiment two of the present invention;

Fig. 4 is a flow chart of a specific example of scaling in the Hadoop cluster in the method for implementing dynamic cluster scaling in a distributed framework according to embodiment two of the present invention;

Fig. 5 is a flow chart of a specific example in which a remote control node executes the method for implementing dynamic cluster scaling in a distributed framework according to embodiment three of the present invention;

Fig. 6 is a schematic diagram of the device for implementing dynamic cluster scaling in a distributed framework according to embodiment four of the present invention;

Fig. 7 is a schematic diagram of a specific example of the demand determining device of embodiment four of the present invention;

Fig. 8 is a schematic diagram of a specific example of the resource group adjusting device of embodiment four of the present invention;

Fig. 9 is a schematic diagram of another specific example of the resource group adjusting device of embodiment four of the present invention;

Fig. 10 is a schematic diagram of a specific example of the registration device of embodiment four of the present invention;

Fig. 11 is a schematic diagram of another specific example of the resource group adjusting device of embodiment four of the present invention.

The same or similar reference numerals in the drawings denote the same or similar components.

Detailed description of the embodiments

Before the exemplary embodiments are discussed in greater detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flow charts. Although a flow chart describes the operations as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram and so on.

The "computer equipment" referred to in this context, also called a "computer", is an intelligent electronic device that can execute predetermined processes such as numerical calculation and/or logical calculation by running predetermined programs or instructions. It may include a processor and a memory, with the processor executing instructions pre-stored in the memory to carry out the predetermined process; or the predetermined process may be carried out by hardware such as an ASIC, FPGA or DSP; or by a combination of the two. Computer equipment includes but is not limited to servers, personal computers and laptops.

The computer equipment includes user equipment and network equipment. The user equipment includes but is not limited to computers, smart phones, PDAs, etc.; the network equipment includes but is not limited to a single network server, a server group consisting of multiple network servers, or a cloud based on cloud computing (Cloud Computing) consisting of a large number of computers or network servers, where cloud computing is a kind of distributed computing: a super virtual computer consisting of a set of loosely coupled computers. The computer equipment can run alone to realize the present invention, or can access a network and realize the present invention by interacting with other computer equipment in the network. The network in which the computer equipment is located includes but is not limited to the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, etc.

It should be noted that the user equipment, network equipment and networks above are only examples. Other existing or future computer equipment or networks, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.

The methods discussed below (some of which are illustrated by flow charts) can be implemented by hardware, software, firmware, middleware, microcode, hardware description languages or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments that perform the necessary tasks can be stored in a machine-readable or computer-readable medium (such as a storage medium). One or more processors can perform the necessary tasks.

The specific structural and functional details disclosed herein are merely representative and serve the purpose of describing exemplary embodiments of the present invention. The present invention can, however, be embodied in many alternative forms and should not be construed as limited only to the embodiments set forth herein.

It should be understood that although the terms "first", "second" and so on may be used herein to describe units, these units should not be limited by these terms. The terms are used only to distinguish one unit from another. For example, without departing from the scope of the exemplary embodiments, a first unit could be called a second unit, and similarly a second unit could be called a first unit. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It should be understood that when a unit is referred to as being "connected" or "coupled" to another unit, it can be directly connected or coupled to the other unit, or intermediate units may be present. In contrast, when a unit is referred to as being "directly connected" or "directly coupled" to another unit, no intermediate units are present. Other words used to describe the relationship between units should be interpreted in a similar manner (for example "between" versus "directly between", "adjacent to" versus "directly adjacent to", etc.).

The terminology used herein is for the purpose of describing specific embodiments only and is not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" as used herein are intended to include the plural as well. It should also be understood that the terms "includes" and/or "comprises" as used herein specify the presence of the stated features, integers, steps, operations, units and/or components, and do not preclude the presence or addition of one or more other features, integers, steps, operations, units, components and/or combinations thereof.

It should further be mentioned that in some alternative implementations, the functions or actions noted may occur out of the order indicated in the drawings. For example, depending on the functions or actions involved, two figures shown in succession may in fact be executed substantially simultaneously, or may sometimes be executed in the reverse order.

The present invention is described in further detail below with reference to the accompanying drawings.

For realizing the method for cluster dynamic retractility in embodiment one, Distributed Architecture.

Fig. 1 is a flow chart of the method for implementing dynamic cluster scaling in a distributed framework according to this embodiment. The method shown in Fig. 1 mainly includes step S100 and step S110. The method described in this embodiment is usually executed in computer equipment; preferably, it can be executed in servers, desktop computers and other network equipment, for example in a server, desktop computer or other network equipment remotely connected to the cluster. Each step in Fig. 1 is described below.

S100: determine that a dynamic cluster scaling demand exists.

The cluster in this embodiment can be a Hadoop cluster or a cluster based on another distributed framework. The cluster in this embodiment includes multiple slave nodes, i.e. Slave nodes. A slave node in this embodiment is defined relative to the master node of the cluster (i.e. the Master node); it does not refer to a slave node used for cluster data replicas. In addition, a slave node in this embodiment is usually implemented by a virtual machine or another form of logical device that is easy to create and easy to destroy.

As an example, all slave nodes in the cluster are divided into the computing resource group and the storage resource group; of course, this embodiment does not exclude the possibility that only some of the slave nodes in the cluster are divided into the two groups. Normally, the computing resource group should include at least one slave node (which may be called a computing slave node) and the storage resource group should also include at least one slave node (which may be called a storage slave node); that is, neither group is normally empty.

As an example, the basis for placing a slave node in the computing resource group or the storage resource group generally includes the resource service that the slave node provides to the cluster. For example, if a slave node provides a storage resource service (such as a distributed storage service) to the cluster, it can be placed in the storage resource group; if a slave node does not provide a storage resource service to the cluster but provides a computing resource service (such as a distributed computing service), it can be placed in the computing resource group.
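The grouping rule above can be sketched as follows. This is a minimal illustration only: the names `SlaveNode`, `ResourceGroup` and `assign_group` are hypothetical and do not come from the patent, and rejecting a node that offers neither service is an added assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceGroup(Enum):
    COMPUTE = "computing resource group"
    STORAGE = "storage resource group"


@dataclass(frozen=True)
class SlaveNode:
    name: str
    provides_storage: bool  # offers a distributed storage service to the cluster
    provides_compute: bool  # offers a distributed computing service to the cluster


def assign_group(node: SlaveNode) -> ResourceGroup:
    """Place a slave node in a group according to the service it provides."""
    # Any node offering a storage service belongs to the storage group,
    # whether or not it also offers a computing service.
    if node.provides_storage:
        return ResourceGroup.STORAGE
    # A node with no storage service but a computing service is a compute node.
    if node.provides_compute:
        return ResourceGroup.COMPUTE
    # Assumption: a node offering neither service is not a valid slave node.
    raise ValueError(f"{node.name} provides no resource service to the cluster")
```

Checking the storage service first mirrors the order of the two cases in the text, so a node that offers both services lands in the storage resource group.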

As an example, each slave node in the storage resource group usually has a disk with large storage space (such as a large-capacity hard disk), while the computing capability of its CPU (Central Processing Unit) is usually weak. Conversely, each slave node in the computing resource group usually has a CPU with strong computing capability, while the storage space of its disk is usually very small (for example, it is configured with a small-capacity hard disk).

This embodiment divides the slave nodes in the cluster into a computing resource group and a storage resource group, and the slave nodes in the two groups may differ considerably in their storage and computing resource configuration. This makes the adjustment of cluster scale and processing capacity more targeted, and lets the cluster scale and processing capacity better match the demand for storage resources and computing resources in practical application scenarios. For example, the cluster can be better suited to storage-heavy, compute-light scenarios, and equally to storage-light, compute-heavy scenarios.

As an example, the slave nodes in the storage resource group of this embodiment provide both a storage resource service and a computing resource service to the cluster, for example a distributed storage service and a distributed computing service. The slave nodes in the computing resource group of this embodiment provide only a computing resource service to the cluster, i.e. they provide a distributed computing service but no distributed storage service. Of course, this embodiment does not exclude the possibility that the slave nodes in the storage resource group provide only a storage resource service and no computing resource service. A slave node needs to occupy some computing resources when performing data read operations, but the amount it occupies is very limited; by having the slave nodes in the storage resource group provide both a storage resource service and a computing resource service, this embodiment makes full use of those nodes.

As an example, the dynamic cluster scaling demand in this embodiment may include one or more of: a demand to expand the slave nodes in the computing resource group, a demand to reduce the slave nodes in the computing resource group, a demand to expand the slave nodes in the storage resource group, and a demand to reduce the slave nodes in the storage resource group.

As an example, where the dynamic cluster scaling demand in this embodiment covers the four demands above, the demand determined to exist can be any one of the four. It can also consist of two demands at once, for example: a demand to expand the slave nodes in the computing resource group together with a demand to expand the slave nodes in the storage resource group; a demand to expand the slave nodes in the computing resource group together with a demand to reduce the slave nodes in the storage resource group; a demand to reduce the slave nodes in the computing resource group together with a demand to expand the slave nodes in the storage resource group; or a demand to reduce the slave nodes in the computing resource group together with a demand to reduce the slave nodes in the storage resource group.
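One way to represent these demands is as combinable flags. The sketch below is an illustration under assumptions: the names `ScalingDemand` and `is_valid_demand` are hypothetical, and the validity rule (at most one demand per group, never an expansion and a reduction of the same group) is a reading of the combinations listed above rather than a rule the patent states explicitly.

```python
from enum import Flag, auto


class ScalingDemand(Flag):
    EXPAND_COMPUTE = auto()
    REDUCE_COMPUTE = auto()
    EXPAND_STORAGE = auto()
    REDUCE_STORAGE = auto()


# Pairs that would ask for opposite adjustments of the same resource group.
_COMPUTE_CONFLICT = ScalingDemand.EXPAND_COMPUTE | ScalingDemand.REDUCE_COMPUTE
_STORAGE_CONFLICT = ScalingDemand.EXPAND_STORAGE | ScalingDemand.REDUCE_STORAGE


def is_valid_demand(demand: ScalingDemand) -> bool:
    """Return True for a single demand or one demand per resource group."""
    if not demand:  # an empty flag set means no scaling demand exists
        return False
    if demand & _COMPUTE_CONFLICT == _COMPUTE_CONFLICT:
        return False
    if demand & _STORAGE_CONFLICT == _STORAGE_CONFLICT:
        return False
    return True
```

With this representation, `ScalingDemand.EXPAND_COMPUTE | ScalingDemand.REDUCE_STORAGE` expresses the second combined case in the text.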

As an example, this embodiment can determine that a dynamic cluster scaling demand exists when the cluster has so many computing tasks that they pile up or compete harmfully with one another, and can likewise determine that a demand exists when the cluster's storage space is insufficient or at risk of running out. More specifically, this embodiment can determine that a dynamic cluster scaling demand exists according to acquired cluster performance information. The cluster performance information may include at least one of the computing resource utilization of the cluster and the storage resource utilization of the cluster; in general, it includes both. The computing resource utilization of the cluster is usually the ratio of the computing resources occupied by all computing tasks in the current cluster to the total computing resources that all slave nodes in the current cluster can provide, i.e. the percentage of the total available computing resources that the computing tasks occupy. The storage resource utilization of the cluster is usually the ratio of the storage space occupied by the data stored on all slave nodes in the current cluster to the total storage resources that all slave nodes in the current cluster can provide, i.e. the percentage of the total available storage resources that the stored data occupies. Of course, the cluster performance information may also include the unused computing resources in the cluster, the unoccupied storage resources in the cluster, and so on. This embodiment does not limit the specific form of the cluster performance information.
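The two utilization figures defined above are plain ratios. A minimal sketch, assuming resources are expressed in any consistent unit (CPU cores and terabytes are used here purely for illustration; the function name `utilization` is hypothetical):

```python
def utilization(occupied: float, total: float) -> float:
    """Return the occupied resource as a fraction of the total available."""
    if total <= 0:
        raise ValueError("total resource must be positive")
    if occupied < 0:
        raise ValueError("occupied resource cannot be negative")
    return occupied / total


# Illustrative figures only, not taken from the patent.
compute_utilization = utilization(occupied=120.0, total=160.0)  # CPU cores
storage_utilization = utilization(occupied=30.0, total=120.0)   # terabytes
```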

As an example, through the process by which each slave node joins the cluster, this embodiment can learn the total computing resources (such as the data processing capability of the CPU) and the total storage resources (such as the storage space of the disk) that each slave node can provide, and can therefore learn the total computing resources and total storage resources that all slave nodes of the cluster can provide.

As an example, where all slave nodes in the cluster are divided into the computing resource group and the storage resource group, all slave nodes in the computing resource group provide only a computing resource service, and all slave nodes in the storage resource group provide only a storage resource service, the computing resource utilization of the cluster obtained by this embodiment is in fact the computing resource utilization of all slave nodes in the computing resource group, and the storage resource utilization of the cluster obtained by this embodiment is in fact the storage resource utilization of all slave nodes in the storage resource group.

As an example, where all slave nodes in the cluster are divided into the computing resource group and the storage resource group, all slave nodes in the computing resource group provide only a computing resource service, and all slave nodes in the storage resource group provide both a storage resource service and a computing resource service, the computing resource utilization of the cluster obtained by this embodiment is in fact the computing resource utilization of all slave nodes in both the computing resource group and the storage resource group, and the storage resource utilization of the cluster obtained by this embodiment is in fact the storage resource utilization of all slave nodes in the storage resource group.
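The two cases differ only in which nodes contribute to the total computing resources used as the denominator. A sketch of that difference, using a hypothetical `Node` record and a hypothetical `storage_nodes_compute` switch (neither name appears in the patent):

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Node:
    group: str        # "compute" or "storage"
    cpu_cores: float  # computing resource the node can provide
    disk_tb: float    # storage resource the node can provide


def total_resources(nodes: Iterable[Node],
                    storage_nodes_compute: bool) -> Tuple[float, float]:
    """Return (total compute, total storage) used as utilization denominators."""
    nodes = list(nodes)
    # Compute nodes always count; storage nodes count only if they also
    # provide a computing resource service.
    total_cpu = sum(
        n.cpu_cores
        for n in nodes
        if n.group == "compute" or (storage_nodes_compute and n.group == "storage")
    )
    # Only storage nodes provide a storage resource service in either case.
    total_disk = sum(n.disk_tb for n in nodes if n.group == "storage")
    return total_cpu, total_disk
```

For two 16-core compute nodes and one 4-core, 20 TB storage node, the compute denominator is 32 cores in the first case and 36 cores in the second, while the storage denominator is 20 TB in both.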

As an example, this embodiment can obtain the computing resource utilization and the storage resource utilization of the cluster from the master node of the cluster. For example, this embodiment communicates remotely with the master node, sends it a request, and receives the current computing resource utilization and storage resource utilization of the cluster that the master node sends in response. As another example, the master node actively and periodically reports the current computing resource utilization and storage resource utilization of the cluster by remote communication. As a further example, when the master node detects that the computing resource utilization of the current cluster reaches a first threshold or falls below a second threshold, or that the storage resource utilization of the cluster reaches a third threshold or falls below a fourth threshold, it actively reports the current computing resource utilization and storage resource utilization of the cluster by remote communication.

As an example, this embodiment can obtain from the master node of the cluster the computing resources occupied by all computing tasks in the cluster and the storage space occupied by the data stored on all slave nodes in the cluster, and then calculate the computing resource utilization and the storage resource utilization of the cluster from the occupied computing resources and occupied storage space so obtained, together with the locally maintained total computing resources and total storage resources that all slave nodes in the cluster can provide.

In one specific example, this embodiment communicates remotely with the master node of the cluster, sends it a request, and receives the computing resources occupied by all computing tasks in the current cluster and the storage space occupied by the data stored on all slave nodes in the current cluster that the master node sends in response. It then calculates the ratio of the received occupied computing resources to the total computing resources provided by all slave nodes in the currently maintained computing resource group and storage resource group, which gives the computing resource utilization of the current cluster. Likewise, it calculates the ratio of the received occupied storage space to the total storage space provided by all slave nodes in the currently maintained storage resource group, which gives the storage resource utilization of the current cluster.

In another specific example, the master node actively and periodically reports, by remote communication, the computing resources occupied by all computing tasks in the current cluster and the storage space occupied by the data stored on all slave nodes in the current cluster. This embodiment then calculates the ratio of the reported occupied computing resources to the total computing resources provided by all slave nodes in the currently maintained computing resource group and storage resource group, which gives the computing resource utilization of the current cluster. Likewise, it calculates the ratio of the reported occupied storage space to the total storage space provided by all slave nodes in the currently maintained storage resource group, which gives the storage resource utilization of the current cluster.

In a further specific example, when the master node detects that the computing resources occupied by all computing tasks in the current cluster reach a preset processing threshold, or that the storage space occupied by the data stored on all slave nodes in the current cluster reaches a preset storage threshold, it reports the occupied computing resources and the occupied storage space. This embodiment then calculates the ratio of the reported occupied computing resources to the total computing resources provided by all slave nodes in the currently maintained computing resource group and storage resource group, which gives the computing resource utilization of the current cluster. Likewise, it calculates the ratio of the reported occupied storage space to the total storage space provided by all slave nodes in the currently maintained storage resource group, which gives the storage resource utilization of the current cluster.

As an example, this embodiment can use the computing resource utilization of the cluster to determine whether the cluster has a demand to expand or to reduce the slave nodes in the computing resource group, and can use the storage resource utilization of the cluster to determine whether the cluster has a demand to expand or to reduce the slave nodes in the storage resource group.

As an example, one specific way in which this embodiment uses the computing resource utilization of the cluster to determine whether such a demand exists is as follows: if the computing resource utilization of the cluster is judged to exceed the first threshold, a demand to expand the slave nodes in the computing resource group is determined to exist; if the computing resource utilization of the cluster is judged to be lower than the second threshold, a demand to reduce the slave nodes in the computing resource group is determined to exist. The first threshold is usually much greater than the second threshold.

As an example, one specific way in which this embodiment uses the storage resource utilization of the cluster to determine whether such a demand exists is as follows: if the storage resource utilization of the storage resource group is judged to exceed the third threshold, a demand to expand the slave nodes in the storage resource group is determined to exist; if the storage resource utilization of the storage resource group is judged to be lower than the fourth threshold, a demand to reduce the slave nodes in the storage resource group is determined to exist. The third threshold is usually much greater than the fourth threshold.
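The threshold checks for both groups can be sketched together. The default threshold values below are placeholders chosen for illustration; the patent only says that the first threshold is usually much greater than the second, and the third much greater than the fourth. The function name `detect_demands` and the string labels are hypothetical.

```python
from typing import List


def detect_demands(compute_util: float, storage_util: float,
                   first: float = 0.85, second: float = 0.30,
                   third: float = 0.80, fourth: float = 0.25) -> List[str]:
    """Map the two utilization figures to the scaling demands they trigger."""
    if not (first > second and third > fourth):
        raise ValueError("upper thresholds must exceed lower thresholds")
    demands = []
    # Computing resource group: first threshold expands, second reduces.
    if compute_util > first:
        demands.append("expand_compute")
    elif compute_util < second:
        demands.append("reduce_compute")
    # Storage resource group: third threshold expands, fourth reduces.
    if storage_util > third:
        demands.append("expand_storage")
    elif storage_util < fourth:
        demands.append("reduce_storage")
    return demands
```

Keeping a wide gap between the upper and lower threshold of each group leaves a band in which no demand is raised, which avoids expanding and reducing the same group in quick succession.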

As an example, the present embodiment may determine that a cluster dynamic scaling demand exists according to received resource adjustment control information; for example, the demand is determined to exist when resource adjustment control information sent by the master node is received, or when resource adjustment control information entered by a user is received.

As an example, the resource adjustment control information in the present embodiment may be any one of the following: control information for expanding the slave nodes in the computing resource group, control information for reducing the slave nodes in the computing resource group, control information for expanding the slave nodes in the storage resource group, and control information for reducing the slave nodes in the storage resource group. Alternatively, it may combine either of the two kinds of control information for the computing resource group (expand or reduce) with either of the two kinds of control information for the storage resource group (expand or reduce).
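The allowed forms of resource adjustment control information can be modelled as a small data structure. The following is a minimal sketch, not the patent's message format; the names `Action` and `ResourceAdjustment` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    """Direction of adjustment for one resource group (hypothetical name)."""
    EXPAND = "expand"
    REDUCE = "reduce"


@dataclass(frozen=True)
class ResourceAdjustment:
    """Control information: an action for one group, or one action per group."""
    compute: Optional[Action] = None   # action for the computing resource group
    storage: Optional[Action] = None   # action for the storage resource group

    def __post_init__(self) -> None:
        # Control information that adjusts neither group carries no demand.
        if self.compute is None and self.storage is None:
            raise ValueError("at least one resource group must be adjusted")
```

A message such as `ResourceAdjustment(compute=Action.EXPAND, storage=Action.REDUCE)` then corresponds to the combined form described above.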

S110, adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the cluster dynamic scaling demand.

As an example, the present embodiment may first determine the number of slave nodes that need to be adjusted in the computing resource group and/or the storage resource group, and then add or remove that number of slave nodes in the computing resource group and/or the storage resource group.

As an example, one specific way in which the present embodiment determines the number of slave nodes to be adjusted in the computing resource group is as follows. A target computing resource utilization of the cluster (that is, the ideal computing resource utilization of the cluster) is set in advance. When new slave nodes need to be added to the computing resource group, the total computing resources required to reach the target utilization are determined from the computing resources occupied by all computing tasks in the current cluster; the difference between the required total computing resources and the total computing resources provided by the current cluster is calculated; and the number of slave nodes to be added to the computing resource group is then determined from this difference and the computing resources that one newly added slave node can provide. As another specific example, when slave nodes in the computing resource group need to be reduced, the total computing resources required to reach the target utilization are determined in the same way from the computing resources occupied by all computing tasks in the current cluster; the difference between the required total computing resources and the total computing resources provided by the current cluster is calculated; and the number of slave nodes to be removed from the computing resource group is then determined from this difference and the computing resources that one slave node in the computing resource group can provide.

As an example, one specific way in which the present embodiment determines the number of slave nodes to be adjusted in the storage resource group is as follows. A target storage resource utilization of the cluster (that is, the ideal storage resource utilization of the cluster) is set in advance. When new slave nodes need to be added to the storage resource group, the total storage resources required to reach the target utilization are determined from the storage resources occupied by the data stored in the current cluster; the difference between the required total storage resources and the total storage resources provided by the current cluster is calculated; and the number of slave nodes to be added to the storage resource group is then determined from this difference and the storage resources that one newly added slave node can provide. As another specific example, when slave nodes in the storage resource group need to be reduced, the total storage resources required to reach the target utilization are determined in the same way from the storage resources occupied by all data stored in the current cluster; the difference between the required total storage resources and the total storage resources provided by the current cluster is calculated; and the number of slave nodes to be removed from the storage resource group is then determined from this difference and the storage resources that one slave node in the storage resource group can provide.
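The node-count calculation described for both resource groups can be illustrated with a short sketch. The text does not say how fractional results are rounded, so the rounding here is an assumption: expansion rounds up (so the target utilization is reached) and reduction rounds down (so capacity does not fall below what the target requires). The function name `nodes_to_adjust` and its parameters are hypothetical.

```python
import math


def nodes_to_adjust(used: float, provided: float,
                    target_utilization: float, per_node: float) -> int:
    """Return the number of slave nodes to add (positive) or remove (negative).

    used               -- resources currently occupied by tasks or stored data
    provided           -- total resources the current cluster provides
    target_utilization -- preset ideal utilization, 0 < value <= 1
    per_node           -- resources one slave node can provide
    """
    if not 0 < target_utilization <= 1 or per_node <= 0:
        raise ValueError("invalid target utilization or per-node capacity")
    # Total resources needed so that used / total equals the target utilization.
    required_total = used / target_utilization
    difference = required_total - provided
    if difference >= 0:
        return math.ceil(difference / per_node)      # assumption: round up
    return -math.floor(-difference / per_node)       # assumption: round down
```

For instance, with 90 units in use, 100 units provided, a target utilization of 0.75 and 8 units per node, the required total is 120 units, the difference is 20 units, and three nodes are added.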

It should be strongly noted that need for the slave number of nodes in computing resource group and storage resource group respectively into In the case that row adjusts, and the slave node in storage resource group is capable of providing storage resource and computing resource, the present embodiment is logical Chang Yingxian determines the quantity for the slave node for needing to adjust in storage resource group, then, then determines and needs to adjust in computing resource group The quantity of whole slave node, and during the quantity for the slave node for needing to adjust in determining computing resource group, usually take an examination Consider in storage resource group and needs influence of the computing resource provided by the slave node of expansion/reduction to total computing resource of cluster.

As an example, where the slave nodes in the computing resource group are implemented by virtual machines, one specific process by which the present embodiment expands the slave nodes in the computing resource group according to the cluster dynamic scaling demand is as follows. First, the number of slave nodes to be added to the computing resource group is determined. Then, that number of virtual machines is created, and each newly created virtual machine is registered in the cluster (for example, registration information including the host name of each newly created virtual machine is added to the cluster configuration file on the master node and on each slave node of the cluster). The distributed computing service of each successfully registered virtual machine is then started. Finally, each virtual machine is assigned to the computing resource group, for example by recording the relevant information of each newly added virtual machine (such as its computing resource configuration information) in the computing resource group information.

It should be strongly noted that the new virtual machine of operation, the creation of the above-mentioned determining slave number of nodes for needing to expand Distributed computing services in the new virtual machine registered operation, start successful registration of operation, new virtual machine in the cluster Operation and will not be in current cluster by the operation that is divided in computing resource group of virtual machine after the service that successfully starts up Each node performed by task have an impact, and the present embodiment is successfully starting up the distributed computing services in virtual machine Afterwards, which is formally come into operation as the slave node in computing resource group, and the host node in cluster can be according to it Current distribution of computation tasks strategy is that the virtual machine distributes corresponding calculating task, so that the present embodiment can be in computing resource Smoothly increase new slave node without breakpoint in group.It follows that the present embodiment can not interrupt the clothes in each node of cluster In the case where business, increase new slave node, in computing resource group so as to avoid service disruption institute during cluster is stretched Bring cluster stretches the higher problem of cost of implementation；In addition, the present embodiment is realized by using virtual machine in computing resource group Slave node, can with remote controlled manner easily in computing resource group slave node carry out additions and deletions, to improve reality The intelligence degree that existing cluster is stretched.In addition, computing resource provided by the different virtual machine of the present embodiment creation may be identical, There may be differences.

As an example, where the slave nodes in the storage resource group are implemented by virtual machines, one specific process by which the present embodiment expands the number of slave nodes in the storage resource group according to the cluster dynamic scaling demand is as follows. First, the number of slave nodes to be added to the storage resource group is determined. Then, that number of virtual machines is created, and each newly created virtual machine is registered in the cluster (for example, registration information including the host name of each newly created virtual machine is added to the cluster configuration information on the master node and on each slave node of the cluster). The distributed computing service and the distributed storage service of each successfully registered virtual machine are then started. Finally, each virtual machine is assigned to the storage resource group, for example by recording the relevant information of each newly added virtual machine (such as its computing resource and storage resource configuration information) in the storage resource group information.

It should be strongly noted that the new virtual machine of operation, the creation of the above-mentioned determining slave number of nodes for needing to expand Distributed storage service in the new virtual machine registered operation, start successful registration of operation, new virtual machine in the cluster Operation with distributed computing services and the operation that the virtual machine after the service that successfully starts up is divided in storage resource group are equal Task performed by each node in current cluster will not be had an impact, and the present embodiment is in having successfully started up virtual machine After distributed storage service and distributed computing services, the virtual machine be formally come into operation as in storage resource group from Node, the host node in cluster can be measured as according to its current store tasks allocation strategy and distribution of computation tasks this from Node distributes corresponding store tasks and calculating task, so that the present embodiment can be in storage resource group smoothly without the increasing of breakpoint Add new from node.It follows that being provided in the case that the present embodiment can not interrupt the service in each node of cluster in storage Increase new slave node in the group of source, stretches cost of implementation so as to avoid cluster brought by service disruption during cluster is stretched Higher problem；In addition, the present embodiment realizes the slave node in storage resource group by using virtual machine, it can be remotely to control Mode processed easily carries out additions and deletions to the slave node in storage resource group, to improve the intelligence degree for realizing that cluster is stretched. In addition, computing resource provided by the different virtual machine of the present embodiment creation may be identical, it is also possible to have differences.

As an example, one specific way in which the present embodiment registers a virtual machine in the cluster is as follows. Remote login to the newly created virtual machine is attempted repeatedly. After a remote login succeeds, password-free login permission is set up in the virtual machine for the remote login party, so that the slave node can be conveniently operated remotely later, for example when it is deleted. Registration information such as the host name of the virtual machine is then configured on the master node and on each slave node of the cluster, so that the master node and each slave node know that this slave node has been added to the cluster to which they belong.

As an example, where the slave nodes in the computing resource group are implemented by virtual machines, one specific process by which the present embodiment reduces the slave nodes in the computing resource group according to the cluster dynamic scaling demand is as follows. First, the number of slave nodes to be removed from the computing resource group is determined. Then, that number of slave nodes is selected from the computing resource group. For each selected slave node, the master node and each slave node of the cluster are notified to delete it from the cluster (for example, the master node and each slave node are controlled so that each deletes the registration information of the selected slave nodes from its cluster configuration file in the gap between executing two tasks). After the master node and all slave nodes have deleted the registration information of the selected slave nodes, the virtual machines implementing the selected slave nodes are deleted, and the relevant information of the deleted virtual machines (such as their configuration information) is deleted from the computing resource group information.

It should be strongly noted that since the present embodiment can execute the gap of two tasks to the note in node in node Volume information carries out delete processing, therefore, usually will not be to current cluster during slave node in Reduction Computation resource group In other each nodes performed by task have an impact, and the present embodiment will from node from cluster delete after, divide Dispensing should be from node and the calculating task that should be executed not successfully from node can usually be avoided by the disaster tolerance mechanism of cluster itself The task finally executes failure, so that the present embodiment can balance the slave node in the deletion computing resource group of no breakpoint.Thus It is found that the present embodiment can reduce the slave section in computing resource group in the case where the service in each node for not interrupting cluster Point.In addition, since the computing resource that the difference in computing resource group can be provided from node may be identical, it is also possible to which there are differences It is different, therefore, in the case that the computing resource that the difference in computing resource group can be provided from node has differences, needed determining During the slave node that the quantity for the slave node to be reduced and selection are contracted by, it is considered as in computing resource group respectively from node The computing resource that can be provided.

As an example, where the slave nodes in the storage resource group are implemented by virtual machines, one specific process by which the present embodiment reduces the number of slave nodes in the storage resource group according to the cluster dynamic scaling demand is as follows. First, the number of slave nodes to be removed from the storage resource group is determined. Then, that number of slave nodes is selected from the storage resource group. For each selected slave node, the master node and each slave node of the cluster are notified to delete it from the cluster (for example, the master node and each slave node are controlled so that each deletes the registration information of the selected slave nodes from its cluster configuration file in the gap between executing two tasks). After the master node and all slave nodes have deleted the registration information of the selected slave nodes, the virtual machines implementing the selected slave nodes are deleted, and the relevant information of the deleted virtual machines (such as their configuration information) is deleted from the storage resource group information.

It should be strongly noted that since the present embodiment can execute the gap of two tasks to the note in node in node Volume information carries out delete processing, therefore, usually will not be to current cluster during the slave node reduced in storage resource group In other each nodes performed by task have an impact, and the present embodiment will from node from cluster delete after, divide Dispensing should be from node and the calculating task that should be executed not successfully from node can usually be kept away by the disaster tolerance mechanism of cluster itself Exempt from the calculating task and finally execute failure, and is existing from the loss of data that should be stored from node caused by node due to deleting this As can usually be restored by the synchronizing process between the data copy of cluster；To which the present embodiment can balance no breakpoint Delete the slave node in computing resource group.It follows that the present embodiment can be in the service in each node for not interrupting cluster In the case of, by remotely controlling the slave node in reduction storage resource group.In addition, due to different from node in storage resource group The storage resource that can be provided may be identical, therefore, different from node institute in storage resource group also by there may be differences In the case that the storage resource that can be provided has differences, it is contracted by the quantity for the slave node that determining needs reduce and selection During node, it is considered as the storage resource that respectively can be provided from node in storage resource group.In addition, in reduction storage money During slave node in the group of source, it is considered as the influence to the computing resource of cluster.

As an example, when the storage resource group is being reduced, the number of slave nodes selected for removal in the present embodiment should generally be less than the difference between the number of all slave nodes in the storage resource group and the number of data copies in the cluster. For example, when the number of data copies in the cluster is 3, the number of slave nodes remaining in the storage resource group after the reduction should be not less than 3, so as to minimize the risk of data loss in the cluster.
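The limit on how many storage slave nodes may be removed can be written as a one-line bound. The sketch below is illustrative only; the function name and the `strict` flag are hypothetical. With `strict=True` it follows the general statement (removed count less than node count minus copy count); with `strict=False` it follows the worked example, which only requires that at least as many nodes remain as there are data copies.

```python
def max_removable_storage_nodes(node_count: int, replica_count: int,
                                strict: bool = True) -> int:
    """Upper bound on storage slave nodes that may be removed at once."""
    limit = node_count - replica_count
    if strict:
        limit -= 1  # removed < node_count - replica_count
    return max(limit, 0)
```

With 10 storage slave nodes and 3 data copies, the strict reading allows at most 6 removals (4 nodes remain), while the looser reading allows 7 (3 nodes remain).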

As an example, when the computing resource group is being reduced, the ratio of the total computing resources that the slave nodes selected for removal can provide to the total computing resources provided by all nodes in the cluster generally does not exceed a predetermined ratio (such as 5%), so as to minimize the computation delay caused by recomputation.
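One way to respect the predetermined ratio is to walk an ordered candidate list and stop before the removed capacity exceeds the budget. This is a minimal sketch, assuming the candidates are already ordered by removal preference; the function name and the tuple layout are hypothetical.

```python
def cap_compute_reduction(candidates, cluster_total: float,
                          max_ratio: float = 0.05):
    """Return the host names that can be removed within the ratio budget.

    candidates    -- list of (hostname, computing_resources) in removal order
    cluster_total -- total computing resources provided by all nodes
    max_ratio     -- predetermined ratio (the text gives 5% as an example)
    """
    budget = cluster_total * max_ratio
    chosen = []
    removed = 0.0
    for hostname, capacity in candidates:
        if removed + capacity > budget:
            break  # removing this node would exceed the predetermined ratio
        chosen.append(hostname)
        removed += capacity
    return chosen
```

With a cluster total of 400 units and three candidates of 8 units each, the 5% budget is 20 units, so only the first two candidates are removed in this round.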

Embodiment two, a method for realizing cluster dynamic scaling in a Hadoop cluster.

The Hadoop cluster in the present embodiment is built from virtual machines. One specific process for building the Hadoop cluster from virtual machines by remote control is shown in Fig. 2, and the method shown in Fig. 2 includes the following steps:

S200, creating a corresponding number of virtual machines based on the preset number of slave nodes included in the Hadoop cluster. When each virtual machine is created, the basic information assigned to it generally includes: a LAN IP address (such as 192.168.0.62), a login account of the virtual machine (such as root), and a login password of the virtual machine;

S210, after each virtual machine has been created successfully, repeatedly attempting remote login to each virtual machine to confirm that its network connection is available; and, after each virtual machine has been logged in to remotely, setting up password-free login permission for the remote login party in each virtual machine, to facilitate subsequent control of the dynamic scaling of the Hadoop cluster;

S220, assigning a Hostname (host name) to each virtual machine, and configuring the Hostname of each virtual machine in the corresponding configuration file (such as the /etc/hosts file) of all other virtual machines;

S230, dividing all virtual machines other than the master node of the Hadoop cluster into the storage resource group and the computing resource group, and, according to the hardware configuration of each virtual machine, setting the CPU information of the nodemanager in the yarn service of each virtual machine (that is, the computing resource information) and the storage space size in the hdfs service (that is, the storage resource information), so that the remote login party can conveniently adjust the number of slave nodes in the storage resource group and the computing resource group; in addition, the master node should also know the CPU information of the nodemanager in the yarn service and the storage space size in the hdfs service of each virtual machine, so that it can allocate computing tasks and storage tasks;

S240, managing the installation and running of components in each virtual machine by remote control, where the components include the hdfs service and the yarn service. Specifically, the present embodiment may start the Ambari service in each virtual machine by remote login, and install and start each Hadoop-based component in each virtual machine through the RESTful API, for example starting the hdfs service and the yarn service in each virtual machine of the storage resource group and starting the yarn service in each virtual machine of the computing resource group, so that the virtual machines become slave nodes of the Hadoop cluster.
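The host name configuration in step S220 amounts to adding one line per virtual machine to a hosts-style file. The sketch below works on the file content as a string so that it has no side effects; the function name `add_host_entry` and the host name `slave-1` are hypothetical, and writing the result back to /etc/hosts on each virtual machine is left out.

```python
def add_host_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts-file content with an 'ip hostname' line added if missing."""
    for line in hosts_text.splitlines():
        # Ignore comments; the first field is the address, the rest are names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return hosts_text  # already configured, nothing to do
    separator = "" if (hosts_text.endswith("\n") or not hosts_text) else "\n"
    return f"{hosts_text}{separator}{ip}\t{hostname}\n"
```

Because the function leaves the content unchanged when the host name is already present, it can be applied repeatedly to every virtual machine without producing duplicate lines.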

After the Hadoop cluster has been built successfully from virtual machines, if there is a demand to expand the capacity of the Hadoop cluster, one specific example of the operations performed is shown in Fig. 3, and the method in Fig. 3 includes the following steps:

S300, determining whether the computing resource group, the storage resource group, or both need new slave nodes. For example, when the storage space of the Hadoop cluster is insufficient, it may be determined that only the storage resource group needs new slave nodes; when the computing tasks of the Hadoop cluster are so numerous that they pile up or compete harmfully, it may be determined that only the computing resource group needs new slave nodes; and if both situations occur at the same time, it may be determined that both the storage resource group and the computing resource group need new slave nodes.

S310, determining the number of new slave nodes that the computing resource group and the storage resource group need, creating that number of new virtual machines, assigning a Hostname to each new virtual machine, and setting up password-free login permission for the remote control node (that is, the remote login party described above) in each new virtual machine. Because the virtual machines have not yet joined the Hadoop cluster at this point, this step has no effect on the tasks being executed by the slave nodes of the cluster.

S320, registering each new virtual machine with the Ambari service on all nodes of the Hadoop cluster (for example, setting registration information including the Hostname of each new virtual machine in the configuration file of the Ambari service), and starting the corresponding components on each new virtual machine (for example, starting the hdfs service and/or the yarn service in the virtual machine), so that the new virtual machines join the computing resource group or the storage resource group of the Hadoop cluster, thereby increasing the scale and processing capacity of the Hadoop cluster.

After the Hadoop cluster has been built successfully from virtual machines, if there is a demand to reduce the capacity of the Hadoop cluster, one specific example of the operations performed is shown in Fig. 4, and the method in Fig. 4 includes the following steps:

S400, determining whether the computing resource group, the storage resource group, or both need to have slave nodes removed. For example, when too much of the storage space of the Hadoop cluster is idle, it may be determined that only the storage resource group needs to have slave nodes removed; when the computing tasks in the Hadoop cluster are so few that too much of the computing resources is idle, it may be determined that only the computing resource group needs to have slave nodes removed; and if both situations occur at the same time, it may be determined that both the storage resource group and the computing resource group need to have slave nodes removed.

S410, determining the number of slave nodes that need to be removed from the computing resource group and the storage resource group, and selecting that number of slave nodes to be removed from the computing resource group and the storage resource group; for example, the slave nodes storing the least data may be selected for removal, or the slave nodes undertaking the fewest computing tasks may be selected for removal.

S420, without interrupting any service in the Hadoop cluster, having every node of the Hadoop cluster other than the slave nodes to be removed delete, in the gap between executing two successive tasks, the registration information of the slave nodes to be removed from the configuration file of the Ambari service (for example, deleting the Hostname of each slave node to be removed from the configuration file of the Ambari service). After every node has performed the deletion, the method proceeds to S430.

S430, deleting the virtual machines that implement the removed slave nodes.
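The selection rule given as an example in step S410 (least stored data, or fewest computing tasks) can be sketched with the standard library. The `SlaveNode` record and both function names are hypothetical; in a real cluster the figures would come from the master node's reports.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveNode:
    hostname: str
    stored_bytes: int    # amount of data currently stored on the node
    running_tasks: int   # number of computing tasks currently undertaken


def pick_by_least_data(nodes, count: int):
    """Storage resource group: prefer the nodes holding the least data."""
    return heapq.nsmallest(count, nodes, key=lambda n: n.stored_bytes)


def pick_by_fewest_tasks(nodes, count: int):
    """Computing resource group: prefer the nodes running the fewest tasks."""
    return heapq.nsmallest(count, nodes, key=lambda n: n.running_tasks)
```

Both functions return the selected nodes in ascending order of the chosen measure, so the first element is always the cheapest node to remove.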

For realizing the method for cluster dynamic retractility in embodiment three, Distributed Architecture.

The method for realizing cluster dynamic scaling in a distributed framework of the present embodiment is executed by a remote control node that communicates remotely with the cluster, and the flow of the method of this embodiment is shown in Fig. 5.

In Fig. 5, S500, the remote control node receives, as reported by the master node of the cluster, the computing resources A1 occupied by all computing tasks in the current cluster and the storage space A2 occupied by the data stored on all slave nodes in the current cluster.

S510, the remote control node obtains, from the computing resource group information and storage resource group information that it maintains locally, the total computing resources Z1 that all slave nodes in the current cluster can provide and the total storage resources Z2 that all slave nodes in the current cluster can provide, and calculates the ratio X1 of A1 to Z1 and the ratio X2 of A2 to Z2.

S520, the remote control node judges whether X1 exceeds the first threshold Y1, whether X1 is lower than the second threshold Y2, whether X2 exceeds the third threshold Y3, and whether X2 is lower than the fourth threshold Y4;

If X1 is judged to exceed the first threshold Y1 (such as 0.9), it is determined that new slave nodes need to be added to the computing resource group, and the method proceeds to step S531;

If X1 is judged to be lower than the second threshold Y2 (such as 0.4), it is determined that existing slave nodes need to be removed from the computing resource group, and the method proceeds to step S532;

If X2 is judged to exceed the third threshold Y3 (such as 0.8), it is determined that new slave nodes need to be added to the storage resource group, and the method proceeds to step S533;

If X2 is judged to be lower than the fourth threshold Y4 (such as 0.5), it is determined that existing slave nodes need to be removed from the storage resource group, and the method proceeds to step S534.
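The judgment in steps S510 and S520 can be sketched as a small function that maps the reported figures to the step labels that follow. The default thresholds are the example values given in the text (0.9, 0.4, 0.8 and 0.5); the function name `next_steps` is hypothetical, and the sketch assumes Z1 and Z2 are positive.

```python
def next_steps(a1: float, z1: float, a2: float, z2: float,
               y1: float = 0.9, y2: float = 0.4,
               y3: float = 0.8, y4: float = 0.5):
    """Return the step labels that the remote control node proceeds to."""
    x1 = a1 / z1  # computing resource utilization (A1 over Z1)
    x2 = a2 / z2  # storage resource utilization (A2 over Z2)
    steps = []
    if x1 > y1:
        steps.append("S531")  # add slave nodes to the computing resource group
    elif x1 < y2:
        steps.append("S532")  # remove slave nodes from the computing resource group
    if x2 > y3:
        steps.append("S533")  # add slave nodes to the storage resource group
    elif x2 < y4:
        steps.append("S534")  # remove slave nodes from the storage resource group
    return steps
```

Utilizations that fall between the lower and upper threshold of a group produce no step for that group, which is what keeps the cluster from oscillating between expansion and reduction.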

S531, the remote control node determines the number of new slave nodes to be added, creates that number of virtual machines, assigns basic information (such as an IP address and a host name) to each virtual machine, and registers each virtual machine in the cluster, for example by setting the registration information of each virtual machine in the configuration files of all nodes of the cluster.

S541, the remote control node controls each virtual machine to start its distributed computing service, so that the virtual machine goes into use as a computing slave node of the cluster; the remote control node assigns the slave nodes implemented by the virtual machines to the computing resource group, for example by recording the computing resource configuration information of each newly added slave node in the locally maintained computing resource group information.

S532, the remote control node determines the number of slave nodes to be removed, selects from the computing resource group that number of slave nodes with the lightest computing load, and deletes the registration information of the slave nodes to be removed from each node of the cluster.

S542, the remote control node deletes each virtual machine that implements a slave node to be removed, and deletes those slave nodes from the computing resource group, for example by deleting the computing resource configuration information of each slave node to be removed from the locally maintained computing resource group information.

S533, the remote control node determines the number of new slave nodes to be added, creates that number of virtual machines, assigns basic information (such as an IP address and a host name) to each virtual machine, and registers each virtual machine in the cluster, for example by setting the registration information of each virtual machine in the configuration files of all nodes of the cluster.

S543, the remote control node controls each virtual machine to start its distributed computing service and distributed storage service, so that the virtual machine goes into use as a storage slave node of the cluster; the remote control node assigns the slave nodes implemented by the virtual machines to the storage resource group, for example by recording the computing resource and storage resource configuration information of each newly added slave node in the locally maintained storage resource group information.

S534, the remote control node determines the number of slave nodes to be removed, selects from the storage resource group that number of slave nodes with the lightest storage load, and deletes the registration information of the slave nodes to be removed from each node of the cluster.

S544, the remote control node deletes each virtual machine that implements a slave node to be removed, and deletes those slave nodes from the storage resource group, for example by deleting the computing resource and storage resource configuration information of each slave node to be removed from the locally maintained storage resource group information.

Embodiment 4: a device for implementing dynamic cluster scaling in a distributed architecture.

The device for implementing dynamic cluster scaling in a distributed architecture according to this embodiment is usually installed in a computer device; preferably, it may be installed in a server, a desktop computer, or another network device. In addition, the computer device hosting the device can communicate remotely with the cluster. The cluster in this embodiment may be a Hadoop cluster or a cluster based on another distributed architecture. The slave nodes of the cluster are divided into a computing resource group and a storage resource group; the basis for this division is as described in Embodiment 1 above and is not repeated here.

The main structure of the device for implementing dynamic cluster scaling in a distributed architecture according to this embodiment is shown in Fig. 6. The device is described below with reference to specific embodiments.

In Fig. 6, the device of this embodiment mainly comprises: a device for determining that a dynamic cluster scaling demand exists (hereinafter the "demand determination device 600"), and a device for adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the dynamic cluster scaling demand (hereinafter the "resource group adjustment device 610").

The demand determination device 600 is mainly used to determine that a dynamic cluster scaling demand exists. The demand so determined may include one or more of the following: a demand to expand the slave nodes in the computing resource group, a demand to reduce the slave nodes in the computing resource group, a demand to expand the slave nodes in the storage resource group, and a demand to reduce the slave nodes in the storage resource group.

As an example, the demand determination device 600 may determine that a dynamic cluster scaling demand exists when the cluster has so many computing tasks that they pile up or compete harmfully for resources; it may likewise determine that such a demand exists when the storage space of the cluster is insufficient or at risk of running out.

As an example, the demand determination device 600 may include: a device for determining, according to acquired cluster performance information, that a dynamic cluster scaling demand exists (hereinafter the "first demand determination device 601"), and a device for determining, according to received resource adjustment control information, that a dynamic cluster scaling demand exists (hereinafter the "second demand determination device 602"), as shown in Fig. 7.

The cluster performance information acquired by the first demand determination device 601 may include at least one of the computing resource utilization of the cluster and the storage resource utilization of the cluster; typically it includes both. The first demand determination device 601 may obtain the computing resource utilization and the storage resource utilization from the master node of the cluster; the detailed process is as described in Embodiment 1 above and is not repeated here.

As an example, the first demand determination device 601 may use the computing resource utilization of the cluster to determine whether a demand exists to expand or to reduce the slave nodes in the computing resource group, and may use the storage resource utilization of the cluster to determine whether a demand exists to expand or to reduce the slave nodes in the storage resource group.

As a specific example of how the first demand determination device 601 uses the computing resource utilization of the cluster: when it judges that the computing resource utilization of the cluster exceeds a first threshold, it determines that a demand exists to expand the slave nodes in the computing resource group; when it judges that the computing resource utilization of the cluster is below a second threshold, it determines that a demand exists to reduce the slave nodes in the computing resource group. The first threshold is typically much greater than the second threshold.

As a specific example of how the first demand determination device 601 uses the storage resource utilization of the cluster: when it judges that the storage resource utilization of the storage resource group exceeds a third threshold, it determines that a demand exists to expand the slave nodes in the storage resource group; when it judges that the storage resource utilization of the storage resource group is below a fourth threshold, it determines that a demand exists to reduce the slave nodes in the storage resource group. The third threshold is typically much greater than the fourth threshold.
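The threshold comparisons described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the names `Demand` and `determine_demands` are hypothetical, and the thresholds are passed in by the caller rather than fixed.

```python
from enum import Enum


class Demand(Enum):
    """Hypothetical labels for the four scaling demands named in the text."""
    EXPAND_COMPUTE = "expand computing resource group"
    REDUCE_COMPUTE = "reduce computing resource group"
    EXPAND_STORAGE = "expand storage resource group"
    REDUCE_STORAGE = "reduce storage resource group"


def determine_demands(compute_util, storage_util, y1, y2, y3, y4):
    """Return the demands implied by the two utilization rates.

    y1 and y2 are the first and second thresholds (computing resources);
    y3 and y4 are the third and fourth thresholds (storage resources).
    """
    if not (y1 > y2 and y3 > y4):
        raise ValueError("upper thresholds must exceed lower thresholds")
    demands = []
    if compute_util > y1:
        demands.append(Demand.EXPAND_COMPUTE)
    elif compute_util < y2:
        demands.append(Demand.REDUCE_COMPUTE)
    if storage_util > y3:
        demands.append(Demand.EXPAND_STORAGE)
    elif storage_util < y4:
        demands.append(Demand.REDUCE_STORAGE)
    return demands
```

Keeping the upper threshold well above the lower one leaves a band in which no demand is raised, which avoids repeatedly expanding and reducing the same group when utilization hovers near a single cut-off.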

As an example, the second demand determination device 602 may determine from received resource adjustment control information that a dynamic cluster scaling demand exists. For instance, it determines that such a demand exists upon receiving resource adjustment control information sent by the master node, or upon receiving resource adjustment control information entered by a user.

As an example, the resource adjustment control information received by the second demand determination device 602 may be any one of: control information to expand the slave nodes in the computing resource group, control information to reduce the slave nodes in the computing resource group, control information to expand the slave nodes in the storage resource group, and control information to reduce the slave nodes in the storage resource group. Alternatively, it may combine either of the two computing resource group items (expand or reduce) with either of the two storage resource group items (expand or reduce).
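The permitted combinations of control information can be expressed as a small validity check. The sketch below assumes a hypothetical encoding in which each item is a `(group, direction)` pair; the text does not prescribe any message format.

```python
VALID_GROUPS = {"compute", "storage"}
VALID_DIRECTIONS = {"expand", "reduce"}


def is_valid_control_info(actions):
    """Check (group, direction) pairs against the combinations in the text.

    Valid input is either a single action, or one action for the computing
    resource group together with one action for the storage resource group.
    """
    actions = list(actions)
    if not 1 <= len(actions) <= 2:
        return False
    for group, direction in actions:
        if group not in VALID_GROUPS or direction not in VALID_DIRECTIONS:
            return False
    groups = [group for group, _ in actions]
    # Two actions on the same group (e.g. expand and reduce) are rejected.
    return len(set(groups)) == len(groups)
```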

The resource group adjustment device 610 is mainly used to adjust the number of slave nodes in the computing resource group and/or the storage resource group according to the dynamic cluster scaling demand.

As an example, the resource group adjustment device 610 may first determine the number of slave nodes to be adjusted in the computing resource group and/or the storage resource group, and then add or remove that number of slave nodes in the computing resource group and/or the storage resource group.

As an example, the resource group adjustment device 610 of this embodiment may optionally include: a device for determining the number of slave nodes by which the computing resource group needs to be expanded (hereinafter the "first quantity determination device 611"), a device for creating that number of virtual machines (hereinafter the "virtual machine creation device 612"), a device for configuring the registration information of the virtual machines in the cluster (hereinafter the "registration device 613"), a device for starting the distributed computing service of the virtual machines (hereinafter the "first service starting device 614"), and a device for making the virtual machines slave nodes in the computing resource group (hereinafter the "first maintenance device 615"), as shown in Fig. 8. That is, where the slave nodes in the computing resource group are implemented by virtual machines, a specific example of how the resource group adjustment device 610 expands the slave nodes in the computing resource group is as follows. First, the first quantity determination device 611 determines the number of slave nodes to be added to the computing resource group. The virtual machine creation device 612 then creates the same number of virtual machines. The registration device 613 registers each newly created virtual machine in the cluster (for example, by adding registration information, including the host name of the newly created virtual machine, to the cluster configuration files on the master node and on each slave node). The first service starting device 614 starts the distributed computing service of each successfully registered virtual machine. Finally, the first maintenance device 615 assigns each virtual machine to the computing resource group, for example by recording the relevant information of each newly added virtual machine (configuration information such as its computing resource information) in the computing resource group information.

A preferred computing resource utilization for the cluster is preset in the first quantity determination device 611. When new slave nodes need to be added to the computing resource group, the first quantity determination device 611 can determine, from the computing resources occupied by all computing tasks in the current cluster, the total computing resources required to reach the preferred computing resource utilization, and calculate the difference between the required total computing resources and the total computing resources provided by the current cluster. It then determines the number of slave nodes to be added to the computing resource group from that difference and the computing resources that one newly added slave node can provide. As another specific example, when slave nodes in the computing resource group need to be removed, the first quantity determination device 611 can determine, from the computing resources occupied by all computing tasks in the current cluster, the total computing resources required to reach the preferred computing resource utilization, and calculate the difference between the required total computing resources and the total computing resources provided by the current cluster. It then determines the number of slave nodes to be removed from the computing resource group from that difference and the computing resources that one slave node in the computing resource group can provide.
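The quantity calculation described above can be sketched as follows. The function names and the rounding rules are assumptions: the text only says the quantity is determined from the difference and the per-node capacity, so this sketch rounds up when adding (to reach the preferred utilization) and rounds down when removing (to avoid exceeding it).

```python
import math


def nodes_to_add(occupied, current_total, preferred_util, per_node_capacity):
    """Number of new slave nodes needed to bring utilization down to the preferred rate."""
    required_total = occupied / preferred_util   # total resources needed at the preferred rate
    shortfall = required_total - current_total   # difference against what the cluster provides
    if shortfall <= 0:
        return 0
    return math.ceil(shortfall / per_node_capacity)   # rounding up is an assumption


def nodes_to_remove(occupied, current_total, preferred_util, per_node_capacity):
    """Number of slave nodes that can be removed without exceeding the preferred rate."""
    required_total = occupied / preferred_util
    surplus = current_total - required_total
    if surplus <= 0:
        return 0
    return math.floor(surplus / per_node_capacity)    # rounding down is an assumption
```

For example, with 95 units occupied out of 100, a preferred utilization of 0.7, and 10 units per new node, the required total is about 135.7 units, the shortfall is about 35.7 units, and four nodes would be added. The same arithmetic applies to storage resources if the storage figures are substituted.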

As another example, the resource group adjustment device 610 of this embodiment may optionally include: a device for determining the number of slave nodes by which the storage resource group needs to be expanded (hereinafter the "second quantity determination device 616"), a device for creating that number of virtual machines (hereinafter the "virtual machine creation device 612"), a device for configuring the registration information of the virtual machines in the cluster (hereinafter the "registration device 613"), a device for starting the distributed computing service and the distributed storage service of the virtual machines (hereinafter the "second service starting device 617"), and a device for making the virtual machines slave nodes in the storage resource group (hereinafter the "second maintenance device 618"), as shown in Fig. 9. That is, where the slave nodes in the storage resource group are implemented by virtual machines, a specific example of how the resource group adjustment device 610 expands the slave nodes in the storage resource group is as follows. First, the second quantity determination device 616 determines the number of slave nodes to be added to the storage resource group. The virtual machine creation device 612 then creates the same number of virtual machines. The registration device 613 registers each newly created virtual machine in the cluster (for example, by adding registration information, including the host name of the newly created virtual machine, to the cluster configuration information on the master node and on each slave node). The second service starting device 617 starts the distributed computing service and the distributed storage service of each successfully registered virtual machine. Finally, the second maintenance device 618 assigns each virtual machine to the storage resource group, for example by recording the relevant information of each newly added virtual machine (configuration information such as its computing resource information and storage resource information) in the storage resource group information.

As an example, a preferred storage resource utilization for the cluster is preset in the second quantity determination device 616. When new slave nodes need to be added to the storage resource group, the second quantity determination device 616 can determine, from the storage resources occupied by the data stored in the current cluster, the total storage resources required to reach the preferred storage resource utilization, calculate the difference between the required total storage resources and the total storage resources provided by the current cluster, and determine the number of slave nodes to be added to the storage resource group from that difference and the storage resources that one newly added slave node can provide. As another specific example, when slave nodes in the storage resource group need to be removed, the second quantity determination device 616 can determine, from the storage resources occupied by all stored data in the current cluster, the total storage resources required to reach the preferred storage resource utilization, and calculate the difference between the required total storage resources and the total storage resources provided by the current cluster. It then determines the number of slave nodes to be removed from the storage resource group from that difference and the storage resources that one slave node in the storage resource group can provide.

As an example, the registration device 613 may optionally include: a device for remotely logging in to a virtual machine (hereinafter the "remote login device 6131"), a device for setting, in the virtual machine, password-free login permission for the remote login party (hereinafter the "permission setting device 6132"), and a device for configuring the registration information of the virtual machine, including its host name, on the master node and each slave node of the cluster (hereinafter the "information configuration device 6133"), as shown in Fig. 10. That is, a specific example of how the registration device 613 registers a virtual machine in the cluster is as follows. The remote login device 6131 repeatedly attempts to log in remotely to the newly created virtual machine. After the remote login device 6131 has logged in successfully, the permission setting device 6132 sets, in the virtual machine, password-free login permission for the party that performed this remote login, so that the slave node can be operated remotely with ease when it is later deleted. The information configuration device 6133 then configures the registration information of the virtual machine, such as its host name, on the master node and each slave node of the cluster, so that the master node and each slave node know that this slave node has been added to their cluster.
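The three registration steps can be sketched as a retry loop followed by two configuration actions. Everything below is illustrative: `try_login` and `grant_passwordless` are hypothetical callables standing in for the actual remote-login mechanism, the bound `max_attempts` is an assumption (the text only says login is attempted repeatedly), and the cluster configuration is modelled as an in-memory host list per node.

```python
import time


def register_vm(hostname, try_login, grant_passwordless, cluster_configs,
                max_attempts=30, wait_seconds=1.0):
    """Log in to a new VM, set password-free login, and add it to every node's host list."""
    for _ in range(max_attempts):
        session = try_login(hostname)      # hypothetical: returns a session, or None while the VM boots
        if session is not None:
            break
        time.sleep(wait_seconds)
    else:
        raise TimeoutError(f"could not log in to {hostname}")

    grant_passwordless(session)            # hypothetical: authorizes the login party on the VM

    # Add the host name to the configuration held by the master node and each slave node.
    for hosts in cluster_configs.values():
        if hostname not in hosts:
            hosts.append(hostname)
```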

As another example, the resource group adjustment device 610 of this embodiment may optionally include: a device for determining the number of slave nodes to be removed from the computing resource group and/or the storage resource group (hereinafter the "third quantity determination device 619"); a device for selecting that number of slave nodes to be removed from the corresponding group (hereinafter the "node selection device 620"); a device for deleting, from any node in the cluster that is configured with the registration information of a slave node to be removed, that registration information in the gap between the node's execution of two tasks (hereinafter the "registration information deletion device 621"); and a device for deleting the virtual machine that implements the removed slave node, and removing that slave node from the computing resource group and/or the storage resource group, once all nodes in the cluster have deleted its registration information (hereinafter the "third maintenance device 622"), as shown in Fig. 11.

As an example, where the slave nodes in the computing resource group are implemented by virtual machines, a specific example of how the resource group adjustment device 610 reduces the slave nodes in the computing resource group is as follows. First, the third quantity determination device 619 determines the number of slave nodes to be removed from the computing resource group. The node selection device 620 then selects that number of slave nodes from the computing resource group. For each selected slave node, the registration information deletion device 621 notifies the master node and each slave node in the cluster to delete the selected slave node from the cluster (for example, the registration information deletion device 621 directs the master node and each slave node to delete the registration information of each selected slave node from its own cluster configuration file in the gap between executing two tasks). After the master node and all slave nodes have deleted the registration information of the selected slave nodes, the third maintenance device 622 deletes the virtual machines that implement the selected slave nodes and deletes the relevant information of the deleted virtual machines (such as their configuration information) from the computing resource group information.

As an example, where the slave nodes in the storage resource group are implemented by virtual machines, a specific example of how the resource group adjustment device 610 reduces the slave nodes in the storage resource group is as follows. First, the third quantity determination device 619 determines the number of slave nodes to be removed from the storage resource group. The node selection device 620 then selects that number of slave nodes from the storage resource group. For each selected slave node, the registration information deletion device 621 notifies the master node and each slave node in the cluster to delete the selected slave node from the cluster (for example, the registration information deletion device 621 directs the master node and each slave node to delete the registration information of each selected slave node from its own cluster configuration file in the gap between executing two tasks). After the master node and all slave nodes have deleted the registration information of the selected slave nodes, the third maintenance device 622 deletes the virtual machines that implement the selected slave nodes and deletes the relevant information of the deleted virtual machines (such as their configuration information) from the storage resource group information.
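The reduction sequence can be sketched with in-memory data. The function names are hypothetical, `delete_vm` is a placeholder callable, and choosing the most lightly loaded nodes follows the selection rule given elsewhere in this description rather than anything required by this paragraph. The ordering is the point: a virtual machine is deleted only after no node still holds its registration information.

```python
def select_lightest(load_by_node, count):
    """Return the `count` node names with the smallest load (ties keep dict order)."""
    return sorted(load_by_node, key=load_by_node.get)[:count]


def remove_slave_node(victim, cluster_configs, group_info, delete_vm):
    """Deregister `victim` everywhere, then delete its VM and its group record."""
    # Each node drops the victim from its own configuration. In the text a node
    # does this in the gap between two tasks; here that is modelled as a plain loop.
    for hosts in cluster_configs.values():
        if victim in hosts:
            hosts.remove(victim)
    # Only when no node still lists the victim is its virtual machine deleted.
    if any(victim in hosts for hosts in cluster_configs.values()):
        raise RuntimeError("registration still present on some node")
    delete_vm(victim)                  # hypothetical: destroys the virtual machine
    group_info.pop(victim, None)       # drop it from the locally maintained group information
```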

As an example, when the resource group adjustment device 610 reduces the storage resource group, the number of slave nodes selected for removal by the node selection device 620 should usually be less than the difference between the total number of slave nodes in the storage resource group and the number of data replicas in the cluster. For example, if the number of data replicas in the cluster is 3, then after the resource group adjustment device 610 has reduced the slave nodes in the storage resource group, the number of slave nodes remaining in the storage resource group should be no less than 3, so as to reduce the risk of data loss in the cluster as far as possible.

As an example, when the resource group adjustment device 610 reduces the computing resource group, the ratio of the total computing resources that the slave nodes selected for removal by the node selection device 620 can provide to the total computing resources provided by all nodes in the cluster usually does not exceed a predetermined ratio (such as 5%), so as to reduce as far as possible the computation delay caused by recomputation.
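The two limits on reduction can be sketched as follows. The function names are hypothetical, and taking candidates in the given order until the limit is reached is an assumption; the text states only the limits themselves.

```python
def max_storage_removals(storage_group_size, replica_count):
    """Largest removal count strictly below (group size - replica count)."""
    return max(0, storage_group_size - replica_count - 1)


def cap_compute_removals(candidates, capacity_by_node, cluster_total, max_ratio=0.05):
    """Keep candidates, in order, while their summed capacity stays within
    max_ratio of the cluster's total computing resources."""
    kept, removed_capacity = [], 0.0
    for node in candidates:
        new_total = removed_capacity + capacity_by_node[node]
        if new_total > max_ratio * cluster_total:
            break
        kept.append(node)
        removed_capacity = new_total
    return kept
```

With a storage resource group of 6 slave nodes and 3 replicas, at most 2 nodes would be removed in one pass, leaving 4, which satisfies the requirement that no fewer than 3 remain.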

Embodiment 5: a specific application of the device for implementing dynamic cluster scaling in a distributed architecture.

First, the demand determination device 600 in the remote control node receives, as reported by the master node of the cluster, the computing resources A1 occupied by all computing tasks in the current cluster and the storage space A2 occupied by the data stored on all slave nodes in the current cluster.

Second, the demand determination device 600 obtains, from the computing resource group information and storage resource group information maintained locally by the remote control node, the total computing resources Z1 that all slave nodes in the current cluster can provide and the total storage resources Z2 that all slave nodes in the current cluster can provide, and calculates the ratio X1 of A1 to Z1 and the ratio X2 of A2 to Z2.
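The calculation of X1 and X2 from the locally maintained group information can be sketched as below. The dictionary layout is an assumption made for illustration; the sketch relies on the description that slave nodes in the storage resource group provide both computing and storage resources, while slave nodes in the computing resource group provide computing resources only.

```python
def utilization_ratios(a1, a2, compute_group_info, storage_group_info):
    """Return (X1, X2) = (A1 / Z1, A2 / Z2).

    Each group-info dict maps a node name to its capacities, e.g.
    {"compute": 8} or {"compute": 4, "storage": 100} (hypothetical layout).
    """
    # Z1: computing resources of all slave nodes, in both groups.
    z1 = (sum(node["compute"] for node in compute_group_info.values())
          + sum(node["compute"] for node in storage_group_info.values()))
    # Z2: storage resources, which only storage-group slave nodes provide.
    z2 = sum(node["storage"] for node in storage_group_info.values())
    return a1 / z1, a2 / z2
```

For instance, two computing slave nodes of 8 units each and two storage slave nodes of 4 computing units and 100 storage units each give Z1 = 24 and Z2 = 200; with A1 = 18 and A2 = 50 the ratios are X1 = 0.75 and X2 = 0.25.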

The demand determination device 600 judges whether X1 exceeds the first threshold Y1, whether X1 is below the second threshold Y2, whether X2 exceeds the third threshold Y3, and whether X2 is below the fourth threshold Y4.

If the demand determination device 600 judges that X1 exceeds the first threshold Y1 (such as 0.9), it determines that new slave nodes need to be added to the computing resource group. The resource group adjustment device 610 in the remote control node determines the number of new slave nodes to add and creates that number of virtual machines. It assigns basic information (such as an IP address and a host name) to each virtual machine and registers each virtual machine in the cluster, for example by writing the registration information of each virtual machine into the configuration files of all nodes in the cluster. The resource group adjustment device 610 directs each virtual machine to start its distributed computing service, so that the virtual machine goes into service as a computing slave node of the cluster. It then assigns the slave nodes implemented by the virtual machines to the computing resource group, for example by recording the configuration information of each newly added slave node, such as its computing resource information, in the computing resource group information maintained locally by the remote control node.

If the demand determination device 600 judges that X1 is below the second threshold Y2 (such as 0.4), it determines that existing slave nodes need to be removed from the computing resource group. The resource group adjustment device 610 determines the number of slave nodes to remove and selects from the computing resource group that number of slave nodes with the lightest computing tasks. It deletes the registration information of the slave nodes to be removed from every node in the cluster. It then deletes each virtual machine that implements a slave node to be removed and deletes those slave nodes from the computing resource group, for example by deleting the configuration information of each slave node to be removed, such as its computing resource information, from the locally maintained computing resource group information.

If the demand determination device 600 judges that X2 exceeds the third threshold Y3 (such as 0.8), it determines that new slave nodes need to be added to the storage resource group. The resource group adjustment device 610 determines the number of new slave nodes to add and creates that number of virtual machines. It assigns basic information (such as an IP address and a host name) to each virtual machine and registers each virtual machine in the cluster, for example by writing the registration information of each virtual machine into the configuration files of all nodes in the cluster. The resource group adjustment device 610 directs each virtual machine to start its distributed computing service and distributed storage service, so that the virtual machine goes into service as a storage slave node of the cluster. It then assigns the slave nodes implemented by the virtual machines to the storage resource group, for example by recording the configuration information of each newly added slave node, such as its computing resource information and storage resource information, in the locally maintained storage resource group information.

If the demand determination device 600 judges that X2 is below the fourth threshold Y4 (such as 0.5), it determines that existing slave nodes need to be removed from the storage resource group. The resource group adjustment device 610 determines the number of slave nodes to remove and selects from the storage resource group that number of slave nodes with the lightest storage tasks. It deletes the registration information of the slave nodes to be removed from every node in the cluster. It then deletes each virtual machine that implements a slave node to be removed and deletes those slave nodes from the storage resource group, for example by deleting the configuration information of each slave node to be removed, such as its computing resource information and storage resource information, from the locally maintained storage resource group information.

Embodiment 6: another specific application of the device for implementing dynamic cluster scaling in a distributed architecture.

In this embodiment, the master node of the cluster calculates the ratio X1 of the computing resources A1 occupied by all computing tasks in the current cluster to the total computing resources Z1 that all slave nodes in the current cluster can provide, calculates the ratio X2 of the storage space A2 occupied by the data stored on all slave nodes in the current cluster to the total storage resources Z2 that all slave nodes in the current cluster can provide, and reports the calculated ratios X1 and X2 to the demand determination device 600. The demand determination device 600 judges, from the received information, whether X1 exceeds the first threshold Y1, whether X1 is below the second threshold Y2, whether X2 exceeds the third threshold Y3, and whether X2 is below the fourth threshold Y4. The subsequent operations performed by the demand determination device 600 and the resource group adjustment device 610 are the same as described in Embodiment 5 above and are not repeated here.

It should be noted that the present invention may be implemented in software and/or a combination of software and hardware; for example, each device of the invention may be implemented using an application-specific integrated circuit (ASIC) or any other similar hardware device. In one embodiment, the software program of the invention may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the invention (including related data structures) may be stored in a computer-readable recording medium, such as RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the invention may be implemented in hardware, for example as a circuit that cooperates with a processor to perform each step or function.

It will be apparent to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced by the invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.

Although exemplary embodiments have been specifically shown and described above, those skilled in the art will understand that changes in form and detail may be made without departing from the spirit and scope of the claims. The protection sought here is set forth in the appended claims.

Claims

1. for realizing the method for cluster dynamic retractility in a kind of Distributed Architecture, wherein the described method comprises the following steps:

Determine that there are cluster dynamic retractility demands, wherein the cluster includes: multiple from node, and the multiple from node The resource service provided according to it is divided into computing resource group and storage resource group, described to determine that there are cluster dynamic retractilities The step of demand includes: to determine that there are cluster dynamic retractility demands according to the clustering performance information got；And/or root Determine that there are cluster dynamic retractility demands according to the resource adjustment control information received；

The slave number of nodes in the computing resource group and/or storage resource group is adjusted according to the cluster dynamic retractility demand；

Wherein, it is described according to the cluster dynamic retractility demand adjust in the computing resource group and/or storage resource group from The step of number of nodes includes:

Determine the slave number of nodes that computing resource group needs to expand；

Create the virtual machine of the quantity；

In the cluster by the registration information configuration of the virtual machine；

Start the distributed computing services of the virtual machine, alternatively, starting distributed computing services and the distribution of the virtual machine Formula storage service；

Using virtual machine as the slave node in computing resource group.
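
As an illustration only, the expansion steps of claim 1 can be sketched as follows. This is a minimal in-memory model, not the claimed implementation: the `Cluster` class, the `expand_compute_group` function, the `vm-N` host names, and the `"compute"`/`"storage"` service labels are all hypothetical names introduced for the sketch, and virtual machine creation is simulated rather than performed.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory model of a cluster (not defined by the claims)."""
    compute_group: list = field(default_factory=list)
    storage_group: list = field(default_factory=list)
    registry: set = field(default_factory=set)     # host names known to the cluster
    services: dict = field(default_factory=dict)   # host name -> started services
    next_id: int = 0                               # counter for simulated VM names


def expand_compute_group(cluster, count, with_storage=False):
    """Add `count` simulated virtual machines to the computing resource group."""
    new_vms = []
    for _ in range(count):
        # Create a virtual machine (simulated here by generating a host name).
        vm = f"vm-{cluster.next_id}"
        cluster.next_id += 1
        # Configure the VM's registration information in the cluster.
        cluster.registry.add(vm)
        # Start distributed computing, optionally together with distributed storage.
        cluster.services[vm] = ["compute"] + (["storage"] if with_storage else [])
        # Use the VM as a slave node in the computing resource group.
        cluster.compute_group.append(vm)
        new_vms.append(vm)
    return new_vms
```

Calling `expand_compute_group(cluster, 2)` on an empty `Cluster()` would register two simulated machines and place both in the computing resource group; passing `with_storage=True` models the alternative in which the storage service is started as well.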

2. The method according to claim 1, wherein the slave nodes in the computing resource group provide a distributed computing service, and the slave nodes in the storage resource group provide a distributed storage service and a distributed computing service.

3. The method according to claim 1, wherein the cluster dynamic scaling demand comprises at least one of: a demand to expand the slave nodes in the computing resource group, a demand to reduce the slave nodes in the computing resource group, a demand to expand the slave nodes in the storage resource group, and a demand to reduce the slave nodes in the storage resource group.

4. The method according to claim 1, wherein the cluster performance information comprises: the computing resource utilization of the cluster and/or the storage resource utilization of the cluster.
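
One way to picture how the utilization figures of claim 4 could be mapped onto the four demand types of claim 3 is the sketch below. The threshold comparison, the default values `low=0.3` and `high=0.8`, the function name `scaling_demands`, and the demand labels are assumptions made for illustration; the claims do not specify how the performance information is evaluated.

```python
def scaling_demands(compute_util, storage_util, low=0.3, high=0.8):
    """Map utilization values in [0, 1] to a list of scaling demands.

    `low` and `high` are hypothetical thresholds, not values from the claims.
    """
    demands = []
    if compute_util > high:
        demands.append("expand_compute")
    elif compute_util < low:
        demands.append("reduce_compute")
    if storage_util > high:
        demands.append("expand_storage")
    elif storage_util < low:
        demands.append("reduce_storage")
    return demands
```

Because the two groups are evaluated independently, a single check can yield a demand for one group, for both, or for neither.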

5. The method according to claim 1, wherein the step of configuring the registration information of the virtual machines in the cluster comprises:

Remotely logging in to the virtual machine;

Setting, in the virtual machine, a password-free login permission for the remote login side;

Configuring the registration information of the virtual machine, including its host name, in the master node and each slave node of the cluster.
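
The last step of claim 5 can be illustrated with the sketch below, under the assumption that each node keeps its registration information as hosts-file-style text with one `ip hostname` entry per line. That storage format, the function names `register_host` and `register_on_all`, and the example addresses are hypothetical; the remote login and password-free permission steps are not modeled.

```python
def register_host(hosts_text, ip, hostname):
    """Return hosts-style text that contains an `ip hostname` entry exactly once."""
    # Drop any existing entry for this host name so repeated calls stay idempotent.
    lines = [ln for ln in hosts_text.splitlines()
             if hostname not in ln.split()[1:]]
    lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


def register_on_all(node_hosts, ip, hostname):
    """Apply the entry to the text held by the master node and every slave node.

    `node_hosts` maps a node name to that node's current hosts-style text.
    """
    return {node: register_host(text, ip, hostname)
            for node, text in node_hosts.items()}
```

Making the update idempotent means a registration pass that is retried after a partial failure does not leave duplicate entries on nodes that were already configured.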

6. for realizing the method for cluster dynamic retractility in a kind of Distributed Architecture, wherein the described method comprises the following steps:

Determine the slave number of nodes for needing to reduce in computing resource group and/or storage resource group；

Being contracted by from node for respective numbers is chosen from respective sets；

For any one in cluster configured with the node being contracted by from the registration information of node, two are executed in the node The gap of a task, the registration information being contracted by from node in deletion of node；

All nodes in the cluster, which delete, to be contracted by the case where the registration information of node, delete realize be contracted by from The virtual machine of node, and will be contracted by from node and be deleted from computing resource group and/or storage resource group.
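
The ordering constraints in claim 6 (deregister only between tasks, delete the virtual machine only after every node has deregistered) can be sketched as below. The dictionary layout of a node (`running_task`, `registry`) and the function names are hypothetical, and the "gap between two tasks" is modeled simply as the node not currently running a task.

```python
def try_deregister(node, victim):
    """Delete `victim` from one node's registry, but only between tasks."""
    if node["running_task"]:
        return False  # node is mid-task; wait for the gap before its next task
    node["registry"].discard(victim)
    return True


def shrink_step(nodes, group, vms, victim):
    """Run one pass of the reduction flow.

    Returns True once `victim` has been fully removed, False if a retry is needed.
    """
    for node in nodes:
        if victim in node["registry"]:
            try_deregister(node, victim)
    if any(victim in node["registry"] for node in nodes):
        return False  # some node still holds the registration; retry later
    vms.discard(victim)       # delete the virtual machine behind the slave node
    if victim in group:
        group.remove(victim)  # remove the slave node from its resource group
    return True
```

A caller would repeat `shrink_step` until it returns `True`, so a node that is busy during one pass is simply revisited after its current task ends.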

7. The method according to claim 6, wherein:

The ratio of the total computing resources provided by the slave nodes to be reduced in the computing resource group to the total computing resources provided by all slave nodes in the cluster does not exceed a predetermined ratio;

The number of slave nodes to be reduced in the storage resource group is less than the difference between the number of all slave nodes in the storage resource group and the number of data replicas in the cluster.
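
The two limits of claim 7 amount to simple arithmetic checks, sketched below. The function and parameter names are hypothetical, and the unit of "computing resources" (for example, cores) is left to the caller.

```python
def compute_reduction_ok(removed_capacity, total_capacity, max_ratio):
    """Check that the removed share of computing resources stays within max_ratio."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return removed_capacity / total_capacity <= max_ratio


def storage_reduction_ok(remove_count, storage_nodes, replica_count):
    """Check that remove_count < (storage nodes) - (data replica count)."""
    return remove_count < storage_nodes - replica_count
```

For example, with 6 storage slave nodes and 3 data replicas, removing 2 nodes satisfies the storage condition (2 < 3), while removing 3 does not.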

8. A device for realizing dynamic cluster scaling in a distributed framework, comprising:

A device for determining that a cluster dynamic scaling demand exists, wherein the cluster comprises multiple slave nodes, the multiple slave nodes are divided into a computing resource group and a storage resource group according to the resource services they provide, and the step of determining that a cluster dynamic scaling demand exists comprises: determining that a cluster dynamic scaling demand exists according to acquired cluster performance information; and/or determining that a cluster dynamic scaling demand exists according to received resource adjustment control information;

A device for adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the cluster dynamic scaling demand;

Wherein the device for adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the cluster dynamic scaling demand comprises:

A device for determining the number of slave nodes by which the computing resource group needs to be expanded;

A device for creating that number of virtual machines;

A device for configuring the registration information of the virtual machines in the cluster;

A device for starting the distributed computing service of the virtual machines, or a device for starting the distributed computing service and the distributed storage service of the virtual machines;

A device for using the virtual machines as slave nodes in the computing resource group.

9. The device for realizing dynamic cluster scaling according to claim 8, wherein the slave nodes in the computing resource group provide a distributed computing service, and the slave nodes in the storage resource group provide a distributed storage service and a distributed computing service.

10. The device for realizing dynamic cluster scaling according to claim 8, wherein the cluster dynamic scaling demand comprises at least one of: a demand to expand the slave nodes in the computing resource group, a demand to reduce the slave nodes in the computing resource group, a demand to expand the slave nodes in the storage resource group, and a demand to reduce the slave nodes in the storage resource group.

11. The device for realizing dynamic cluster scaling according to claim 8, wherein the cluster performance information comprises: the computing resource utilization of the cluster and/or the storage resource utilization of the cluster.

12. The device for realizing dynamic cluster scaling according to claim 8, wherein the device for configuring the registration information of the virtual machines in the cluster comprises:

A device for remotely logging in to the virtual machine;

A device for setting, in the virtual machine, a password-free login permission for the remote login side;

A device for configuring the registration information of the virtual machine, including its host name, in the master node and each slave node of the cluster.

13. A device for realizing dynamic cluster scaling in a distributed framework, comprising:

Wherein the device for adjusting the number of slave nodes in the computing resource group and/or the storage resource group according to the cluster dynamic scaling demand comprises:

A device for determining the number of slave nodes that need to be reduced in the computing resource group and/or the storage resource group;

A device for selecting the corresponding number of slave nodes to be removed from the respective group;

A device for, with respect to any node in the cluster that is configured with the registration information of a slave node to be removed, deleting that registration information from the node in the gap between the node's execution of two tasks;

A device for, when all nodes in the cluster have deleted the registration information of the slave node to be removed, deleting the virtual machine that implements the slave node to be removed, and removing that slave node from the computing resource group and/or the storage resource group.

14. The device for realizing dynamic cluster scaling according to claim 13, wherein: