A dynamic replica control method for cloud computing platforms and a system thereof
Technical field
The present invention relates to the fields of distributed computing and cloud computing, and in particular to a dynamic replica control method for cloud computing platforms and a system thereof.
Background art
With the development of computer and network technology, the Internet keeps growing in scale and network bandwidth keeps increasing. As the Internet rapidly absorbs information of every kind, it forms a vast information space in which massive amounts of data are stored.
A traditional centralized storage system keeps data on a single device, and every access to or request for the data passes through that device. The device therefore carries a heavy load and becomes the bottleneck of the system, so it cannot meet the reliability and security requirements of mass storage. How to store these data so that users can quickly and efficiently find and obtain the resources they need is one of the problems that Internet development must solve.
Against this background, cloud storage has been proposed as a new solution. Using technologies such as cluster applications, grid technology, or distributed file systems, a cloud storage system relies on application software to bring together a large number of heterogeneous storage devices in the network so that they work cooperatively and jointly provide data storage and service access to the outside. This not only improves the reliability, availability, and access efficiency of the system but also makes it easy to scale.
In a cloud storage system, data replicas are an important component. How to create, delete, select, and manage replicas are the key tasks of replica management in a cloud storage system. Creating or deleting a replica at a suitable time and on a suitable node can effectively improve data access speed, reduce network bandwidth consumption, and balance system load, while keeping data availability high.
Replica management in cloud storage dynamically plans and adjusts replica resources according to the needs of the system and the parameters reported by the monitoring system, and is one of the key factors affecting the performance of a cloud computing system. Current cloud storage systems, especially those based on the HDFS architecture, still have several shortcomings in this respect.
The first is that the scheduling of storage resources lacks flexibility. Because users' demand differs from one data item to another, some data files become "hot" while others remain relatively "cold". Treating all data files alike in cloud storage replica management is therefore unreasonable, and a dynamic replica management strategy that distinguishes between different levels of data demand is needed.
The second is the choice of replica placement. A cloud storage system based on the HDFS architecture chooses replica locations at random and therefore does not account for the communication cost of user access.
In view of the above problems, it is of great significance to study a new cloud storage replica management method and a system thereof.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a dynamic replica control method for cloud computing platforms and a system thereof, which improve the reliability and availability of data in cloud computing.
The solution adopted to achieve the above object is as follows:
A dynamic replica control method for cloud computing platforms, characterized in that the method comprises the following steps:
I. setting the minimum replica availability R_exp;
II. obtaining the minimum number of replicas r_min within unit time t;
III. judging whether system resources are being wasted;
IV. obtaining the cost sequences from the replica nodes to the standby replica nodes;
V. obtaining the standby replica nodes on which replicas are to be placed;
VI. copying the replica to the standby replica nodes.
Further, step I comprises:
S101. determining the default number of replicas m of the current file f according to the design requirements and system performance;
S102. obtaining the replica node set F_A = {A_1, A_2, ..., A_m};
S103. taking the nodes adjacent to the replica nodes as the standby replica node set F_B = {B_1, B_2, ..., B_n}, and setting the minimum replica availability R_exp.
Further, step II comprises:
S201. recording, for each replica within unit time t, the read request count RN_i (0 < i < m), the effective read request count ERN_i (0 < i < m), the write request count WN_i (0 < i < m), and the effective write request count EWN_i (0 < i < m);
S202. obtaining the average availability of the current replicas from the read/write counts and the effective read/write counts within unit time t;
S203. obtaining the minimum number of replicas r_min from the minimum replica availability R_exp, as given by formula (1):
where RN_i (0 < i < m) is the read request count of a replica, ERN_i (0 < i < m) is the effective read request count of a replica, WN_i (0 < i < m) is the write request count of a replica, EWN_i (0 < i < m) is the effective write request count of a replica, m is the current number of replicas, and R_exp is the minimum replica availability.
Further, step III comprises:
S301. judging from the minimum number of replicas r_min and the current number of replicas m whether system resources are being wasted: if resources are wasted, proceed to S302; if they are balanced, proceed to S303; if there are too few replicas, proceed to S304;
S302. when r_min < m, system resources are being wasted; the average usage rate PN_i of each current replica is obtained from the effective read and write requests within time t, as given by formula (2), and the replica with the lowest average usage rate PN_i is marked as a hidden file; if it is not activated within the waiting period, it is deleted completely;
where ERN_i (0 < i < m) is the effective read request count of a replica and EWN_i (0 < i < m) is the effective write request count of a replica;
S303. when r_min = m, the replicas are left unchanged;
S304. when r_min > m, the average availability of the replicas is below the set minimum availability R_exp, and the system is checked for a hidden file of the current replica; if one exists, the hidden file is activated; if not, step IV is performed.
Further, step IV comprises:
S401. obtaining the communication time and the network hop count between two nodes from their IP addresses;
S402. obtaining, from the communication time and the network hop count, the update cost C_{X,Y} from replica node X to standby replica node Y and the read cost C_{u,Y} from user node u to standby replica node Y; the update cost and the read cost are both given by formula (3):
C_{X,Y} = RT_{X,Y}/α + HC_{X,Y}  (0 < α < 1)    (3)
where RT_{X,Y} is the round-trip time from node X to node Y and back to X; α is an influence factor set according to network conditions, with a value between 0 and 1; and HC_{X,Y} is the network hop count from X to Y;
S403. obtaining the total communication cost A_1B_1 from replica node A_1 to standby replica node B_1:
A_1B_1 = C_{A_1,B_1} + C_{u,B_1}    (4)
where C_{A_1,B_1} is the update cost from replica node A_1 to standby replica node B_1, and C_{u,B_1} is the read cost from user node u to standby replica node B_1;
S404. obtaining the cost sequence F_1 = {A_1B_1, A_1B_2, ..., A_1B_n} formed by the total costs A_1B_j (0 < j < n) from replica node A_1 to the standby replica nodes B_j;
S405. obtaining the cost sequences F_i = {A_iB_1, A_iB_2, ..., A_iB_n} (0 < i < m) from the replica nodes A_i (0 < i < m) to the standby replica nodes B_j (0 < j < n).
Further, step V comprises:
S501. sorting each cost sequence F_i in ascending order to obtain the cost sequence {A_iB_1', A_iB_2', ..., A_iB_n'}, and rejecting the last (n + m - r_min) elements of that sequence to obtain a new cost sequence F_i';
S502. merging the new cost sequences into the set MF = F_1' ∪ F_2' ∪ ... ∪ F_m', and sorting the set MF in ascending order to obtain the cost sequence G = {A_iB_j | A_iB_j ∈ MF, 1 ≤ i ≤ m, 1 ≤ j ≤ n};
S503. removing the last k elements of the cost sequence G, where k = (m - 1)(r_min - m), to obtain the cost sequence MF' = {A_iB_j' | A_iB_j' ∈ MF, 1 ≤ i ≤ m, 1 ≤ j ≤ n} of the standby replica nodes on which replicas are to be placed.
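As a check on the element counts in S501 to S503, the following worked example uses illustrative values m = 3, n = 4, and r_min = 5; these numbers are assumptions chosen only to make the arithmetic concrete.

```latex
\begin{align*}
r_{\min} - m &= 5 - 3 = 2 && \text{(replicas still to be placed)} \\
n + m - r_{\min} &= 4 + 3 - 5 = 2 && \text{(elements rejected from each sorted } F_i\text{)} \\
|MF| &= m\,(r_{\min} - m) = 3 \times 2 = 6 && \text{(elements after merging } F_1', F_2', F_3'\text{)} \\
k &= (m - 1)(r_{\min} - m) = 2 \times 2 = 4 && \text{(elements removed from the end of } G\text{)} \\
|MF'| &= |MF| - k = 6 - 4 = 2 = r_{\min} - m
\end{align*}
```

The final sequence therefore contains exactly as many entries as there are replicas still to be placed.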
A new cloud storage system applying the above dynamic replica control method for cloud computing platforms, characterized in that the system comprises a monitoring system and a data storage system connected through a user agent module.
Further, the monitoring system comprises a replica placement module, a replica monitoring system, and a replica deletion module; the replica monitoring system sends information to the replica placement module and to the replica deletion module, each of which in turn sends information to the user agent module; the user agent module sends information to the replica monitoring system.
Further, the data storage system comprises replica nodes and standby replica nodes corresponding to the replica nodes; the user agent module communicates with each node of the data storage system.
Compared with the prior art, the present invention has the following beneficial effects:
(1) During network peak periods or machine failures, the method of the present invention can add replicas dynamically, preventing users from being unable to access data or having to wait; when the access volume falls, it can dynamically delete replicas with low usage rates, avoiding unnecessary resource consumption while still maintaining high availability.
(2) When placing replicas, the method of the present invention places them on nodes with low read cost and low update cost, which reduces the overall communication cost of the network and improves replica availability; the dynamic replica model proposed by the present invention has a considerable advantage over traditional fixed replica placement in both storage space utilization and communication cost.
(3) The method of the present invention applies a dynamic storage model that monitors replica availability in real time and calculates the current optimal number of replicas; when the optimal number exceeds the current number of replicas, the replica replication strategy is used, and when the optimal number is lower than the current number of replicas, the replica deletion strategy is used; the response is timely and flexible, and resource access and data availability are ensured.
(4) By applying the dynamic storage model, the method of the present invention substantially lowers the network bandwidth requirement while maintaining high availability; more importantly, even when replica nodes fail simultaneously, the method can still recover data quickly and ensure data availability.
(5) The method of the present invention takes into account both data hot spots and the choice of replica placement; the proposed replica management model can add and delete replicas dynamically, which greatly reduces network bandwidth demand, saves network resources, ensures data availability, and makes the system run more efficiently and more safely.
Brief description of the drawings
Fig. 1 is a structural diagram of the dynamic replica control system for cloud computing platforms;
Fig. 2 is a flow chart of dynamic replica control for cloud computing platforms.
Detailed description of the invention
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a structural diagram of the dynamic replica control system for cloud computing platforms. As shown in Fig. 1, the system comprises a user agent module, a monitoring system, and a data storage system.
The monitoring system comprises a replica placement module, a replica monitoring system, and a replica deletion module. The replica monitoring system sends information to the replica placement module and to the replica deletion module, each of which in turn sends information to the user agent module; the user agent module sends information to the replica monitoring system.
The data storage system comprises replica nodes and standby replica nodes corresponding to the replica nodes (replica node 1 and standby replica node 1, replica node 2 and standby replica node 2, ..., replica node N and standby replica node N). The user agent module communicates with each node of the data storage system.
Fig. 2 is a flow chart of dynamic replica control for cloud computing platforms. As shown in Fig. 2, the dynamic replica control method for cloud computing platforms effectively improves replica availability while reducing unnecessary resource consumption in deployment. The replica monitoring system monitors the availability of each replica in real time, so that the current optimal number of replicas can be calculated; when there are too many replicas, the replica with the lowest usage rate is deleted, and when there are fewer replicas than the optimal number, replicas are added dynamically and placed at suitable locations. This maintains high availability while reducing unnecessary resource consumption and saving cost. The method comprises the following steps:
Step 1: set the minimum replica availability R_exp.
According to the design requirements and system performance, determine the default number of replica nodes m of the current file f; the replica node set is F_A = {A_1, A_2, ..., A_m}, and the nodes adjacent to each replica node form the standby replica node set F_B = {B_1, B_2, ..., B_n}.
Set the minimum replica availability R_exp according to the replica availability the system requires, for example 95% or 98%.
Step 2: obtain the minimum number of replicas r_min within unit time t from the read request counts RN_i (0 < i < m), the effective read request counts ERN_i (0 < i < m), the write request counts WN_i (0 < i < m), and the effective write request counts EWN_i (0 < i < m) of the replicas.
Obtain in real time the number m of currently available replicas, and record, for each replica within unit time t, the read request count RN_i (0 < i < m), the effective read request count ERN_i (0 < i < m), the write request count WN_i (0 < i < m), and the effective write request count EWN_i (0 < i < m).
The average availability of the current replicas is computed from the read/write counts and the effective read/write counts within unit time t. The minimum number of replicas r_min is then determined from the minimum replica availability R_exp, as given by formula (1):
where RN_i (0 < i < m) is the read request count of a replica, ERN_i (0 < i < m) is the effective read request count, WN_i (0 < i < m) is the write request count, EWN_i (0 < i < m) is the effective write request count, m is the current number of replicas, and R_exp is the minimum replica availability.
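Formula (1) itself is not reproduced in this text, so the sketch below substitutes a common independence-based availability model purely for illustration. Both the definition of average availability (effective requests divided by total requests) and the rule "smallest r with 1 - (1 - p)^r >= R_exp" are assumptions, not necessarily the formula used by the invention; the function names are hypothetical.

```python
from typing import Sequence


def average_availability(rn: Sequence[int], ern: Sequence[int],
                         wn: Sequence[int], ewn: Sequence[int]) -> float:
    """Assumed model: fraction of read/write requests served effectively."""
    total = sum(rn) + sum(wn)
    if total == 0:
        return 1.0  # assumption: with no requests, nothing has failed
    return (sum(ern) + sum(ewn)) / total


def min_replicas(p: float, r_exp: float) -> int:
    """Smallest r such that 1 - (1 - p)**r >= r_exp.

    Assumes replicas fail independently, each available with probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < r_exp < 1.0:
        raise ValueError("r_exp must be in (0, 1)")
    r = 1
    while 1.0 - (1.0 - p) ** r < r_exp:
        r += 1
    return r


if __name__ == "__main__":
    p = average_availability([10, 10], [8, 9], [5, 5], [4, 3])
    print(p, min_replicas(p, 0.95))
```

Under this assumed model, a lower measured availability or a higher target R_exp both push r_min upward, which is the qualitative behavior the step describes.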
Step 3: judge from the minimum (optimal) number of replicas r_min and the current number of replicas m whether system resources are being wasted.
When r_min < m, there are too many replicas and system resources are being wasted; perform step 4 to delete a replica.
When r_min = m, system resource consumption and replica availability are in balance; leave the replicas unchanged.
When r_min > m, there are too few replicas and the average availability of the replicas is below the set minimum availability R_exp; perform step 5 to add replicas.
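The three-way comparison in step 3 can be sketched as a small dispatch function. The function name and the returned labels are hypothetical and serve only to make the branching explicit.

```python
def replica_action(r_min: int, m: int) -> str:
    """Map the comparison of r_min and m to the action described in step 3."""
    if r_min < m:
        return "delete"  # too many replicas: resources are wasted (step 4)
    if r_min == m:
        return "keep"    # consumption and availability are in balance
    return "add"         # too few replicas: availability below R_exp (step 5)


if __name__ == "__main__":
    for r_min, m in [(2, 3), (3, 3), (5, 3)]:
        print(r_min, m, replica_action(r_min, m))
```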
Step 4: delete a replica.
The average usage rate PN_i of each current replica is computed from the effective read and write request counts within unit time t, as given by formula (2), and the replica with the lowest usage rate PN_i is marked as a hidden file.
where RN_i (0 < i < m) is the read request count of a replica, ERN_i (0 < i < m) is the effective read request count, EWN_i (0 < i < m) is the effective write request count, m is the number of currently available replicas, and t is the unit time.
Replicas are deleted according to their popularity: the waiting period before deletion is set from the replica's effective read and write counts within time t, and a hidden replica that is not activated within this period is deleted completely. The waiting period thus varies with the read/write pattern: if the replica is read often and written rarely, the waiting period tends toward the effective read count multiplied by t, i.e. ERN_i × t; if it is written often and read rarely, the waiting period tends toward EWN_i × t; and if reads and writes are about equal, it tends toward (EWN_i + ERN_i) × t.
Step 5: add replicas.
Check whether the system contains a hidden file of the current replica. If so, activate the hidden file; if not, perform the subsequent steps.
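The hidden-file lifecycle of steps 4 and 5 (hide the least-used replica, reactivate it on demand, purge it once its waiting period expires) can be sketched as follows. Formula (2) and the exact waiting-period expression are not reproduced in this text, so the usage rate PN_i = (ERN_i + EWN_i) / t and the piecewise waiting period below are assumptions based only on the limiting cases described; the `Replica` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    node: str                          # hypothetical node label, e.g. "A2"
    ern: int                           # effective read requests in unit time t
    ewn: int                           # effective write requests in unit time t
    hidden: bool = False               # True once marked as a hidden file
    delete_at: Optional[float] = None  # time at which a hidden replica is purged


def usage_rate(rep: Replica, t: float) -> float:
    # Assumed form of PN_i: effective requests per unit time.
    return (rep.ern + rep.ewn) / t


def waiting_period(rep: Replica, t: float) -> float:
    # Piecewise reading of the three limiting cases given in the text.
    if rep.ern > rep.ewn:
        return rep.ern * t
    if rep.ewn > rep.ern:
        return rep.ewn * t
    return (rep.ern + rep.ewn) * t


def hide_least_used(replicas: List[Replica], t: float, now: float) -> Replica:
    """Step 4: mark the visible replica with the lowest usage rate as hidden."""
    visible = [r for r in replicas if not r.hidden]
    victim = min(visible, key=lambda r: usage_rate(r, t))
    victim.hidden = True
    victim.delete_at = now + waiting_period(victim, t)
    return victim


def activate_hidden(replicas: List[Replica]) -> Optional[Replica]:
    """Step 5: reactivate a hidden replica if one exists, else return None."""
    for r in replicas:
        if r.hidden:
            r.hidden = False
            r.delete_at = None
            return r
    return None


def purge_expired(replicas: List[Replica], now: float) -> List[Replica]:
    """Drop hidden replicas whose waiting period has elapsed."""
    return [r for r in replicas
            if not (r.hidden and r.delete_at is not None and r.delete_at <= now)]
```

Keeping a replica hidden for a while before deleting it means that a short dip in demand can be reversed by reactivation, without paying the cost of copying the data again.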
Step 6: obtain the communication time and network hop count between nodes from their IP addresses, and from them the update cost of a node.
The communication time and network hop count between two nodes are obtained from their IP addresses, and formula (3) is then used to compute the update cost C_{A_i,B_j} from replica node A_i to standby replica node B_j and the read cost C_{u,B_j} from user node u to standby replica node B_j. The update cost and the read cost are both given by formula (3):
C_{X,Y} = RT_{X,Y}/α + HC_{X,Y}  (0 < α < 1)    (3)
where RT_{X,Y} is the round-trip time from node X to node Y and back to X, α is an influence factor set according to network conditions with a value between 0 and 1, and HC_{X,Y} is the network hop count from X to Y.
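Formula (3) translates directly into a one-line function. In the sketch below the function name is hypothetical, and the unit of the round-trip time is an assumption (milliseconds), since the text does not state one.

```python
def link_cost(rtt: float, hops: int, alpha: float) -> float:
    """Formula (3): C_{X,Y} = RT_{X,Y} / alpha + HC_{X,Y}, with 0 < alpha < 1.

    rtt   -- round-trip time between X and Y (unit assumed: milliseconds)
    hops  -- network hop count from X to Y
    alpha -- influence factor chosen according to network conditions
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return rtt / alpha + hops


if __name__ == "__main__":
    # A 20 ms round trip over 4 hops with alpha = 0.5.
    print(link_cost(20.0, 4, 0.5))
```

Because α lies between 0 and 1, dividing by it amplifies the round-trip term, so a smaller α weights latency more heavily relative to hop count.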
Step 7: obtain the cost sequences from the replica nodes A_i (0 < i < m) to the standby replica nodes B_j (0 < j < n).
The total communication cost A_iB_j incurred for standby replica node B_j (0 < j < n) comprises the update cost C_{A_i,B_j} from replica node A_i to standby replica node B_j and the read cost C_{u,B_j} from user node u to standby replica node B_j. The total communication cost A_iB_j is given by formula (4):
A_iB_j = C_{A_i,B_j} + C_{u,B_j}    (4)
where A_iB_j is the total cost from replica node A_i to standby replica node B_j (0 < j < n), C_{A_i,B_j} is the update cost from replica node A_i to standby replica node B_j, and C_{u,B_j} is the read cost from user node u to standby replica node B_j.
From the total communication cost of each standby replica node, the cost sequence F_1 = {A_1B_1, A_1B_2, ..., A_1B_n} formed by the total costs A_1B_j from node A_1 to the standby replica nodes B_j (0 < j < n) is obtained; in the same way, the cost sequences F_i = {A_iB_1, A_iB_2, ..., A_iB_n} (0 < i < m) from the replica nodes A_i (0 < i < m) to the standby replica nodes B_j (0 < j < n) are obtained.
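Building the cost sequences F_i amounts to adding, for every pair (A_i, B_j), the update cost and the read cost as in formula (4). The sketch below assumes the costs have already been computed and are supplied as plain lists; the function name and the zero-based list layout are illustrative choices.

```python
from typing import List


def cost_sequences(update: List[List[float]], read: List[float]) -> List[List[float]]:
    """Return F where F[i][j] = A_iB_j = C_{A_i,B_j} + C_{u,B_j} (formula (4)).

    update[i][j] -- update cost from replica node A_i to standby node B_j
    read[j]      -- read cost from user node u to standby node B_j
    Indices are zero-based here, unlike the one-based subscripts in the text.
    """
    for row in update:
        if len(row) != len(read):
            raise ValueError("each row of update needs one entry per standby node")
    return [[u_ij + read[j] for j, u_ij in enumerate(row)] for row in update]


if __name__ == "__main__":
    update = [[10.0, 20.0], [30.0, 5.0]]  # two replica nodes, two standby nodes
    read = [1.0, 2.0]
    print(cost_sequences(update, read))
```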
Step 8: perform a preliminary pruning of each cost sequence F_i = {A_iB_1, A_iB_2, ..., A_iB_n} (0 < i < m). First, sort the cost sequence F_i in ascending order to obtain a new cost sequence {A_iB_1', A_iB_2', ..., A_iB_n'}. Second, because the current number of replicas is m and the minimum required number is r_min, r_min - m more replicas must be selected; each sequence has n elements, so the last n - (r_min - m) elements are rejected, giving a new cost sequence F_i' of the r_min - m lowest-cost elements.
The m new cost sequences obtained after the preliminary pruning are merged into one set, MF = F_1' ∪ F_2' ∪ ... ∪ F_m', and the set MF is sorted in ascending order to obtain the cost sequence {A_iB_j | A_iB_j ∈ MF, 1 ≤ i ≤ m, 1 ≤ j ≤ n}.
Let G = {A_iB_j | A_iB_j ∈ MF, 1 ≤ i ≤ m, 1 ≤ j ≤ n}, and reject its last k elements, where k = (m - 1)(r_min - m). This yields the r_min - m standby replica nodes on which replicas of f are to be placed, giving the new cost sequence MF' = {A_iB_j' | A_iB_j' ∈ MF, 1 ≤ i ≤ m, 1 ≤ j ≤ n}; the replica f is then copied to the nodes in the set MF'.
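The pruning, merging, and final cut of step 8 can be sketched as below. The function name and the (cost, i, j) tuple representation are hypothetical, and indices are zero-based. The sketch follows the text literally, so it does not check whether the same standby node B_j is selected through two different replica nodes A_i.

```python
from typing import List, Tuple


def select_standby_nodes(F: List[List[float]], r_min: int) -> List[Tuple[float, int, int]]:
    """Pick the r_min - m cheapest (A_i, B_j) pairs as described in step 8.

    F[i][j] is the total cost A_iB_j. Returns (cost, i, j) tuples in
    ascending order of cost.
    """
    m = len(F)
    n = len(F[0]) if m else 0
    need = r_min - m                  # replicas still to be placed
    if need <= 0:
        return []                     # nothing to add
    if need > n:
        raise ValueError("not enough standby nodes for r_min")

    merged: List[Tuple[float, int, int]] = []
    for i, row in enumerate(F):
        # Sort F_i ascending and reject its last n - (r_min - m) elements.
        ranked = sorted((cost, i, j) for j, cost in enumerate(row))
        merged.extend(ranked[:need])

    merged.sort()                     # sequence G, ascending
    k = (m - 1) * need                # number of trailing elements to reject
    return merged[:len(merged) - k]   # MF': the need cheapest pairs


if __name__ == "__main__":
    F = [[11, 22, 5, 40],
         [31, 7, 9, 12],
         [8, 50, 60, 3]]
    print(select_standby_nodes(F, r_min=5))
```

With m = 3, n = 4, and r_min = 5, each sorted sequence keeps its 2 cheapest entries, the merged sequence has 6 entries, and removing k = 4 of them leaves the 2 cheapest placements.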
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present application and do not limit its scope of protection. Although the application has been described in detail with reference to the above embodiments, those of ordinary skill in the art will understand that, after reading the application, those skilled in the art may still make various changes, modifications, or equivalent substitutions to the specific embodiments of the application, and all such changes, modifications, or equivalent substitutions fall within the scope of the pending claims of the application.