CN104767806B - Method, device and system for cloud data center task backup

Publication number
CN104767806B
Authority
CN
China
Prior art keywords
server
backup
task
time
data center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510147743.6A
Other languages
Chinese (zh)
Other versions
CN104767806A (en
Inventor
夏云霓
周刚
罗辛
俞可
朱庆生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quzhou Haiyi Technology Co ltd
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN201510147743.6A
Publication of CN104767806A
Application granted
Publication of CN104767806B

Landscapes

  • Hardware Redundancy (AREA)
  • Computer And Data Communications (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method, device and system for cloud data center task backup, belonging to the field of cloud computing system control. The invention first obtains the historical failure occurrence information of each server in the cloud data center; calculates the inter-failure times and the per-unit-time multi-step inter-failure time growth rate; calculates the waiting time before the next task backup; calculates the recent maximum number of failures of a single server, the recent minimum number of failures of a single server, the number of the backup source server and the number of the backup destination server; and then performs the task backup. Based on the trend of each server's reliability in the cloud data center, the invention can assess the risk of a future overall system collapse, carry out preventive task backup in advance, and set a variable control interval, avoiding the two extremes of "excessively frequent backup" and "insufficient backup".

Description

Method, device and system for cloud data center task backup
Technical field
The present invention belongs to the field of cloud computing system control, and in particular relates to a method, device and system for cloud data center task backup.
Background art
Cloud computing is an Internet-based mode of computing in which shared software and hardware resources and information are provided to computers and other devices on demand. Compared with traditional software and computing forms, cloud computing offers significant advantages such as loose coupling, on-demand use, controllable cost, resource virtualization and heterogeneous collaboration, which make it better suited to today's applications such as e-commerce, flexible manufacturing and the mobile Internet.
A cloud data center is a distributed computing system, formed by multiple heterogeneous servers connected through a network, that hosts enterprise-level applications providing online cloud services. In a cloud data center, a large number of servers are managed centrally and uniformly, which ensures the stable power supply, suitable temperature and humidity control, and network bandwidth conditions that the servers need to run.
Like other software and hardware systems, the servers in a cloud data center are also at risk of faults and failures. Because cloud computing systems are now increasingly used for high-load, high-complexity applications such as large-scale scientific computing, real-time finance, online transactions and streaming media multicast, servers often run overloaded, so faults and failures occur more frequently and cause larger losses. In addition, because the temporal and spatial distribution of cloud task requests is irregular and subject to human contingency, the real-time load of a cloud system fluctuates dynamically, which in turn causes the reliability characteristics and the fault and failure risk of the servers in the data center to fluctuate arbitrarily over time, making preventive control and disaster avoidance difficult. Existing task backup techniques often fail to capture the dynamic reliability trend of each server in the data center and suffer from the problems of "excessively frequent backup" and "insufficient backup". To guard against a server faulting or failing in the near term, the management policy often backs up the tasks on servers identified as high-risk to other servers too frequently; these task migration and backup activities themselves incur high system overhead, and a server identified as high-risk may in fact not fail in the near future yet sits idle because its tasks have been moved out, which is wasteful. Conversely, if the likelihood of server faults and failures is underestimated, backup may be insufficient: when a server fault or failure arrives, many tasks have not yet been moved out, so the running tasks fail along with it, eventually leading to an overall system collapse.
Existing technical solutions mainly have the following shortcomings:
(1) Most use fixed-period control. Existing methods mostly preset a fixed interval and perform periodic task backup. However, because the system load varies dynamically, a control strategy with a fixed interval often struggles to respond quickly to sudden short-term changes in server reliability;
(2) They lack a mechanism for quantitative trend prediction. Existing techniques do not adequately analyze, model and predict trends from the servers' historical reliability data; most mechanically use the historical average or the most recent data as the basis for control decisions.
Against this background, how to dynamically track the reliability state of each server in a cloud data center, set reasonable task backup timing, avoid the two extremes of excessive frequency and insufficiency, and ultimately improve the overall reliability of the cloud data center without substantially increasing system operating overhead has become a focus and a difficulty of research.
Summary of the invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is to provide a cloud data center task backup method that can improve the reliability of the cloud data center.
To achieve the above object, the present invention provides a method for cloud data center task backup, comprising the following steps:
Step 1: obtain the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred; k is a positive integer;
Step 2: calculate the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
Step 3: calculate the waiting time dt before the next task backup:
where csz is the system's default backup interval and m is the number of servers in the data center;
Step 4: calculate the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server:
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
Step 5: perform the task backup: if at least one of yxh and mdxh is 0, do nothing; if neither yxh nor mdxh is 0, back up the tasks currently executing on server yxh to server mdxh; then wait for the time dt and return to Step 1.
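Steps 1 to 5 form a repeating control loop. The following Python sketch illustrates only that loop structure. The callables `get_history`, `compute_wait`, `choose_pair` and `backup`, and the `rounds` and `sleep` parameters, are hypothetical names introduced for illustration; the patent does not define them, and the loop is bounded here (the method itself returns to Step 1 indefinitely) so that the sketch terminates.

```python
import time
from typing import Callable, List, Sequence, Tuple

# (ft_1..ft_k, fn_1..fn_k): failure times and the server numbers they hit.
History = Tuple[Sequence[float], Sequence[int]]


def run_backup_rounds(
    get_history: Callable[[], History],
    compute_wait: Callable[[Sequence[float], Sequence[int]], float],
    choose_pair: Callable[[Sequence[int]], Tuple[int, int]],
    backup: Callable[[int, int], None],
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a bounded number of rounds of Steps 1-5; return backups performed."""
    performed = 0
    for _ in range(rounds):
        ft, fn = get_history()          # Step 1: fetch failure history
        dt = compute_wait(ft, fn)       # Steps 2-3: caller supplies the dt rule
        yxh, mdxh = choose_pair(fn)     # Step 4: source and destination numbers
        if yxh != 0 and mdxh != 0:      # Step 5: skip when either number is 0
            backup(yxh, mdxh)
            performed += 1
        sleep(dt)                       # wait dt, then start the next round
    return performed


# Demonstration with stand-in callables and no real sleeping.
moves: List[Tuple[int, int]] = []
done = run_backup_rounds(
    get_history=lambda: ([0.0, 1.0, 3.0], [2, 2, 1]),
    compute_wait=lambda ft, fn: 0.0,
    choose_pair=lambda fn: (2, 3),
    backup=lambda src, dst: moves.append((src, dst)),
    rounds=2,
    sleep=lambda seconds: None,
)
print(done, moves)
```

Passing the decision rules in as callables keeps the loop independent of the exact formulas, which are not reproduced in this text.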
Preferably, in Step 2, abnormal points are determined as follows:
Calculate the average positive and negative fluctuation intensities, bp and bn, of the inter-failure time sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point;
xs is a preset coefficient and is a positive integer.
Another technical problem to be solved by the present invention is to provide a cloud data center task backup device that can improve the reliability of the cloud data center.
To achieve the above object, the present invention provides a cloud data center task backup device comprising a failure monitoring unit, a control decision module and a task backup module; the output of the failure monitoring unit is connected to the input of the control decision module, and the output of the control decision module is connected to the input of the task backup module;
The failure monitoring unit is used to obtain the historical failure occurrence information of each server in the cloud data center;
The control decision module is used to analyze the risk of future failure of each server in the data center, calculate the waiting time before the next task backup, and calculate control decision reference values;
The task backup module is used to perform task backup between servers.
Preferably, the control decision module comprises a risk analysis unit, a control timing decision unit and a control quantity calculation unit;
A first output of the failure monitoring unit is connected to the input of the risk analysis unit; a second output of the failure monitoring unit is connected to a first input of the control timing decision unit; a third output of the failure monitoring unit is connected to the input of the control quantity calculation unit; the output of the risk analysis unit is connected to a second input of the control timing decision unit; the output of the control timing decision unit is connected to a first input of the task backup module, and the output of the control quantity calculation unit is connected to a second input of the task backup module;
The failure monitoring unit obtains the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred; k is a positive integer;
The risk analysis unit calculates the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
where mean{ } takes the average of a set;
The control timing decision unit calculates the waiting time dt before the next task backup:
where csz is the system's default backup interval and m is the number of servers in the data center;
The control quantity calculation unit calculates the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server:
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
The task backup module performs the task backup: if at least one of yxh and mdxh is 0, it does nothing; if neither yxh nor mdxh is 0, it backs up the tasks currently executing on server yxh to server mdxh.
Preferably, the risk analysis unit determines abnormal points as follows:
Calculate the average positive and negative fluctuation intensities, bp and bn, of the inter-failure time sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point;
xs is a preset coefficient and is a positive integer.
A further technical problem to be solved by the present invention is to provide a cloud data center task backup system that can improve the reliability of the cloud data center.
To achieve the above object, the present invention provides a cloud data center task backup system comprising a cloud data center server in which a cloud data center backup device is provided; the cloud data center backup device comprises a failure monitoring unit, a control decision module and a task backup module; the output of the failure monitoring unit is connected to the input of the control decision module, and the output of the control decision module is connected to the input of the task backup module;
The failure monitoring unit is used to obtain the historical failure occurrence information of each server in the cloud data center;
The control decision module is used to analyze the risk of future failure of each server in the data center, calculate the waiting time before the next task backup, and calculate control decision reference values;
The task backup module is used to perform task backup between servers.
Preferably, the control decision module comprises a risk analysis unit, a control timing decision unit and a control quantity calculation unit;
A first output of the failure monitoring unit is connected to the input of the risk analysis unit; a second output of the failure monitoring unit is connected to a first input of the control timing decision unit; a third output of the failure monitoring unit is connected to the input of the control quantity calculation unit; the output of the risk analysis unit is connected to a second input of the control timing decision unit; the output of the control timing decision unit is connected to a first input of the task backup module, and the output of the control quantity calculation unit is connected to a second input of the task backup module;
The failure monitoring unit obtains the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred; k is a positive integer;
The risk analysis unit calculates the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
where mean{ } takes the average of a set;
The control timing decision unit calculates the waiting time dt before the next task backup:
where csz is the system's default backup interval and m is the number of servers in the data center;
The control quantity calculation unit calculates the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server:
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
The task backup module performs the task backup: if at least one of yxh and mdxh is 0, it does nothing; if neither yxh nor mdxh is 0, it backs up the tasks currently executing on server yxh to server mdxh.
Preferably, the risk analysis unit determines abnormal points as follows:
Calculate the average positive and negative fluctuation intensities, bp and bn, of the inter-failure time sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point; xs is a preset coefficient and is a positive integer.
The beneficial effects of the present invention are as follows: the invention takes full account of the dynamic fluctuation of system reliability and predicts a reasonable task backup frequency by tracking its trend; it also eliminates the influence of abnormal points in the historical reliability data, ensuring the accuracy of the trend prediction. Based on the trend of each server's reliability in the cloud data center, the invention can assess the risk of a future overall system collapse, carry out preventive task backup in advance, and set a variable control interval, avoiding the two extremes of "excessively frequent backup" and "insufficient backup".
Description of the drawings
Fig. 1 is a flow diagram of a specific embodiment of the cloud data center task backup method of the present invention.
Fig. 2 is a schematic diagram of a specific embodiment of the cloud data center task backup device of the present invention.
Fig. 3 is a schematic diagram of a specific embodiment of the cloud data center task backup system of the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and embodiments:
As shown in Fig. 1, a method for cloud data center task backup comprises the following steps:
Step 1: obtain the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred; k is a positive integer. In this embodiment, k is an arbitrary natural number between 10 and 100.
Step 2: calculate the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
where mean{ } takes the average of a set.
Step 3: calculate the waiting time dt before the next task backup:
where csz is the system's default backup interval; in this embodiment, csz takes any value between 0.1 and 1 second. m is the number of servers in the data center. The intuitive meaning of the above formula is as follows: if the expected number of failures occurring within the time csz is less than 0.7 times the number of servers in the entire data center, the risk of system failure is considered low, and the next round of backup control is still carried out after the predetermined waiting time csz; otherwise, the ratio of the expected number of failures occurring within csz to m is computed using the maximum historical inter-failure time growth rate, and csz divided by this ratio is used as the waiting time before the next round of backup control.
Step 4: calculate the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server:
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
Step 5: perform the task backup: if at least one of yxh and mdxh is 0, do nothing; if neither yxh nor mdxh is 0, back up the tasks currently executing on server yxh to server mdxh; then wait for the time dt and return to Step 1.
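The verbal description of the Step 3 rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's formula, which is not reproduced in this text. The function name `next_wait_time` and the parameter `expected_failures` are hypothetical; how the expected number of failures within csz is derived from the growth rate is not recoverable here, so it is taken as an input.

```python
def next_wait_time(
    expected_failures: float,
    m: int,
    csz: float,
    threshold: float = 0.7,
) -> float:
    """Return the waiting time dt before the next backup round.

    expected_failures: assumed input, the expected number of failures
        occurring within the default interval csz.
    m: number of servers in the data center.
    csz: default backup interval of the system.
    threshold: the 0.7 factor quoted in the description.
    """
    if m <= 0:
        raise ValueError("m must be a positive number of servers")
    if csz <= 0:
        raise ValueError("csz must be a positive interval")
    if expected_failures < threshold * m:
        # Low risk: keep the default interval.
        return csz
    # Otherwise divide csz by the ratio of expected failures to m.
    ratio = expected_failures / m
    return csz / ratio


print(next_wait_time(expected_failures=2.0, m=10, csz=1.0))
print(next_wait_time(expected_failures=20.0, m=10, csz=1.0))
```

With 10 servers and csz of 1 second, 2 expected failures keeps the default interval, while 20 expected failures gives a ratio of 2 and halves the wait.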
Because the operation of a real cloud computing system is affected by many systematic factors (abnormal message delays, changes in connection bandwidth, contention for computing resources, etc.) and non-systematic factors (accidental failures of the system, software or hardware, information loss, etc.), some recorded values in the above inter-failure time sequence deviate markedly from the overall trend. These abnormal points cannot be analyzed and evaluated as ordinary routine data and should be removed. The abnormal points are determined as follows:
Calculate the average positive and negative fluctuation intensities bp and bn of the inter-failure time sequence, where bp is the average positive fluctuation intensity and bn is the average negative fluctuation intensity of the sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point;
xs is a preset coefficient and is a positive integer. In this embodiment, xs is 10.
As shown in Fig. 2, a cloud data center task backup device comprises a failure monitoring unit 3, a control decision module 4 and a task backup module 5; the output of the failure monitoring unit 3 is connected to the input of the control decision module 4, and the output of the control decision module 4 is connected to the input of the task backup module 5.
The failure monitoring unit 3 is used to obtain the historical failure occurrence information of each server in the cloud data center.
The control decision module 4 is used to analyze the risk that each server in the data center will fail in the future, calculate the waiting time before the next task backup, and calculate control decision reference values.
The task backup module 5 is used to perform task backup between servers.
The control decision module 4 comprises a risk analysis unit 401, a control timing decision unit 402 and a control quantity calculation unit 403.
A first output of the failure monitoring unit 3 is connected to the input of the risk analysis unit 401; a second output of the failure monitoring unit 3 is connected to a first input of the control timing decision unit 402; a third output of the failure monitoring unit 3 is connected to the input of the control quantity calculation unit 403; the output of the risk analysis unit 401 is connected to a second input of the control timing decision unit 402; the output of the control timing decision unit 402 is connected to a first input of the task backup module 5, and the output of the control quantity calculation unit 403 is connected to a second input of the task backup module 5.
The failure monitoring unit 3 obtains the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred, and sends $ft_1, ft_2, \ldots, ft_k$ and $fn_1, fn_2, \ldots, fn_k$ to the risk analysis unit, the control timing decision unit and the control quantity calculation unit; k is a positive integer. In this embodiment, k is an arbitrary natural number between 10 and 100.
The risk analysis unit 401 calculates the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
where mean{ } takes the average of a set;
The risk analysis unit 401 then sends the value of zzl to the control timing decision unit.
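The inter-failure time computation performed by the risk analysis unit, $if_i = ft_{i+1} - ft_i$, can be sketched in Python as follows. The function name `inter_failure_times` and the ordering check are illustrative assumptions; the growth rate zzl is not computed here because the definition of $z_{i,j}$ is not reproduced in this text.

```python
from typing import List, Sequence


def inter_failure_times(ft: Sequence[float]) -> List[float]:
    """Return [ft_2 - ft_1, ..., ft_k - ft_{k-1}] for k failure times.

    The input is assumed to be ordered from oldest to newest failure.
    """
    pairs = list(zip(ft, ft[1:]))
    if any(later < earlier for earlier, later in pairs):
        raise ValueError("failure times must be in non-decreasing order")
    return [later - earlier for earlier, later in pairs]


# Five failures observed at increasing times yield four intervals.
print(inter_failure_times([0.0, 4.0, 7.0, 9.0, 10.0]))
```

A sequence of k failure times yields k-1 intervals, matching the index range $0 < i \le k-1$.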
The control timing decision unit 402 calculates the waiting time dt before the next task backup:
where csz is the system's default backup interval; in this embodiment, csz takes any value between 0.1 and 1 second. m is the number of servers in the data center. The control timing decision unit 402 sends the value of dt to the task backup module.
The control quantity calculation unit 403 calculates the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server.
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
The control quantity calculation unit 403 sends the values of yxh and mdxh to the task backup module.
The task backup module 5 performs the task backup: if at least one of yxh and mdxh is 0, it does nothing; if neither yxh nor mdxh is 0, it backs up the tasks currently executing on server yxh to server mdxh.
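The quantities handled by the control quantity calculation unit can be sketched in Python as follows. Only the definitions of dgzs and xgzs come from the text; everything else is an assumption made for illustration, because the formulas for $gzs_j$, yxh and mdxh are not reproduced here. The sketch assumes that $gzs_j$ counts how often server j appears among $fn_1, \ldots, fn_k$, that yxh is the server with the most recent failures and mdxh the server with the fewest, that ties go to the lowest server number, and that both are 0 when all servers have the same count. The function name `choose_backup_pair` is hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def choose_backup_pair(fn: Sequence[int], m: int) -> Tuple[int, int]:
    """Return (yxh, mdxh) for servers numbered 1..m, or (0, 0) for no backup."""
    if m <= 0:
        raise ValueError("m must be a positive number of servers")
    if any(j < 1 or j > m for j in fn):
        raise ValueError("server numbers must lie in 1..m")
    counts = Counter(fn)
    # gzs[j - 1] holds gzs_j, the assumed recent failure count of server j.
    gzs = [counts.get(j, 0) for j in range(1, m + 1)]
    dgzs = max(gzs)  # recent maximum failures of a single server
    xgzs = min(gzs)  # recent minimum failures of a single server
    if dgzs == xgzs:
        # Assumption: no server stands out, so signal "do nothing".
        return 0, 0
    yxh = gzs.index(dgzs) + 1   # assumed source: most failures
    mdxh = gzs.index(xgzs) + 1  # assumed destination: fewest failures
    return yxh, mdxh


# Server 3 failed three times and server 4 never failed.
print(choose_backup_pair([1, 3, 3, 2, 3], m=4))
```

The returned pair plugs directly into the rule applied by the task backup module: a 0 in either position means no backup is performed in that round.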
The risk analysis unit 401 determines abnormal points as follows:
Calculate the average positive and negative fluctuation intensities, bp and bn, of the inter-failure time sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point.
xs is a preset coefficient and is a positive integer. In this embodiment, xs is 10.
As shown in Fig. 3, a cloud data center task backup system comprises a cloud data center server 1 in which a cloud data center backup device 2 is provided; the cloud data center backup device 2 comprises a failure monitoring unit 3, a control decision module 4 and a task backup module 5; the output of the failure monitoring unit 3 is connected to the input of the control decision module 4, and the output of the control decision module 4 is connected to the input of the task backup module 5.
The failure monitoring unit 3 is used to obtain the historical failure occurrence information of each server in the cloud data center.
The control decision module 4 is used to analyze the risk that each server in the data center will fail in the future, calculate the waiting time before the next task backup, and calculate control decision reference values.
The task backup module 5 is used to perform task backup between servers.
The control decision module 4 comprises a risk analysis unit 401, a control timing decision unit 402 and a control quantity calculation unit 403;
A first output of the failure monitoring unit 3 is connected to the input of the risk analysis unit 401; a second output of the failure monitoring unit 3 is connected to a first input of the control timing decision unit 402; a third output of the failure monitoring unit 3 is connected to the input of the control quantity calculation unit 403; the output of the risk analysis unit 401 is connected to a second input of the control timing decision unit 402; the output of the control timing decision unit 402 is connected to a first input of the task backup module 5, and the output of the control quantity calculation unit 403 is connected to a second input of the task backup module 5.
The failure monitoring unit 3 obtains the historical failure occurrence information of each server in the cloud data center, comprising the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server failures occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which those k failures occurred, and sends $ft_1, ft_2, \ldots, ft_k$ and $fn_1, fn_2, \ldots, fn_k$ to the risk analysis unit, the control timing decision unit and the control quantity calculation unit; k is a positive integer. In this embodiment, k is an arbitrary natural number between 10 and 100.
The risk analysis unit 401 calculates the inter-failure times $if_i$ and the per-unit-time multi-step inter-failure time growth rate zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{ z_{i,j} \mid i < j < k \}$;
where mean{ } takes the average of a set;
The risk analysis unit 401 then sends the value of zzl to the control timing decision unit.
The control timing decision unit 402 calculates the waiting time dt before the next task backup:
where csz is the system's default backup interval; in this embodiment, csz takes any value between 0.1 and 1 second. m is the number of servers in the data center. The control timing decision unit 402 sends the value of dt to the task backup module.
The control quantity calculation unit 403 calculates the recent maximum number of failures of a single server dgzs, the recent minimum number of failures of a single server xgzs, the number yxh of the backup source server and the number mdxh of the backup destination server.
$dgzs = \max\{ gzs_j \mid 0 < j \le m \}$;
$xgzs = \min\{ gzs_j \mid 0 < j \le m \}$;
[formula omitted in source]
[formula omitted in source]
where $gzs_j$ denotes the number of failures that occurred recently on the j-th server, calculated as:
[formula omitted in source]
The control quantity calculation unit 403 sends the values of yxh and mdxh to the task backup module.
The task backup module 5 performs the task backup: if at least one of yxh and mdxh is 0, it does nothing; if neither yxh nor mdxh is 0, it backs up the tasks currently executing on server yxh to server mdxh.
The risk analysis unit 401 determines abnormal points as follows:
Calculate the average positive and negative fluctuation intensities, bp and bn, of the inter-failure time sequence:
where max{ } takes the maximum of a set and min{ } takes the minimum of a set.
When ([condition omitted] and [condition omitted]) or ([condition omitted] and [condition omitted]), the inter-failure time value $if_i$ is an abnormal point.
xs is a preset coefficient and is a positive integer. In this embodiment, xs is 10.
The device for the cloud data center task backup that the embodiment of the present invention was provided analyzed based on reliability trends, can be with Be deployed in an existing server, can also dispose with one be separately provided be exclusively used in analyzing based on reliability trends In the server of cloud data center task backup.For this purpose, the present invention provides a kind of server, including the embodiment of the present invention is carried The cloud data center task back-up device based on reliability trends analysis supplied.One of ordinary skill in the art will appreciate that realizing The process of the cloud data center task backup based on reliability trends analysis, can pass through program instruction in above-described embodiment method Relevant hardware is completed, which executes the correspondence step in the above method when being executed.
The preferred embodiments of the present invention have been described in detail above. It should be understood that a person skilled in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation in accordance with the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims (3)

1. A method of cloud data center task backup, characterized by comprising the following steps:
Step 1: obtain information on the historical fault occurrence times of each server in the cloud data center, including the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server faults occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which these k faults occurred; k is a positive integer;
Step 2: calculate the time between failures $if_i$ and the multi-step growth rate of the time between failures per unit time, zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{Z_{i,j} \mid i < j < k\}$;
Step 3: calculate the waiting time dt before the next task backup:
Wherein csz is the system's default backup interval, and m is the number of servers in the data center;
Step 4: calculate the recent maximum number of faults on a single server, dgzs; the recent minimum number of faults on a single server, xgzs; the serial number yxh of the backup task source server; and the serial number mdxh of the backup task destination server;
$dgzs = \max\{gzs_j \mid 0 < j \le m\}$;
$xgzs = \min\{gzs_j \mid 0 < j \le m\}$;
It is described
It is described
Here $gzs_j$ denotes the number of faults that occurred recently on the j-th server, calculated as:
It is described
Step 5: perform the task backup: if at least one of yxh and mdxh is 0, do nothing; if neither yxh nor mdxh is 0, back up the tasks currently executing on server No. yxh to server No. mdxh; then wait for the time dt and return to Step 1;
In Step 2, improper points are determined according to the following steps:
Calculate the average positive and negative fluctuation strengths, bp and bn, of the time-between-failures sequence:
When (… and …) or (… and …), the time-between-failures value $if_i$ is an improper point;
Here xs is a preset coefficient, and xs is a positive integer.
2. A cloud data center task backup device, characterized in that it comprises a malfunction monitoring unit (3), a control decision module (4), and a task backup module (5); the output of the malfunction monitoring unit (3) is connected to the input of the control decision module (4), and the output of the control decision module (4) is connected to the input of the task backup module (5);
The malfunction monitoring unit (3) is configured to obtain information on the historical fault occurrence times of each server in the cloud data center;
The control decision module (4) is configured to analyze the risk of future failure of each server in the data center, calculate the waiting time before the next task backup, and calculate control decision reference values;
The task backup module (5) is configured to perform task backup between servers;
The control decision module (4) comprises a risk analysis unit (401), a control timing decision unit (402), and a control quantity calculation unit (403);
The first output of the malfunction monitoring unit (3) is connected to the input of the risk analysis unit (401); the second output of the malfunction monitoring unit (3) is connected to the first input of the control timing decision unit (402); the third output of the malfunction monitoring unit (3) is connected to the input of the control quantity calculation unit (403); the output of the risk analysis unit (401) is connected to the second input of the control timing decision unit (402); the output of the control timing decision unit (402) is connected to the first input of the task backup module (5), and the output of the control quantity calculation unit (403) is connected to the second input of the task backup module (5);
The malfunction monitoring unit (3) obtains information on the historical fault occurrence times of each server in the cloud data center, including the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server faults occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which these k faults occurred; k is a positive integer;
The risk analysis unit (401) calculates the time between failures $if_i$ and the multi-step growth rate of the time between failures per unit time, zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{Z_{i,j} \mid i < j < k\}$;
The control timing decision unit (402) calculates the waiting time dt before the next task backup:
Wherein csz is the system's default backup interval, and m is the number of servers in the data center;
The control quantity calculation unit (403) calculates the recent maximum number of faults on a single server, dgzs; the recent minimum number of faults on a single server, xgzs; the serial number yxh of the backup task source server; and the serial number mdxh of the backup task destination server;
$dgzs = \max\{gzs_j \mid 0 < j \le m\}$;
$xgzs = \min\{gzs_j \mid 0 < j \le m\}$;
It is described
It is described
Here $gzs_j$ denotes the number of faults that occurred recently on the j-th server, calculated as:
It is described
The task backup module (5) performs the task backup: if at least one of yxh and mdxh is 0, no operation is performed; if neither yxh nor mdxh is 0, the tasks currently executing on server No. yxh are backed up to server No. mdxh;
The risk analysis unit (401) determines improper points by the following method:
Calculate the average positive and negative fluctuation strengths, bp and bn, of the time-between-failures sequence:
When (… and …) or (… and …), the time-between-failures value $if_i$ is an improper point;
Here xs is a preset coefficient, and xs is a positive integer.
3. A cloud data center task backup system, comprising a cloud data center server (1), characterized in that a cloud data center backup device (2) is provided in the cloud data center server (1); the cloud data center backup device (2) comprises a malfunction monitoring unit (3), a control decision module (4), and a task backup module (5); the output of the malfunction monitoring unit (3) is connected to the input of the control decision module (4), and the output of the control decision module (4) is connected to the input of the task backup module (5);
The malfunction monitoring unit (3) is configured to obtain information on the historical fault occurrence times of each server in the cloud data center;
The control decision module (4) is configured to analyze the risk of future failure of each server in the data center, calculate the waiting time before the next task backup, and calculate control decision reference values;
The task backup module (5) is configured to perform task backup between servers;
The control decision module (4) comprises a risk analysis unit (401), a control timing decision unit (402), and a control quantity calculation unit (403);
The first output of the malfunction monitoring unit (3) is connected to the input of the risk analysis unit (401); the second output of the malfunction monitoring unit (3) is connected to the first input of the control timing decision unit (402); the third output of the malfunction monitoring unit (3) is connected to the input of the control quantity calculation unit (403); the output of the risk analysis unit (401) is connected to the second input of the control timing decision unit (402); the output of the control timing decision unit (402) is connected to the first input of the task backup module (5), and the output of the control quantity calculation unit (403) is connected to the second input of the task backup module (5);
The malfunction monitoring unit (3) obtains information on the historical fault occurrence times of each server in the cloud data center, including the times $ft_1, ft_2, \ldots, ft_k$ at which the most recent k server faults occurred and the numbers $fn_1, fn_2, \ldots, fn_k$ of the servers on which these k faults occurred; k is a positive integer;
The risk analysis unit (401) calculates the time between failures $if_i$ and the multi-step growth rate of the time between failures per unit time, zzl:
$if_i = ft_{i+1} - ft_i$, $0 < i \le k-1$;
$zzl = \mathrm{mean}\{Z_{i,j} \mid i < j < k\}$;
The control timing decision unit (402) calculates the waiting time dt before the next task backup:
Wherein csz is the system's default backup interval, and m is the number of servers in the data center;
The control quantity calculation unit (403) calculates the recent maximum number of faults on a single server, dgzs; the recent minimum number of faults on a single server, xgzs; the serial number yxh of the backup task source server; and the serial number mdxh of the backup task destination server;
$dgzs = \max\{gzs_j \mid 0 < j \le m\}$;
$xgzs = \min\{gzs_j \mid 0 < j \le m\}$;
It is described
It is described
Here $gzs_j$ denotes the number of faults that occurred recently on the j-th server, calculated as:
It is described
The task backup module (5) performs the task backup: if at least one of yxh and mdxh is 0, no operation is performed; if neither yxh nor mdxh is 0, the tasks currently executing on server No. yxh are backed up to server No. mdxh;
The risk analysis unit (401) determines improper points by the following method:
Calculate the average positive and negative fluctuation strengths, bp and bn, of the time-between-failures sequence:
When (… and …) or (… and …), the time-between-failures value $if_i$ is an improper point;
Here xs is a preset coefficient, and xs is a positive integer.
CN201510147743.6A 2015-03-31 2015-03-31 A kind of methods, devices and systems of cloud data center task backup Active CN104767806B (en)

Publications (2)

Publication Number Publication Date
CN104767806A CN104767806A (en) 2015-07-08
CN104767806B CN104767806B (en) 2018-09-25

Family

ID=53649405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510147743.6A Active CN104767806B (en) 2015-03-31 2015-03-31 A kind of methods, devices and systems of cloud data center task backup

Country Status (1)

Country Link
CN (1) CN104767806B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190702

Address after: 610041 6/F 613, No. 722, No. 6, Middle Section of Yizhou Avenue, Chengdu High-tech Zone, Chengdu City, Sichuan Province

Patentee after: Chengdu Vermont Sichen Technology Co., Ltd.

Address before: 400045 Sha Zheng street, Shapingba District, Chongqing City, No. 174

Patentee before: Chongqing University

TR01 Transfer of patent right

Effective date of registration: 20211115

Address after: 400030 No. 174, Shapingba Main Street, Shapingba District, Chongqing

Patentee after: Xia Yunni

Address before: 610041 No. 613, 6 / F, building 4, No. 722, middle section of Yizhou Avenue, Chengdu hi tech Zone, Chengdu, Sichuan

Patentee before: Chengdu fumengsichen Technology Co., Ltd

TR01 Transfer of patent right

Effective date of registration: 20220419

Address after: 324003 room 304-7, building 10, No. 258, Huayuan East Avenue, Baiyun Street, Kecheng District, Quzhou City, Zhejiang Province

Patentee after: Quzhou Haiyi Technology Co.,Ltd.

Address before: 400030 No. 174 Shapingba street, Shapingba District, Chongqing

Patentee before: Xia Yunni