CN105607967A - Data center-oriented energy consumption perception-based data backup method - Google Patents
- Publication number
- CN105607967A (application CN201510846316.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- backup
- energy consumption
- standby
- duplication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to an energy-aware data backup method for data centers and belongs to the technical field of energy-saving backup for big data. The method comprises: creating a backup task, starting the backup, and passing all data through a deduplication process; recording the current deduplication parameters for each deduplicated backup data set; after each deduplication pass, calculating the values of P_reduced and P_incremental, where P_reduced is the energy consumption saved by using deduplication in the backup process and P_incremental is the energy consumption added by using deduplication in the backup process; recording these values in the data set metadata and comparing them; if P_reduced < P_incremental occurs three consecutive times for a data set, no longer deduplicating that data set, which is then backed up according to the normal process; otherwise, as long as P_reduced < P_incremental has not occurred three consecutive times for the data set, continuing deduplication and detection; and, whether or not deduplication is performed, executing the backup task as usual until it ends naturally or is terminated. The method thereby achieves energy consumption optimization.
Description
Technical field
The present invention relates to a backup method oriented toward energy consumption optimization, and belongs to the field of energy-saving backup technology for big data.
Background technology
In data center backup methods, the efficiency of deduplication is determined mainly by the ratio of the space a data set occupies before deduplication to the space it occupies afterwards; this ratio is commonly called the compression ratio or deduplication ratio. The performance of deduplication is evaluated mainly by the overall read/write throughput of the data set during backup or storage.
Existing backup methods include a backup method oriented toward cloud backup services. That method has a dedicated module (middleware) that checks for file modifications and handles them accordingly, and that can be embedded in an existing backup system. In addition, an intelligent deduplication system has been designed around the different categories of files in a data set. The system provides data partitioning methods of different granularity, including single-instance storage (SIS), fixed-size partitioning (FSP), content-defined chunking (CDC) and sliding-block partitioning. It selects a partitioning method according to file type (text documents, audio files, image files, Linux source code files and so on) to obtain the best deduplication ratio.
There is also a throughput-oriented backup policy. This policy includes the deduplication ratio and the backup time among its metrics, but these metrics do not directly reflect the energy consumed by a backup task.
Although a backup data set with a high deduplication ratio, low time overhead and a high DET value will also have higher energy efficiency during backup, the backup methods above cannot accurately distinguish how these backup tasks perform in terms of energy consumption. A backup policy based on energy-consumption awareness, formulated from the energy perspective, is therefore needed.
Summary of the invention
The technical problem solved by the invention: to overcome the deficiencies of the prior art by providing an energy-aware data backup method for data centers, so as to optimize energy consumption.
The technical solution of the invention: an energy-aware data backup method for data centers, characterized by comprising the following steps:
(1) once a backup task is created, the backup starts, and every backup task must pass through a deduplication process, yielding a deduplicated backup data set;
(2) for each deduplicated backup data set, the parameters of this deduplication pass are recorded, the parameters comprising size, deduplication ratio and backup time; size is divided into the original data size (the logical size) and the size of the data stored on the medium after deduplication (the physical size); the deduplication ratio is the most intuitive measure of deduplication effectiveness, namely the ratio of the logical size of the backup data set to its physical size; these three parameters are intermediate variables for the calculation in step (3);
(3) after each deduplication pass, the values of P_reduced and P_incremental are calculated and recorded in the metadata of the backup data set, and the two values are compared, where P_reduced is the energy consumption saved by using deduplication in the backup process and P_incremental is the energy consumption added by using deduplication in the backup process;
(4) if P_reduced < P_incremental occurs 3 consecutive times for a data set, that backup data set is no longer deduplicated and is backed up according to the normal process; otherwise, as long as P_reduced < P_incremental has not occurred 3 consecutive times for the data set, deduplication and detection continue;
(5) whether or not deduplication is performed, the backup task is executed as usual until it ends naturally or is terminated.
The number of deduplication passes in step (1) is less than 3.
P_incremental in step (2) is calculated as follows:
$$P_{incremental} = \sum_{nodes} \left( P_{dedup} - P_{standby} \right) \cdot T_{dedup} \qquad (1.1)$$
where T_dedup is the time that deduplication takes on a given server, and (P_dedup - P_standby) is the increase in server power during that time due to running the deduplication process; P_dedup is the power of the storage device after the deduplication process is carried out, and P_standby is the power of the storage device in the idle state.
Alternatively, (P_dedup - P_standby) is regarded as the average increase in server power and is calculated from the energy consumption model
$$P_{active} = P_{standby} + P_{seq} + P_{rand} \qquad (1.2)$$
together with the resource utilization of the deduplication process, where P_seq is the energy consumption produced by sequential I/O and P_rand is the energy consumption produced by random I/O; both are load-dependent energy consumption caused by I/O overhead.
P_reduced in step (1) is calculated as follows:
(P_MaxSeqIO - P_standby) is the difference between storage device power in the transfer state and in the idle state, where P_MaxSeqIO is the storage device power in the transfer state and P_standby is the storage device power in the idle state; T_iosave is the transfer time saved by the reduced data volume, measured during actual experiments; N_active is the number of storage devices involved in the transfer; (P_standby - P_idle) is the difference between storage device power in standby and in a low-power or powered-off state, where P_idle is the storage device power in the low-power or powered-off state; T_backup is the execution period of the backup task, i.e. the time elapsed since the backup task last ran, obtained from the backup task's configuration; the term formed from S_dup and S_energy_unit represents the number of devices that can be idled, where S_dup is the capacity of the devices that can be idled and S_energy_unit equals the capacity of the smallest storage unit the storage system can operate on.
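The equation for P_reduced is not reproduced in this text. The following is a reconstruction from the term definitions above, offered as an assumption rather than as the patent's verbatim formula. It adds a transfer-saving term and an idle-device term; the additive form and the floor (on the reading that only whole storage units can be idled) are both assumptions:

$$P_{reduced} = \left( P_{MaxSeqIO} - P_{standby} \right) \cdot T_{iosave} \cdot N_{active} + \left( P_{standby} - P_{idle} \right) \cdot T_{backup} \cdot \left\lfloor \frac{S_{dup}}{S_{energy\_unit}} \right\rfloor$$

Under this reading, the first term is the energy saved because fewer bytes are transferred, and the second is the energy saved by keeping freed storage units in a low-power or powered-off state for one backup period.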
The present invention's advantage is compared with prior art:
(1) Using throughput alone as the screening metric cannot accurately distinguish how backup tasks perform in terms of energy consumption. By formulating a policy from the energy perspective, the energy-aware backup policy of the invention achieves energy consumption optimization. Analyzing the effects of both sides together and finding the balance point between the energy increase and the energy reduction is the key to formulating an energy-aware backup policy. Based on the calculated change in energy consumption, and following the model of the throughput-oriented backup policy, the data sets of backup tasks are screened. The backup system then deduplicates only the data sets that pass the screening, so as to optimize energy consumption.
(2) Fig. 2 shows the average power consumption of the backup system under different policies. Because the energy consumed by the deduplication process is avoided while idle hard disks are used to save energy, the experimental group applying the energy-optimization-oriented backup method of the invention has the lowest power consumption. Compared with the throughput-oriented backup policy, energy consumption is reduced by about 8%; the energy-optimization-oriented policy, which can power down more hard disks, reduces power by 300 W. The larger the data set and the larger the experimental setup, the more pronounced the energy saving.
Brief description of the drawings
Fig. 1 is a flowchart of the implementation of the method of the invention;
Fig. 2 shows the average power consumption of the backup system for different data sets under different policies.
Detailed description of the invention
Fig. 1 shows the flowchart of the backup decision algorithm oriented toward energy consumption optimization.
Because deduplication has a two-sided effect on the energy consumption of the backup process, analyzing both effects together and finding the balance point between the energy increase and the energy reduction is the key to formulating an energy-aware backup policy. The energy balance point refers to the state of the backup data set's parameters (size, deduplication ratio and so on) at which the additional energy consumed on the backup server side by deduplication equals the energy saved on the storage side through reduced space requirements. Finding the energy balance point of a data set requires calculating, for that data set after deduplication, the energy added on the server side and the energy saved on the storage system side.
The energy added on the server side comes mainly from the extra time overhead and the extra computing resource overhead of the deduplication process. Compared with an ordinary backup process, deduplication occupies more computing resources for a longer time. From the server energy consumption model, the formula for this added energy consumption is:
$$P_{incremental} = \sum_{nodes} \left( P_{dedup} - P_{standby} \right) \cdot T_{dedup} \qquad (1.4)$$
P_incremental denotes the energy consumption added by using deduplication in the backup process. A backup task may need several servers to process it, so the value must be calculated per server and summed as appropriate. For each server, the time deduplication takes on it is T_dedup. This time can be calculated by formula from the data set's deduplication ratio and the server's throughput; the deduplication ratio is the ratio of the logical size of the backup data set to its physical size.
(P_dedup - P_standby) is the increase in server power during this time due to running the deduplication process. Experimental results show that, when the proposed deduplication framework is used and the server has no other compute-intensive tasks, its resource usage is stable. For ease of calculation, (P_dedup - P_standby) is therefore regarded as the average increase in server power and is calculated from energy consumption model (1.2) and the resource utilization of the deduplication process.
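As an illustration of the per-server summation described above, the following minimal Python sketch computes P_incremental from average power figures and deduplication times. The `NodeDedupSample` structure, its field names and the sample numbers are hypothetical and do not come from the patent; powers are assumed to be in watts and times in seconds, so the result is in joules.

```python
from dataclasses import dataclass


@dataclass
class NodeDedupSample:
    """Measurements for one server during one deduplication pass (hypothetical structure)."""
    p_dedup_w: float    # average power while deduplicating, in watts
    p_standby_w: float  # power without the deduplication load, in watts
    t_dedup_s: float    # time deduplication takes on this server, in seconds


def incremental_energy(nodes):
    """Sum (P_dedup - P_standby) * T_dedup over all servers of a backup task."""
    return sum((n.p_dedup_w - n.p_standby_w) * n.t_dedup_s for n in nodes)


# Illustrative numbers only: two servers handling the same backup task.
samples = [
    NodeDedupSample(p_dedup_w=180.0, p_standby_w=120.0, t_dedup_s=600.0),
    NodeDedupSample(p_dedup_w=200.0, p_standby_w=130.0, t_dedup_s=300.0),
]
print(incremental_energy(samples))
```

Treating the power difference as a single average per server mirrors the simplification in the text, which relies on resource usage being stable during deduplication.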
The energy saved by the storage system can be divided into two parts: one part is the energy saved through reduced transfer overhead because deduplication shrinks the data set; the other is the energy saved because the smaller footprint allows more storage devices to run in a low-power or powered-off state. From the server energy consumption model, the formula for this saved energy consumption is obtained:
P_reduced denotes the energy consumption saved by using deduplication in the backup process. The reduction in transfer overhead is in fact due to the shorter transfer working time of the hard disks involved. (P_MaxSeqIO - P_standby) is the difference between storage device power in the transfer state and in the idle state. Which storage devices are concerned depends on the specific storage system and the data set size. In general, the storage devices whose power varies significantly with load are mainly hard disks, and the value can be calculated from the hard disk energy consumption model, formula (1.2).
To simplify the scenario, the hard disks are assumed to be under maximum sequential I/O load during the transfer considered here. T_iosave is the transfer time saved by the reduced data volume and can be calculated from the data set size, the deduplication rate and the storage system bandwidth. N_active is the number of storage devices involved in the transfer, here the number of hard disks involved. (P_standby - P_idle) is the difference between storage device power in standby and in a low-power or powered-off state; which state is used in the calculation depends on which state transitions of these devices the storage system can control. T_backup is the execution period of the backup task, i.e. the time elapsed since the backup task last ran, and can be obtained from the backup task's settings. The term formed from S_dup and S_energy_unit represents the number of devices that can be idled, where S_energy_unit equals the capacity of the smallest storage unit the storage system can operate on. Depending on the configuration and capabilities of the storage system, the smallest unit of equipment it can control may be a single hard disk, a RAID group formed from several hard disks, or a storage subsystem including its control server.
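The two-part saving described above can be sketched in Python as follows. The formula for P_reduced is not reproduced in this text, so the way the terms are combined here (a transfer term plus an idle-device term, with the number of idled units taken as the whole-number quotient of S_dup and S_energy_unit) is an assumption inferred from the term definitions, not the patent's confirmed equation. Function and parameter names are hypothetical.

```python
def reduced_energy(p_max_seq_io_w, p_standby_w, p_idle_w,
                   t_iosave_s, n_active, t_backup_s,
                   s_dup_bytes, s_energy_unit_bytes):
    """Estimate the energy saved by deduplication, in joules (assumed formula)."""
    # Part 1: disks spend less time transferring because less data is written.
    transfer_saving = (p_max_seq_io_w - p_standby_w) * t_iosave_s * n_active

    # Part 2: freed capacity lets whole storage units stay in a low-power
    # or powered-off state for one backup period. Rounding down to whole
    # units is an assumption.
    idle_units = s_dup_bytes // s_energy_unit_bytes
    idle_saving = (p_standby_w - p_idle_w) * t_backup_s * idle_units

    return transfer_saving + idle_saving


# Illustrative numbers only: 4 disks, 100 s of transfer saved, a daily backup
# period, 5 TB freed with 2 TB as the smallest controllable unit.
TB = 10**12
saving = reduced_energy(p_max_seq_io_w=10.0, p_standby_w=6.0, p_idle_w=1.0,
                        t_iosave_s=100.0, n_active=4, t_backup_s=86400.0,
                        s_dup_bytes=5 * TB, s_energy_unit_bytes=2 * TB)
print(saving)
```

With these sample values the idle-device term dominates, which matches the text's point that powering down more hard disks is where most of the saving comes from.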
Formula (1.6) above gives the change in energy consumption caused by introducing deduplication into the backup process. From the energy perspective, this change can be used to predict the effect that deduplicating a backup task's data set during an actual backup will have on the whole backup system, including the storage system and the backup servers. Based on the calculated change in energy consumption, and following the model of the throughput-oriented backup policy, the data sets of backup tasks are screened. The backup system then deduplicates only the data sets that pass the screening, so as to optimize energy consumption. The earlier backup round threshold N is retained: in the backup policy proposed in the invention, the number of backup rounds N used to pre-measure the deduplication ratio is set to 3. This is because energy prediction likewise requires predicting the data set's deduplication ratio, which needs a certain number of deduplication passes to gather statistics on the data set's characteristic parameters. A threshold P is set, and the decision is made by comparing P with the difference (P_reduced - P_incremental).
So that data sets that pass the screening have high energy efficiency when deduplicated, the threshold P is set to 0 in the concrete implementation. The purpose is to ensure that every data set that is deduplicated does in fact reduce the energy consumption of the backup system. By analogy with the throughput-oriented backup policy, the flow of the energy-optimization-oriented backup policy is defined as follows, as shown in Fig. 1:
(1) Once a backup task is created, the backup starts, and every task must pass through a deduplication process.
(2) For each deduplicated backup data set, the parameters of this deduplication pass are recorded, including size, deduplication ratio, backup time and so on. The deduplication ratio, i.e. the ratio of the logical size of the backup data set to its physical size, is calculated from the deduplication rate estimate above.
(3) After each deduplication pass, the values of P_reduced and P_incremental are calculated and recorded in the data set metadata, and the two values are compared.
(4) If P_reduced < P_incremental occurs 3 consecutive times for a data set, that data set is no longer deduplicated and is backed up according to the normal process. Otherwise, as long as P_reduced < P_incremental has not occurred 3 consecutive times for the data set, deduplication and detection continue.
(5) Whether or not deduplication is performed, the backup task is executed as usual until it ends naturally or is terminated.
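The decision flow in steps (1) to (5) can be sketched as a small per-data-set state tracker. This is a minimal illustration, not the patent's implementation: the class name, method names and the in-memory `history` list standing in for data set metadata are all hypothetical. The round threshold of 3 and the comparison P_reduced < P_incremental (equivalent to threshold P = 0 on the difference) follow the text.

```python
class EnergyAwareDedupPolicy:
    """Tracks one backup data set and decides whether to keep deduplicating it."""

    def __init__(self, max_consecutive_losses=3):
        self.max_consecutive_losses = max_consecutive_losses
        self.consecutive_losses = 0
        self.dedup_enabled = True   # step (1): every data set starts with deduplication
        self.history = []           # stands in for the data set metadata of step (3)

    def record_pass(self, p_reduced, p_incremental):
        """Record one deduplication pass; return True if the next backup should deduplicate."""
        if not self.dedup_enabled:
            return False
        self.history.append((p_reduced, p_incremental))
        if p_reduced < p_incremental:
            self.consecutive_losses += 1   # deduplication cost more than it saved
        else:
            self.consecutive_losses = 0    # the run of losses must be consecutive
        if self.consecutive_losses >= self.max_consecutive_losses:
            self.dedup_enabled = False     # step (4): fall back to the normal backup flow
        return self.dedup_enabled


policy = EnergyAwareDedupPolicy()
decisions = [policy.record_pass(r, i)
             for r, i in [(5, 10), (12, 10), (1, 2), (1, 2), (1, 2)]]
print(decisions)
```

Resetting the counter whenever a pass breaks even or saves energy keeps the rule strictly about consecutive losses, so a single favourable pass restarts the count.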
The embodiments above are provided only to describe the invention and are not intended to limit its scope. The scope of the invention is defined by the appended claims. All equivalent substitutions and modifications made without departing from the spirit and principle of the invention fall within the scope of the invention.
Claims (5)
1. An energy-aware data backup method for data centers, characterized by comprising the following steps:
(1) once a backup task is created, the backup starts, and every backup task must pass through a deduplication process, yielding a deduplicated backup data set;
(2) for each deduplicated backup data set, the parameters of this deduplication pass are recorded, the parameters comprising size, deduplication ratio and backup time; size is divided into the original data size (the logical size) and the size of the data stored on the medium after deduplication (the physical size); the deduplication ratio is the most intuitive measure of deduplication effectiveness, namely the ratio of the logical size of the backup data set to its physical size; these three parameters are intermediate variables for the calculation in step (3);
(3) after each deduplication pass, the values of P_reduced and P_incremental are calculated and recorded in the metadata of the backup data set, and the two values are compared, where P_reduced is the energy consumption saved by using deduplication in the backup process and P_incremental is the energy consumption added by using deduplication in the backup process;
(4) if P_reduced < P_incremental occurs 3 consecutive times for a data set, that backup data set is no longer deduplicated and is backed up according to the normal process; otherwise, as long as P_reduced < P_incremental has not occurred 3 consecutive times for the data set, deduplication and detection continue;
(5) whether or not deduplication is performed, the backup task is executed as usual until it ends naturally or is terminated.
2. The energy-aware data backup method for data centers according to claim 1, characterized in that the number of deduplication passes in step (1) is less than 3.
3. The energy-aware data backup method for data centers according to claim 1, characterized in that P_incremental in step (2) is calculated as follows:
$$P_{incremental} = \sum_{nodes} \left( P_{dedup} - P_{standby} \right) \cdot T_{dedup} \qquad (1.1)$$
where T_dedup is the time that deduplication takes on a given server, and (P_dedup - P_standby) is the increase in server power during that time due to running the deduplication process; P_dedup is the power of the storage device after the deduplication process is carried out, and P_standby is the power of the storage device in the idle state.
4. The energy-aware data backup method for data centers according to claim 3, characterized in that, alternatively, (P_dedup - P_standby) is regarded as the average increase in server power and is calculated from the energy consumption model:
$$P_{active} = P_{standby} + P_{seq} + P_{rand} \qquad (1.2)$$
together with the resource utilization of the deduplication process, where P_seq is the energy consumption produced by sequential I/O and P_rand is the energy consumption produced by random I/O; both are load-dependent energy consumption caused by I/O overhead.
5. The energy-aware data backup method for data centers according to claim 1, characterized in that P_reduced in step (1) is calculated as follows:
(P_MaxSeqIO - P_standby) is the difference between storage device power in the transfer state and in the idle state, where P_MaxSeqIO is the storage device power in the transfer state and P_standby is the storage device power in the idle state; T_iosave is the transfer time saved by the reduced data volume, measured during actual experiments; N_active is the number of storage devices involved in the transfer; (P_standby - P_idle) is the difference between storage device power in standby and in a low-power or powered-off state, where P_idle is the storage device power in the low-power or powered-off state; T_backup is the execution period of the backup task, i.e. the time elapsed since the backup task last ran, obtained from the backup task's configuration; the term formed from S_dup and S_energy_unit represents the number of devices that can be idled, where S_dup is the capacity of the devices that can be idled and S_energy_unit equals the capacity of the smallest storage unit the storage system can operate on.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510846316.7A CN105607967A (en) | 2015-11-27 | 2015-11-27 | Data center-oriented energy consumption perception-based data backup method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510846316.7A CN105607967A (en) | 2015-11-27 | 2015-11-27 | Data center-oriented energy consumption perception-based data backup method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105607967A true CN105607967A (en) | 2016-05-25 |
Family
ID=55987920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510846316.7A Pending CN105607967A (en) | 2015-11-27 | 2015-11-27 | Data center-oriented energy consumption perception-based data backup method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105607967A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134544A (en) * | 2018-02-08 | 2019-08-16 | 广东亿迅科技有限公司 | The method and its system of datamation backup |
CN111859703A (en) * | 2020-07-30 | 2020-10-30 | 暨南大学 | Data center energy-saving data copy placement method based on heat sensing |
WO2022143405A1 (en) * | 2020-12-30 | 2022-07-07 | 欧普照明股份有限公司 | Energy consumption data processing method, cloud server, and energy consumption data processing system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103634133A (en) * | 2012-08-25 | 2014-03-12 | 成都勤智数码科技股份有限公司 | Data backup method suitable for energy consumption control platform |
US20150127612A1 (en) * | 2013-10-30 | 2015-05-07 | Muralidhara R. Balcha | Method and apparatus of managing application workloads on backup and recovery system |
Non-Patent Citations (3)
Title |
---|
BO MAO 等: "SAR: SSD Assisted Restore Optimization for Deduplication-based Storage Systems in the Cloud", 《2012 IEEE SEVENTH INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE, AND STORAGE》 * |
YIZHOU YAN等: "Analysis of Energy Consumption of Deduplication in Storage Systems", 《2015 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY》 * |
阎芳 等: "重复数据删除系统元数据存储布局研究", 《北京理工大学学报》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111190688B (en) | Cloud data center-oriented Docker migration method and system | |
CN102111337B (en) | Method and system for task scheduling | |
CN103139302A (en) | Real-time copy scheduling method considering load balancing | |
CN110289994B (en) | Cluster capacity adjusting method and device | |
JP5699715B2 (en) | Data storage device and data storage method | |
CN103678579A (en) | Optimizing method for small-file storage efficiency | |
CN102982122A (en) | Repeating data deleting method suitable for mass storage system | |
CN103616944A (en) | Consumption reduction method in cloud storage system based on pre-judging green data classification strategy | |
CN105607967A (en) | Data center-oriented energy consumption perception-based data backup method | |
Qazi et al. | Workload prediction of virtual machines for harnessing data center resources | |
CN112954707B (en) | Energy saving method and device for base station, base station and computer readable storage medium | |
CN107977167A (en) | Optimization method is read in a kind of degeneration of distributed memory system based on correcting and eleting codes | |
CN105630810A (en) | Method for uploading mass small files in distributed storage system | |
CN111666266A (en) | Data migration method and related equipment | |
CN105247492A (en) | Detection of user behavior using time series modeling | |
CN105095495A (en) | Distributed file system cache management method and system | |
CN103095812B (en) | A kind of copy creating method based on user's request response time | |
CN112799597A (en) | Hierarchical storage fault-tolerant method for stream data processing | |
CN114238516A (en) | Data synchronization method, system and computer readable medium | |
CN107341091A (en) | Distributed memory system power consumption management method and device | |
CN113722072A (en) | Storage system file merging method and device based on intelligent distribution | |
CN201804331U (en) | Date deduplication system based on co-processor | |
CN101800771B (en) | Copy selection method based on kernel density estimation | |
KR102089450B1 (en) | Data migration apparatus, and control method thereof | |
CN116225696A (en) | Operator concurrency optimization method and device for stream processing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20160525 |