CN101989929A - Disaster recovery data backup method and system - Google Patents

Disaster recovery data backup method and system

Info

Publication number
CN101989929A
Authority
CN
China
Prior art keywords
data
data block
backup
current data
block
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010105481461A
Other languages
Chinese (zh)
Other versions
CN101989929B (en)
Inventor
赵巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Application filed by ZTE Corp
Priority to CN201010548146.1A
Publication of CN101989929A
Priority to PCT/CN2011/073780 (WO2012065408A1)
Application granted
Publication of CN101989929B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/085 Retrieval of network configuration; Tracking network configuration history
    • H04L41/0853 Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information
    • H04L41/0856 Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information by backing up or archiving configuration information

Abstract

The invention discloses a disaster recovery data backup method and system in the field of network management. The method comprises: receiving a data file to be backed up that is sent by a network management system; splitting the data file to be backed up into data blocks; using a weak-checksum hash algorithm and a strong-checksum hash algorithm to calculate the data fingerprint value of each data block, and searching the already backed-up data files for a target data block with the same data fingerprint value; if such a target data block exists, comparing the target data block with the current data block byte by byte; and backing up the current data block according to the comparison result. The system comprises a receiving module, a splitting module, a calculation and search module, a comparison module and a backup module. The technical solution improves the applicability of backup data files, reduces storage space usage and improves system performance.

Description

Disaster recovery data backup method and system
Technical field
The invention relates to network management systems, and in particular to a remote disaster recovery data backup method and system for a network management system based on data deduplication.
Background
A network management system manages the network elements of a communication network and holds the configuration data of every network element in the network. This configuration data is critical: without it, the network elements cannot run their services normally. For disaster recovery, the configuration data therefore needs to be backed up at a remote site. If the network management system is damaged by an earthquake, fire or similar event, the remotely backed-up configuration data can be restored so that network element services keep running normally. In general, the configuration data is required to be backed up remotely once a day.
One existing remote backup technique for disaster recovery data simply exports the configuration data to a file, names the file by date, and copies it to the remote backup system, which produces redundant data. Other network management systems do handle redundancy in the backup data, as follows:
The configuration data is exported to a text file that records the configuration data of each network element. When this text file is copied to the remote backup system, the backup system compares today's configuration data with the configuration data saved yesterday, extracts the configuration data of the network elements that have changed, and saves it in today's backup file; the configuration data of unchanged network elements is not saved.
This approach has obvious defects. It places strict requirements on the files backed up by the network management system: the network management system and the backup system must follow the same file format specification for backup and later recovery to work, so the approach cannot be adapted to all network management systems and has poor applicability. It also requires the backup file to be a text file that is not compressed, which occupies a large amount of storage space; transmitting uncompressed text also consumes network bandwidth and system resources, which degrades system performance.
Summary of the invention
To improve the applicability of backup data files, reduce storage space usage and improve system performance, the invention provides a disaster recovery data backup method and system. The technical solution is as follows:
A disaster recovery data backup method comprises:
receiving a data file to be backed up that is sent by a network management system;
splitting the data file to be backed up into data blocks;
using a weak-checksum hash algorithm and a strong-checksum hash algorithm to calculate the data fingerprint value of the current data block, and searching the already backed-up data files for a target data block with the same data fingerprint value;
if such a target data block exists, comparing the target data block with the current data block byte by byte; and
backing up the current data block according to the comparison result.
In a preferred embodiment of the invention, using the weak-checksum hash algorithm and the strong-checksum hash algorithm to calculate the data fingerprint value of the current data block and searching the backed-up data files for a target data block with the same data fingerprint value comprises:
first calculating a first data fingerprint value of the current data block with the weak-checksum hash algorithm and searching the backed-up data files for a target data block with the same first data fingerprint value; if one exists, calculating a second data fingerprint value of the current data block with the strong-checksum hash algorithm and searching the backed-up data files for a target data block with the same second data fingerprint value.
In a preferred embodiment of the invention, backing up the current data block according to the comparison result comprises:
when the comparison shows the blocks are identical, determining that the current data block is a duplicate data block and storing the logical index information of the current data block;
when the comparison shows the blocks differ, determining that the current data block is a new unique data block and storing the metadata of the current data block.
In a preferred embodiment of the invention, if no target data block with the same data fingerprint value is found in the backed-up data files, the metadata of the current data block is stored.
In a preferred embodiment of the invention, the data file to be backed up is split into data blocks according to the number of network elements.
In a preferred embodiment of the invention, the metadata of the current data block comprises: the current data block, the logical index information of the current data block, and the weak checksum and strong checksum of the current data block.
A disaster recovery data backup system comprises:
a receiving module, configured to receive a data file to be backed up that is sent by a network management system;
a splitting module, configured to split the data file to be backed up into data blocks;
a calculation and search module, configured to use a weak-checksum hash algorithm and a strong-checksum hash algorithm to calculate the data fingerprint value of the current data block and to search the already backed-up data files for a target data block with the same data fingerprint value;
a comparison module, configured to compare the target data block with the current data block byte by byte when a target data block with the same data fingerprint value is found in the backed-up data files;
a backup module, configured to back up the current data block according to the comparison result of the comparison module.
In a preferred embodiment of the invention, the calculation and search module is specifically configured to first calculate a first data fingerprint value of the current data block with the weak-checksum hash algorithm and search the backed-up data files for a target data block with the same first data fingerprint value; and, if one exists, to calculate a second data fingerprint value of the current data block with the strong-checksum hash algorithm and search the backed-up data files for a target data block with the same second data fingerprint value.
In a preferred embodiment of the invention, the comparison module is specifically configured to determine, when the comparison shows the blocks are identical, that the current data block is a duplicate data block and to store the logical index information of the current data block;
and, when the comparison shows the blocks differ, to determine that the current data block is a new unique data block and to store the metadata of the current data block.
In a preferred embodiment of the invention, the backup module is further configured to store the metadata of the current data block if no target data block with the same data fingerprint value is found in the backed-up data files.
In a preferred embodiment of the invention, the metadata of the current data block comprises: the current data block, the logical index information of the current data block, and the weak checksum and strong checksum of the current data block.
The invention splits the received data file to be backed up, calculates a data fingerprint value for each resulting data block using a weak-checksum hash algorithm and a strong-checksum hash algorithm, and performs a hash lookup keyed on this data fingerprint value. After a target data block with the same data fingerprint value is found, the target data block is compared with the current data block byte by byte, and the data block is backed up according to the comparison result. Data files in various formats can thus be backed up, which improves the applicability of the backup file; duplicate data can be deleted in real time, which effectively controls the rapid growth of backup data, increasing the effective storage space and improving storage efficiency; and the backup file can be compressed, which reduces network bandwidth usage and improves system performance.
Description of drawings
The accompanying drawings described here provide a further understanding of the invention and form part of it. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of the disaster recovery data backup method provided by the invention;
Fig. 2 is a detailed flow chart of the remote disaster recovery data backup method for a network management system provided by the invention;
Fig. 3 is a structural diagram of the disaster recovery data backup system provided by the invention.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution and the beneficial effects of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
As shown in Fig. 1, the invention provides a disaster recovery data backup method, comprising:
Step 101: receive the data file to be backed up that is sent by the network management system;
Step 102: split the data file to be backed up into data blocks;
Step 103: use a weak-checksum hash algorithm and a strong-checksum hash algorithm to calculate the data fingerprint value of the current data block, and search the already backed-up data files for a target data block with the same data fingerprint value;
Step 104: if such a target data block exists, compare the target data block with the current data block byte by byte;
Step 105: back up the current data block according to the comparison result.
In a preferred embodiment of the invention, if no target data block with the same data fingerprint value is found in the backed-up data files, the metadata of the current data block is stored.
In a preferred embodiment of the invention, backing up the current data block according to the comparison result comprises: when the comparison shows the blocks are identical, determining that the current data block is a duplicate data block and storing the logical index information of the current data block; when the comparison shows the blocks differ, determining that the current data block is a new unique data block and storing the metadata of the current data block.
In a preferred embodiment of the invention, using the weak-checksum hash algorithm and the strong-checksum hash algorithm to calculate the data fingerprint value of the current data block and searching the backed-up data files for a target data block with the same data fingerprint value comprises:
first calculating a first data fingerprint value of the current data block with the weak-checksum hash algorithm and searching the backed-up data files for a target data block with the same first data fingerprint value; if one exists, calculating a second data fingerprint value of the current data block with the strong-checksum hash algorithm and searching the backed-up data files for a target data block with the same second data fingerprint value.
In a preferred embodiment of the invention, the data file to be backed up is split into data blocks according to the number of network elements.
In a preferred embodiment of the invention, the metadata comprises the current data block, the logical index information of the current data block, and the weak checksum and strong checksum of the current data block.
The implementation of the invention is described in detail below with reference to the drawings.
As shown in Fig. 2, the remote disaster recovery data backup method for a network management system provided by the invention comprises:
Step 201: the network management system exports the network element configuration as a data file to be backed up and transfers this data file to the remote backup system.
The data file may be in any format, text or binary.
Step 202: the backup system receives the data file to be backed up and splits it into a group of data blocks according to the number of network elements.
Specifically, the data file to be backed up is split using a predefined data block size. The data block size can be determined from the number of network elements; the configuration data file for 10000 network elements can be 300 MB. If the blocks are too fine-grained, the system resource overhead is too high; if they are too coarse, deduplication is ineffective. A balance between the two is needed. Testing gave the following empirical values: for up to 1000 network elements the data block size can be 1 KB, for 1000 to 5000 network elements it can be 4 KB, and for 5000 to 10000 network elements it can be 8 KB.
After the data file to be backed up has been split, the backup system assigns unique logical index information to each data block; this logical index information can be a logical index number.
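Step 202 can be illustrated with a minimal Python sketch. The function names choose_block_size and split_into_blocks are hypothetical and do not come from the patent, and the handling of the exact boundaries (1000 and 5000 network elements) and of counts above 10000 is an assumption, since the text only gives ranges.

```python
KB = 1024


def choose_block_size(ne_count: int) -> int:
    """Return a block size in bytes from the empirical values in the text."""
    if ne_count <= 1000:
        return 1 * KB
    if ne_count <= 5000:
        return 4 * KB
    # The text gives 8 KB for 5000 to 10000 network elements; reusing it for
    # larger counts is an assumption of this sketch.
    return 8 * KB


def split_into_blocks(data: bytes, block_size: int) -> list[tuple[int, bytes]]:
    """Cut data into fixed-size blocks, each paired with a logical index number."""
    return [
        (index, data[offset:offset + block_size])
        for index, offset in enumerate(range(0, len(data), block_size))
    ]


# Usage: a 10000-byte file from a network with 3000 network elements.
blocks = split_into_blocks(b"x" * 10000, choose_block_size(3000))
print(len(blocks), len(blocks[-1][1]))
```

The last block may be shorter than the chosen block size; the sketch keeps it as is rather than padding it.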
Step 203: the backup system calculates the data fingerprint of the current data block and, using the data fingerprint value as the key, performs a hash lookup in the backed-up data files to obtain a target data block with the same data fingerprint.
Specifically, the data fingerprint is an essential feature of a data block. Ideally each unique data block has a unique data fingerprint value, which is obtained by applying a hash function to the block contents. Commonly used hash algorithms include FNV1, CRC, MD5, SHA1, SHA-256 and SHA-512. Different hash algorithms have different collision probabilities (every hash algorithm has a collision problem, that is, different data blocks may produce the same data fingerprint), produce data fingerprint values of different bit lengths, and require different amounts of computation. A hash algorithm with a lower collision probability and a longer data fingerprint value requires considerably more computation.
Calculating data fingerprints requires a trade-off between performance and data safety. The CRC hash algorithm is a weak-checksum hash algorithm and is fast to compute; in this embodiment the data fingerprint value calculated with the CRC algorithm is 32 bits. The MD5 hash algorithm is a strong-checksum hash algorithm with a very low collision probability; the data fingerprint value it calculates is 128 bits. Strong and weak hash algorithms are usually distinguished by a threshold of 128 bits: an algorithm with fewer than 128 bits is considered weak, and one with 128 bits or more is considered strong. The backup system uses the CRC hash algorithm and the MD5 hash algorithm to calculate data fingerprints for data blocks, as follows:
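As an illustration of the two fingerprints, the following sketch uses the CRC-32 and MD5 implementations in the Python standard library. The patent does not say which CRC variant is used, so treating it as the CRC-32 of zlib is an assumption; the function names are hypothetical.

```python
import hashlib
import zlib


def weak_fingerprint(block: bytes) -> int:
    """32-bit weak checksum of a block (CRC-32 as implemented by zlib)."""
    return zlib.crc32(block) & 0xFFFFFFFF


def strong_fingerprint(block: bytes) -> bytes:
    """128-bit strong checksum of a block (MD5 digest, 16 bytes)."""
    return hashlib.md5(block).digest()


block = b"network element configuration"
print(f"{weak_fingerprint(block):08x}", strong_fingerprint(block).hex())
```

The weak fingerprint fits in a machine word and is cheap to compute, which is why it is a natural first filter before the more expensive strong fingerprint.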
For each data block obtained in step 202, the CRC checksum is first calculated with the CRC hash algorithm, and a hash lookup keyed on this CRC checksum is performed in the backed-up data files to determine whether an entry with the same CRC checksum exists. If not, the data block is a new unique data block, and the data block, its logical index number, and its CRC and MD5 checksums are stored. If so, the MD5 checksum of the data block is calculated with the MD5 hash algorithm, and a hash lookup keyed on this MD5 checksum is performed in the backed-up data files to determine whether an entry with the same MD5 checksum exists. If so, a duplicate data block is presumed to exist and the method proceeds to step 204; if not, the data block is stored and its metadata is created.
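The weak-then-strong lookup described above might be sketched as follows. The index structures (a set of CRC values and a dictionary from MD5 digest to logical index numbers) and the name find_candidates are assumptions of this sketch, not the data layout of the patent.

```python
import hashlib
import zlib


def find_candidates(block: bytes,
                    crc_index: set[int],
                    md5_index: dict[bytes, list[int]]) -> list[int]:
    """Return logical index numbers of stored blocks that may equal block."""
    if zlib.crc32(block) not in crc_index:
        return []  # the cheap weak check already rules out a duplicate
    # The strong checksum is only computed once the weak one has matched.
    md5 = hashlib.md5(block).digest()
    return md5_index.get(md5, [])


# Usage with a single stored block whose logical index number is 0.
stored = b"cfg-A"
crc_index = {zlib.crc32(stored)}
md5_index = {hashlib.md5(stored).digest(): [0]}
print(find_candidates(b"cfg-A", crc_index, md5_index))
print(find_candidates(b"cfg-B", crc_index, md5_index))
```

An empty result means the block is new and unique; a non-empty result only yields candidates, which still have to pass the byte-by-byte comparison of step 204.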
In general, the MD5 hash algorithm does not produce collisions, and the MD5 checksum calculated for a data block (block) is unique; that is, one data block corresponds to one unique data fingerprint value, which can be represented by a 1:1 mapping. An entry of a traditional hash table is a two-tuple:
<md5_hashkey, block>, where md5_hashkey is the MD5 checksum of the data block.
In practice, however, two data blocks may have the same MD5 checksum; for example, the MD5 checksum calculated for data block 1 may equal the MD5 checksum calculated for data block 2. Several data blocks then correspond to one data fingerprint value, and a 1:n mapping is needed. In the invention, an entry of the hash table is a triple:
<md5_hashkey,block_nr,block_IDs>
where md5_hashkey is the MD5 checksum of the data block, block_nr is the number of data blocks with that MD5 checksum, and block_IDs holds the logical index numbers of those data blocks. In the algorithm design of the invention, block_nr and block_IDs are merged into a single linked list with the following structure:
block_nr|block_ID1|block_ID2|...|block_IDn
where block_ID1|block_ID2|...|block_IDn is the linked list of data block logical index numbers, referred to below as the block_ID list, and block_nr is the length of the list.
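The triple entry can be modelled roughly as below. This is a sketch under assumptions: a Python list stands in for the linked list, block_nr is derived from the list length instead of being stored, and the names HashEntry and add_block_id are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HashEntry:
    """One hash table entry: <md5_hashkey, block_nr, block_IDs>."""
    md5_hashkey: bytes
    block_ids: list[int] = field(default_factory=list)  # the block_ID list

    @property
    def block_nr(self) -> int:
        """Number of blocks sharing this MD5 checksum (length of the list)."""
        return len(self.block_ids)


hash_table: dict[bytes, HashEntry] = {}


def add_block_id(md5_hashkey: bytes, block_id: int) -> HashEntry:
    """Append a logical index number to the entry for md5_hashkey, creating it if needed."""
    entry = hash_table.setdefault(md5_hashkey, HashEntry(md5_hashkey))
    entry.block_ids.append(block_id)
    return entry
```

Deriving block_nr from the list keeps the two fields from drifting apart; an on-disk layout like the one in the text would store the count explicitly in front of the identifiers.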
The invention thus uses the chaining method to resolve hash collisions; the block_ID list of each hash table entry has variable length. The chaining method is used to find a target data block with the same MD5 checksum in the backed-up data files as follows:
(1) calculate the MD5 checksum hashkey of the current data block block, that is, hashkey=hash_md5(block);
(2) look up hashkey in the hash table of the backed-up data files: bindex=hash_value(hashkey, hash table), where bindex is the logical index number found for the current data block;
(3) if no matching entry is found in the hash table, that is, bindex==NULL, insert hashkey into the hash table directly, with block_nr=1 and block_ID1 set to the logical index number of the current data block block;
(4) if a matching entry is found in the hash table, the data block may be a duplicate data block, but it is also possible that both the CRC hash algorithm and the MD5 hash algorithm have collided and the data block is not a duplicate.
In this step, the first data fingerprint value of the current data block is calculated with the weak-checksum algorithm and looked up first, and then the second data fingerprint value is calculated with the strong-checksum algorithm and looked up. In practice the order can also be reversed: the second data fingerprint value is calculated with the strong-checksum algorithm and looked up first, and then the first data fingerprint value is calculated with the weak-checksum algorithm and looked up. The principle is similar and is not repeated here.
Step 204: the backup system compares the target data block obtained with the current data block at byte level. If they are identical, proceed to step 205; if they differ, proceed to step 206.
Step 205: determine that the current data block is a duplicate data block and store its logical index number.
Step 206: determine that the current data block is a new unique data block and store its metadata, which comprises the current data block, its logical index number, its CRC checksum and its MD5 checksum.
For steps 204 to 206, continuing from step 203: after a matching entry is found, the block_ID list of that entry is traversed, and the target data block corresponding to each logical index number in the block_ID list is compared byte by byte with the current data block. If an identical block is found, the current data block already exists, and its logical index number is stored. If no identical block is found, the data block is stored: the current data block is appended to the end of the block_ID list and written to file, block_nr is incremented by 1, and block_IDn is set to the logical index number of the current data block. Storage in the invention can use RAID 5.
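Steps 203 to 206 can be put together in a small in-memory sketch. The class DedupStore and its fields are hypothetical stand-ins for the backup system's storage, and the MD5 checksum is computed up front for simplicity because it has to be stored for new blocks in any case.

```python
import hashlib
import zlib


class DedupStore:
    """In-memory stand-in for the backup system's block store (illustrative only)."""

    def __init__(self) -> None:
        self.blocks: dict[int, bytes] = {}            # logical index -> block contents
        self.crc_seen: set[int] = set()               # weak fingerprints of stored blocks
        self.md5_table: dict[bytes, list[int]] = {}   # strong fingerprint -> block_ID list
        self.next_id = 0

    def add(self, block: bytes) -> tuple[int, bool]:
        """Store block if needed; return (logical index number, was_duplicate)."""
        crc = zlib.crc32(block)
        md5 = hashlib.md5(block).digest()
        if crc in self.crc_seen:
            # Traverse the block_ID list and compare contents byte by byte.
            for block_id in self.md5_table.get(md5, []):
                if self.blocks[block_id] == block:
                    return block_id, True             # step 205: duplicate block
        # Step 206: no fingerprint match, or every candidate differed.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = block
        self.crc_seen.add(crc)
        self.md5_table.setdefault(md5, []).append(block_id)
        return block_id, False


store = DedupStore()
print(store.add(b"abc"), store.add(b"abc"), store.add(b"abd"))
```

Because the final decision rests on the byte comparison, a fingerprint collision can only cost an extra comparison; it cannot cause two different blocks to be treated as one.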
At this point, a data file is represented in the backup system by a corresponding logical file, which consists of metadata made up of a group of data fingerprints.
After the network element configuration data file has been backed up, when recovery is needed, the backup system reads the file: it first reads the logical file, then retrieves the corresponding data blocks according to the data block fingerprints, rebuilds a physical copy of the file, and sends this copy to the network management system for recovery.
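Recovery can be sketched as a simple concatenation. The sketch assumes that the logical file can be reduced to an ordered list of logical index numbers and that blocks are kept in a dictionary; both the representation and the name restore_file are assumptions, since the text does not specify the layout of the logical file.

```python
def restore_file(logical_file: list[int], block_store: dict[int, bytes]) -> bytes:
    """Rebuild the physical file by concatenating the referenced blocks in order."""
    return b"".join(block_store[block_id] for block_id in logical_file)


# Usage: a file whose first and third blocks are the same stored block.
block_store = {0: b"ab", 1: b"cd"}
print(restore_file([0, 1, 0], block_store))
```

A duplicate block appears several times in the logical file but only once in the block store, which is where the storage saving comes from.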
As shown in Fig. 3, the invention provides a disaster recovery data backup system, comprising:
a receiving module 301, configured to receive the data file to be backed up that is sent by the network management system;
a splitting module 302, configured to split the data file to be backed up into data blocks;
a calculation and search module 303, configured to use a weak-checksum hash algorithm and a strong-checksum hash algorithm to calculate the data fingerprint value of the current data block and to search the already backed-up data files for a target data block with the same data fingerprint value;
a comparison module 304, configured to compare the target data block with the current data block byte by byte when a target data block with the same data fingerprint value is found in the backed-up data files;
a backup module 305, configured to back up the current data block according to the comparison result of the comparison module 304.
In a preferred embodiment of the invention, the backup module 305 is further configured to store the metadata of the current data block if no target data block with the same data fingerprint value is found in the backed-up data files.
In a preferred embodiment of the invention, the comparison module 304 is specifically configured to determine, when the comparison shows the blocks are identical, that the current data block is a duplicate data block and to store the logical index information of the current data block;
and, when the comparison shows the blocks differ, to determine that the current data block is a new unique data block and to store the metadata of the current data block.
In a preferred embodiment of the invention, the calculation and search module 303 is specifically configured to first calculate a first data fingerprint value of the current data block with the weak-checksum hash algorithm and search the backed-up data files for a target data block with the same first data fingerprint value; and, if one exists, to calculate a second data fingerprint value of the current data block with the strong-checksum hash algorithm and search the backed-up data files for a target data block with the same second data fingerprint value.
In a preferred embodiment of the invention, the splitting module 302 is specifically configured to split the data file to be backed up into data blocks according to the number of network elements.
Existing remote disaster recovery data backup methods work on text files and deduplicate by comparing text content. Existing disaster recovery backup data has a data repetition rate of 80%, but deleting redundancy by text comparison does not reach an 80% deduplication rate. Such methods also cannot handle, for example, binary files, which restricts the format of the backup file and results in poor applicability. The disaster recovery data backup method provided by the invention is applicable to backup files in various formats, such as text files and binary database files. Because redundant data is removed by comparing binary data blocks of a few KB, the deduplication rate is high and can approach 80%. Moreover, the more often data is backed up and the shorter the interval, the higher the deduplication ratio.
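The text does not define the deduplication rate. One plausible reading, used here purely as an assumption, is the fraction of received blocks that did not have to be stored again. The figures in the usage lines are hypothetical and are not measurements from the patent.

```python
def dedup_rate(total_blocks: int, unique_blocks: int) -> float:
    """Fraction of received blocks that did not need to be stored again."""
    if total_blocks == 0:
        return 0.0
    return 1 - unique_blocks / total_blocks


# Hypothetical figures: 5 daily backups of 1000 blocks each; the first backup
# stores all 1000 blocks and each later backup stores 50 changed blocks.
total = 5 * 1000
unique = 1000 + 4 * 50
print(f"{dedup_rate(total, unique):.2%}")
```

Under this definition the rate rises as more backups of mostly unchanged data accumulate, which matches the observation that more frequent backups give a higher deduplication ratio.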
Prior art is single for the dividing method or the partition strategy of data file, or is according to file content types, carries out the block boundary feature calculation in advance, and then piecemeal, but actual using on the network management system to configure data, the data de-duplication rate is not high.Disaster tolerance method of data backup provided by the invention is at this specific file format of the data file of network management configuration, with this occasion of configuration data periodic backups, according to NE quantity specified data block size, improved the data de-duplication rate of Backup Data effectively.
A kind of hash algorithm computation of available technology adopting data fingerprint, the mode that disaster tolerance method of data backup provided by the invention adopts weak check value hash algorithm and strong check value hash algorithm to combine is come the calculated data fingerprint, algorithm speed is fast, greatly reduce the probability that collision produces with less performance cost, improved systematic function.In addition, the mode that weak check value hash algorithm and strong check value hash algorithm combine, can carry out the deletion of repeating data in real time, standby system is after receiving the data file of network management system, can the online deletion of carrying out repeating data, change into local logical file and store, processed offline again when not needing that by the time follow-up system has the free time.The entire process cycle is short, can tackle promptly to carry out very short at interval remote backup operation.
Existing disaster tolerance data back up method has only adopted the littler hash algorithm of collision probability when the deletion repeating data, do not solve the collision problem of hash algorithm, so can not be used to the application scenario of network management system long-distance disaster data backup, in case bump the generation enormous economic loss.Disaster tolerance method of data backup provided by the invention is by the identical data block of all data fingerprint values of traversal, and carry out byte and fully relatively solve collision problem, make the Information Security of network management system improve greatly, the long-distance disaster data backup that can be applied to network management system to configure data is this on the very high occasion of the security requirement of data.
In summary, the disaster recovery data backup method provided by the invention is applicable to various data file formats and therefore has broad applicability. It effectively controls the rapid growth of backup data, which increases effective storage space, improves storage efficiency, and reduces total storage and management costs. It also saves network bandwidth for data transmission and reduces operation and maintenance costs such as floor space, power supply, and cooling.
The above description illustrates and describes a preferred embodiment of the invention. As stated earlier, however, it should be understood that the invention is not limited to the form disclosed herein, and that this description should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications, and environments, and can be altered within the scope of the inventive concept described herein through the above teachings or through the skill or knowledge of the relevant art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the scope of protection of the appended claims.

Claims (11)

1. A disaster recovery data backup method, characterized by comprising:
receiving a data file to be backed up that is sent by a network management system;
splitting the data file to be backed up into data blocks;
for a current data block, calculating its data fingerprint value using a weak-checksum hash algorithm and a strong-checksum hash algorithm, and searching a backup data file for a target data block that has the same data fingerprint value;
if such a target data block exists, comparing the target data block with the current data block byte by byte; and
backing up the current data block according to the comparison result.
2. the method for claim 1, it is characterized in that, utilize weak check value hash algorithm and strong check value hash algorithm, calculate its data fingerprint value at current data block, and searching the target data block whether the identical data fingerprint value is arranged in the backup data files, comprising:
Utilize weak check value hash algorithm to calculate its first data fingerprint value earlier at described current data block, and in described backup data files, search the described target data block that whether has with the identical described first data fingerprint value with the described first data fingerprint value, if have, then utilize strong check value hash algorithm to calculate its second data fingerprint value, and in described backup data files, search the described target data block of the identical described second data fingerprint value at described current data block.
3. the method for claim 1 is characterized in that, describedly carries out the backup of described current data block according to comparative result, comprising:
When comparative result is identical, determines that then described current data block is the repeating data piece, and store the logic index information of described current data block;
When comparative result not simultaneously, determine that then described current data block is new unique data piece, and store the metamessage of described current data block.
4. the method for claim 1 is characterized in that, if do not find the target data block of identical data fingerprint value in backup data files, then stores the metamessage of described current data block.
5. as any described method of claim 1 to 4, it is characterized in that, described data file to be backed up is cut apart the data block that obtains cutting apart according to NE quantity.
6. as claim 3 or 4 described methods, it is characterized in that the metamessage of described current data block comprises: the logic index information of current data block, current data block, the weak check value of current data block and strong check value.
7. A disaster recovery data backup system, characterized by comprising:
a receiving module, configured to receive a data file to be backed up that is sent by a network management system;
a splitting module, configured to split the data file to be backed up into data blocks;
a calculation and search module, configured to calculate, for a current data block, its data fingerprint value using a weak-checksum hash algorithm and a strong-checksum hash algorithm, and to search a backup data file for a target data block that has the same data fingerprint value;
a comparison module, configured to compare the target data block with the current data block byte by byte when a target data block with the same data fingerprint value is found in the backup data file; and
a backup module, configured to back up the current data block according to the comparison result of the comparison module.
8. The system as claimed in claim 7, characterized in that the calculation and search module is specifically configured to first calculate a first data fingerprint value of the current data block using the weak-checksum hash algorithm, and to search the backup data file for a target data block that has the same first data fingerprint value; and, if one exists, to calculate a second data fingerprint value of the current data block using the strong-checksum hash algorithm, and to search the backup data file for the target data block that has the same second data fingerprint value.
9. The system as claimed in claim 7, characterized in that the comparison module is specifically configured to: when the comparison result is identical, determine that the current data block is a duplicate data block, and store the logical index information of the current data block;
when the comparison result is different, determine that the current data block is a new unique data block, and store the meta information of the current data block.
10. The system as claimed in claim 7, characterized in that the backup module is further configured to store the meta information of the current data block if no target data block with the same data fingerprint value is found in the backup data file.
11. The system as claimed in claim 8 or 9, characterized in that the meta information of the current data block comprises: the logical index information of the current data block, the current data block itself, and the weak checksum and strong checksum of the current data block.
CN201010548146.1A 2010-11-17 2010-11-17 Disaster recovery data backup method and system Active CN101989929B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201010548146.1A CN101989929B (en) 2010-11-17 2010-11-17 Disaster recovery data backup method and system
PCT/CN2011/073780 WO2012065408A1 (en) 2010-11-17 2011-05-06 Disaster tolerance data backup method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010548146.1A CN101989929B (en) 2010-11-17 2010-11-17 Disaster recovery data backup method and system

Publications (2)

Publication Number Publication Date
CN101989929A true CN101989929A (en) 2011-03-23
CN101989929B CN101989929B (en) 2014-07-02

Family

ID=43746287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010548146.1A Active CN101989929B (en) 2010-11-17 2010-11-17 Disaster recovery data backup method and system

Country Status (2)

Country Link
CN (1) CN101989929B (en)
WO (1) WO2012065408A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI588670B (en) * 2016-05-25 2017-06-21 精品科技股份有限公司 System and method for segment backup

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216791A (en) * 2008-01-04 2008-07-09 华中科技大学 File backup method based on fingerprint
CN101706825A (en) * 2009-12-10 2010-05-12 华中科技大学 Replicated data deleting method based on file content types
CN101814045A (en) * 2010-04-22 2010-08-25 华中科技大学 Data organization method for backup services

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101989929B (en) * 2010-11-17 2014-07-02 中兴通讯股份有限公司 Disaster recovery data backup method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
廖海生,赵跃龙: "基于MD5算法的重复数据删除技术的研究与改进", 《计算机测量与控制》, vol. 18, no. 3, 31 March 2010 (2010-03-31), pages 635 - 638 *
廖竣锴: "基于Internet的容灾系统的设计与实现", 《CNKI优秀硕士学位论文全文库》, 14 July 2005 (2005-07-14), pages 93 - 99 *

Cited By (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012065408A1 (en) * 2010-11-17 2012-05-24 中兴通讯股份有限公司 Disaster tolerance data backup method and system
CN102156727A (en) * 2011-04-01 2011-08-17 华中科技大学 Method for deleting repeated data by using double-fingerprint hash check
CN102184198A (en) * 2011-04-22 2011-09-14 深圳市广道高新技术有限公司 Data deduplication method suitable for working load protecting system
CN102184198B (en) * 2011-04-22 2016-04-27 张伟 Be applicable to the data de-duplication method of operating load protection system
CN102799598A (en) * 2011-05-25 2012-11-28 英业达股份有限公司 Data recovery method for deleting repeated data
WO2012159532A1 (en) * 2011-05-25 2012-11-29 成都市华为赛门铁克科技有限公司 Data processing method and device
CN102541685A (en) * 2011-11-16 2012-07-04 中标软件有限公司 Linux system backup method and Linux system repair method
CN103428242B (en) * 2012-05-18 2016-12-14 阿里巴巴集团控股有限公司 A kind of method of increment synchronization, Apparatus and system
CN103428242A (en) * 2012-05-18 2013-12-04 阿里巴巴集团控股有限公司 Method, device and system for increment synchronization
CN103713963B (en) * 2012-09-29 2017-06-23 南京壹进制信息技术股份有限公司 A kind of efficient file backup and restoration methods
CN103713963A (en) * 2012-09-29 2014-04-09 南京壹进制信息技术有限公司 Efficient file backup and restoration method
CN103034564B (en) * 2012-12-05 2016-06-15 华为技术有限公司 Data disaster tolerance drilling method, data disaster tolerance practice device and system
CN103034564A (en) * 2012-12-05 2013-04-10 华为技术有限公司 Data disaster tolerance demonstration and practicing method and data disaster tolerance demonstration and practicing device and system
CN103269352A (en) * 2012-12-07 2013-08-28 北京奇虎科技有限公司 Point-to-point (P2P) file downloading method and device
CN103269351A (en) * 2012-12-07 2013-08-28 北京奇虎科技有限公司 File download method and device
CN103259729A (en) * 2012-12-10 2013-08-21 上海德拓信息技术有限公司 Network data compaction transmission method based on zero collision hash algorithm
CN103259729B (en) * 2012-12-10 2018-03-02 上海德拓信息技术股份有限公司 Network data compaction transmission method based on zero collision hash algorithm
CN103365745A (en) * 2013-06-07 2013-10-23 上海爱数软件有限公司 Block level backup method based on content-addressed storage and system
CN103399853A (en) * 2013-06-28 2013-11-20 苏州海客科技有限公司 Method for selecting file cutting granularity
CN103473278A (en) * 2013-08-28 2013-12-25 苏州天永备网络科技有限公司 Repeating data processing technology
CN103744939A (en) * 2013-12-31 2014-04-23 华为技术有限公司 Recording method of log, recovering method of log and log manager
CN104750743A (en) * 2013-12-31 2015-07-01 中国银联股份有限公司 System and method for ticking and rechecking transaction files
CN103795783A (en) * 2014-01-14 2014-05-14 上海上讯信息技术股份有限公司 Data synchronization method and system
CN103970852A (en) * 2014-05-06 2014-08-06 浪潮电子信息产业股份有限公司 Data de-duplication method of backup server
CN103942125A (en) * 2014-05-06 2014-07-23 南宁博大全讯科技有限公司 Automatic backup method and system
CN104268034B (en) * 2014-10-09 2017-11-07 中国人民解放军国防科学技术大学 A kind of data back up method and device and data reconstruction method and device
CN104375905A (en) * 2014-11-07 2015-02-25 北京云巢动脉科技有限公司 Incremental backing up method and system based on data block
CN104484402A (en) * 2014-12-15 2015-04-01 杭州华三通信技术有限公司 Method and device for deleting repeating data
CN104484402B (en) * 2014-12-15 2018-02-09 新华三技术有限公司 A kind of method and device of deleting duplicated data
CN106934293B (en) * 2015-12-29 2020-04-24 航天信息股份有限公司 Collision calculation device and method for digital abstract
CN106934293A (en) * 2015-12-29 2017-07-07 航天信息股份有限公司 The collision calculation device and collision calculation method of digital digest
CN107346271A (en) * 2016-05-05 2017-11-14 华为技术有限公司 The method and calamity of Backup Data block are for end equipment
CN106326035A (en) * 2016-08-13 2017-01-11 南京叱咤信息科技有限公司 File-metadata-based incremental backup method
CN106802841A (en) * 2017-01-19 2017-06-06 四川奥诚科技有限责任公司 Data extract analytic method, device and server
CN106817419B (en) * 2017-01-19 2020-06-30 四川奥诚科技有限责任公司 VoLTE AS network element-based data extraction and analysis method and device and service terminal
CN106817419A (en) * 2017-01-19 2017-06-09 四川奥诚科技有限责任公司 Data based on VoLTE AS network elements extract analytic method, device and service terminal
CN107704342A (en) * 2017-09-26 2018-02-16 郑州云海信息技术有限公司 A kind of snap copy method, system, device and readable storage medium storing program for executing
CN107729766B (en) * 2017-09-30 2020-02-07 中国联合网络通信集团有限公司 Data storage method, data reading method and system thereof
CN107729766A (en) * 2017-09-30 2018-02-23 中国联合网络通信集团有限公司 Date storage method, method for reading data and its system
CN108090355A (en) * 2017-11-28 2018-05-29 西安交通大学 A kind of APK automatic triggers instrument
CN108089949A (en) * 2017-12-29 2018-05-29 广州创慧信息科技有限公司 A kind of method and system of automatic duplicating of data
WO2019141128A1 (en) * 2018-01-18 2019-07-25 阿里巴巴集团控股有限公司 Data processing method, apparatus and device
TWI700905B (en) * 2018-01-18 2020-08-01 香港商阿里巴巴集團服務有限公司 Data processing method, device and equipment
CN110692045A (en) * 2019-05-19 2020-01-14 深圳齐心集团股份有限公司 Big data-based stationery information distributed planning system
CN110692047A (en) * 2019-05-19 2020-01-14 深圳齐心集团股份有限公司 Stationery information scheduling system based on big data
WO2020232591A1 (en) * 2019-05-19 2020-11-26 深圳齐心集团股份有限公司 Stationery information distributed planning system based on big data
CN110618790A (en) * 2019-09-06 2019-12-27 上海电力大学 Mist storage data redundancy removing method based on repeated data deletion
CN110618790B (en) * 2019-09-06 2023-04-28 上海电力大学 Mist storage data redundancy elimination method based on repeated data deletion
CN113254262A (en) * 2020-02-13 2021-08-13 中国移动通信集团广东有限公司 Database disaster tolerance method and device and electronic equipment
CN113254262B (en) * 2020-02-13 2023-09-05 中国移动通信集团广东有限公司 Database disaster recovery method and device and electronic equipment
CN112202910A (en) * 2020-10-10 2021-01-08 上海威固信息技术股份有限公司 Computer distributed storage system
CN114691430A (en) * 2022-04-24 2022-07-01 北京科技大学 Incremental backup method and system for CAD (computer-aided design) engineering data files

Also Published As

Publication number Publication date
WO2012065408A1 (en) 2012-05-24
CN101989929B (en) 2014-07-02

Similar Documents

Publication Publication Date Title
CN101989929B (en) Disaster recovery data backup method and system
US10126973B2 (en) Systems and methods for retaining and using data block signatures in data protection operations
CN109871366B (en) Block chain fragment storage and query method based on erasure codes
CN104932956B (en) A kind of cloud disaster-tolerant backup method towards big data
CN102222085B (en) Data de-duplication method based on combination of similarity and locality
CN100547555C (en) A kind of data backup system based on fingerprint
US9268783B1 (en) Preferential selection of candidates for delta compression
US8918390B1 (en) Preferential selection of candidates for delta compression
CN102012851B (en) Continuous data protection method and server
CN102722583A (en) Hardware accelerating device for data de-duplication and method
CN105120003B (en) A kind of method for realizing data backup under cloud environment
CN104932841A (en) Saving type duplicated data deleting method in cloud storage system
CN106708653B (en) Mixed tax big data security protection method based on erasure code and multiple copies
CN107391306A (en) A kind of isomeric data library backup file access pattern method
CN102156727A (en) Method for deleting repeated data by using double-fingerprint hash check
CN107885619A (en) A kind of data compaction duplicate removal and the method and system of mirror image remote backup protection
CN103970852A (en) Data de-duplication method of backup server
CN102411637A (en) Metadata management method of distributed file system
CN105069111A (en) Similarity based data-block-grade data duplication removal method for cloud storage
CN105487942A (en) Backup and remote copy method based on data deduplication
CN106407224A (en) Method and device for file compaction in KV (Key-Value)-Store system
CN104735110A (en) Metadata management method and system
CN102508902A (en) Block size variable data blocking method for cloud storage system
CN103118104A (en) Data restoration method based on version vector, and server
CN105095027A (en) Data backup method and apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant