CN104714858A - Data backup method, data recovery method and device - Google Patents

Data backup method, data recovery method and device

Info

Publication number
CN104714858A
Authority
CN
China
Prior art keywords
data
backup
recovery
snapshot
controlling vertex
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310685278.2A
Other languages
Chinese (zh)
Inventor
秦平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201310685278.2A priority Critical patent/CN104714858A/en
Publication of CN104714858A publication Critical patent/CN104714858A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data backup, and in particular to a data backup method and device and a data recovery method and device. They address the problem in the prior art that backing up and recovering data with the Export and Import tools of Hbase requires comparatively long backup and recovery windows. The data backup method comprises: a backup node creates a snapshot for a backup object through Hbase according to an instruction of a control node; the backup node backs up the data in the created snapshot to a remote storage node through the Hadoop Distributed File System (HDFS). The data recovery method comprises: a recovery node reads the data stored in a remote storage node according to an instruction of the control node; the recovery node organizes the read data into snapshot-format data and writes the organized data, through an HDFS interface, into a data recovery system that provides a data access service.

Description

Data backup method and device, data recovery method and device
Technical field
The present invention relates to the technical field of data backup, and in particular to a data backup method and device and a data recovery method and device.
Background art
The billing detail system of the mobile Business and Operation Support System (BOSS) has been in operation for many years. It carries basic functions such as rating of original call detail records (CDRs), bill generation and itemized bill query, and it provides the data source for statistical analysis. As the number of users and the volume of services keep growing, the massive data held by the billing detail system has led to insufficient storage space, declining query performance, statistical analysis bottlenecks and difficulty in modifying the database. To address these problems, a column-oriented distributed database (Hbase) system suited to storing massive data was introduced, bringing an overall performance improvement to the billing detail system.
In the cloud solution for the BOSS billing detail system, Hbase stores massive itemized bill data, and data backup plays a very important role in data security management. The existing Hbase-based data backup scheme uses the Export tool of Hbase to back up data and the Import tool to recover data. The specific steps are: use the Export tool to export the data within a specified range in Hbase, at table granularity, to files in HDFS; back up the files in HDFS to a remote storage node; during data recovery, first restore the data from the remote storage node to HDFS, then use the Import tool to load the files in HDFS into Hbase.
When the amount of data to be backed up is large, backing up data with the Export tool of Hbase as described above requires a long backup time window, which seriously reduces backup efficiency. Similarly, when the amount of data to be recovered is large, recovering data with the Import tool as described above requires a long recovery window, which seriously reduces recovery efficiency.
Summary of the invention
Embodiments of the present invention provide a data backup method and device, to solve the problem in the prior art that backing up data with the Export tool of Hbase requires a long backup time window.
Embodiments of the present invention further provide a data recovery method and device, to solve the problem in the prior art that recovering data with the Import tool of Hbase requires a long recovery window.
A data backup method provided by an embodiment of the present invention comprises:
a backup node creating, according to an instruction of a control node, a snapshot for a backup object through the distributed data storage system Hbase;
the backup node backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS, wherein the data in the snapshot is the data added to or modified in the backup object after the snapshot is created and before a snapshot is next created for the backup object.
Optionally, the backup node creating a snapshot for the backup object comprises:
the backup node creating, at the incremental backup time interval indicated by the control node, a snapshot for the backup object indicated by the control node, wherein backup objects that have an association relation share the same incremental backup time interval.
A data backup method provided by another embodiment of the present invention comprises:
a control node generating a backup policy according to backup instruction information input by a user;
the control node instructing, according to the backup policy, multiple backup nodes to execute data backup tasks in parallel, the data backup task comprising: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
Optionally, the backup policy comprises: the backup objects, the relations between the backup objects, and the incremental backup time intervals, wherein backup objects that have an association relation share the same incremental backup time interval.
A data recovery method provided by an embodiment of the present invention comprises:
a recovery node reading, according to an instruction of a control node, the data stored in a remote storage node;
the recovery node organizing the read data into snapshot-format data, and writing the organized data, through a distributed file system HDFS interface, into a data recovery system that provides a data access service.
Optionally, the recovery node organizing the read data into snapshot-format data comprises:
the recovery node creating, according to the snapshot directory structure that existed before the backup, the snapshot directory structure of the read data in the HDFS.
A data recovery method provided by another embodiment of the present invention comprises:
a control node generating a recovery policy according to recovery instruction information input by a user;
the control node instructing, according to the recovery policy, multiple recovery nodes to execute data recovery tasks in parallel, the data recovery task comprising: reading the data stored in a remote storage node, organizing the read data into snapshot-format data, and writing the organized data, through a distributed file system HDFS interface, into a data recovery system that provides a data access service.
Optionally, the recovery policy comprises: the recovery object and the time period to be recovered.
A data backup device provided by an embodiment of the present invention comprises:
a creation module, configured to create, according to an instruction of a control node, a snapshot for a backup object through the distributed data storage system Hbase;
a backup module, configured to back up the data in the snapshot created by the creation module to a remote storage node through the distributed file system HDFS, wherein the data in the snapshot is the data added to or modified in the backup object after the snapshot is created and before a snapshot is next created for the backup object.
A data backup device provided by another embodiment of the present invention comprises:
a generation module, configured to generate a backup policy according to backup instruction information input by a user;
an instruction module, configured to instruct, according to the backup policy generated by the generation module, multiple backup nodes to execute data backup tasks in parallel, the data backup task comprising: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
A data recovery device provided by an embodiment of the present invention comprises:
a reading module, configured to read, according to an instruction of a control node, the data stored in a remote storage node;
a writing module, configured to organize the data read by the reading module into snapshot-format data, and to write the organized data, through a distributed file system HDFS interface, into a data recovery system that provides a data access service.
A data recovery device provided by another embodiment of the present invention comprises:
a generation module, configured to generate a recovery policy according to recovery instruction information input by a user;
an instruction module, configured to instruct, according to the recovery policy generated by the generation module, multiple recovery nodes to execute data recovery tasks in parallel, the data recovery task comprising: reading the data stored in a remote storage node, organizing the read data into snapshot-format data, and writing the organized data, through a distributed file system HDFS interface, into a data recovery system that provides a data access service.
In the embodiments of the present invention, the backup node creates a snapshot for the backup object through Hbase and, when a backup is required, backs up the data in the snapshot to a remote storage node through HDFS. As a result, the Export tool is no longer needed to export the data, and only the incremental portion of the data has to be backed up, which greatly shortens the backup window and improves backup efficiency.
Brief description of the drawings
Fig. 1 is a flowchart of the data backup method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the data backup method provided by Embodiment 2 of the present invention;
Fig. 3 is a flowchart of the data recovery method provided by Embodiment 1 of the present invention;
Fig. 4 is a flowchart of the data recovery method provided by Embodiment 2 of the present invention;
Fig. 5 is a schematic diagram of the system architecture for data backup and recovery provided by an embodiment of the present invention;
Fig. 6 is a flowchart of the data backup method provided by Embodiment 3 of the present invention;
Fig. 6a is a flowchart of the method by which a backup node performs data backup;
Fig. 7 is a flowchart of the data recovery method provided by Embodiment 3 of the present invention;
Fig. 8 is a schematic structural diagram of a data backup device provided by Embodiment 1 of the present invention;
Fig. 9 is a schematic structural diagram of a data backup device provided by Embodiment 2 of the present invention;
Fig. 10 is a schematic structural diagram of a data recovery device provided by Embodiment 1 of the present invention;
Fig. 11 is a schematic structural diagram of a data recovery device provided by Embodiment 2 of the present invention.
Detailed description of the embodiments
In the embodiments of the present invention, the backup node creates a snapshot for the backup object through Hbase and, when a backup is required, backs up the data in the snapshot to a remote storage node through HDFS. As a result, the Export tool is no longer needed to export the data, and only the incremental portion of the data has to be backed up, which greatly shortens the backup window and improves backup efficiency.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the data backup method provided by Embodiment 1 of the present invention comprises the following steps:
S101: a backup node creates, according to an instruction of a control node, a snapshot for a backup object through the distributed data storage system Hbase;
S102: the backup node backs up the data in the created snapshot to a remote storage node through the distributed file system HDFS, wherein the data in the snapshot is the data added to or modified in the backup object after the snapshot is created and before a snapshot is next created for the backup object.
In the embodiments of the present invention, incremental backup of the data is accomplished through the snapshot (Snapshot) function of Hbase, and the backup operation itself is handed over to the Hadoop Distributed File System (HDFS). With this snapshot approach, the Export tool is no longer needed to export the data, and only the incremental portion of the data has to be backed up, which greatly shortens the backup window and improves backup efficiency.
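The idea of backing up only the incremental portion can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the patent: each snapshot is represented simply as a list of the Hfile paths it references, Hfiles are treated as immutable, and the increment is taken to be the files present in the current snapshot but absent from the previous one. The function name `incremental_files` and the sample paths are hypothetical.

```python
def incremental_files(previous_snapshot, current_snapshot):
    """Return the file paths referenced by the current snapshot only.

    Assumption: a snapshot is modelled as a list of Hfile paths, and
    files that already appeared in the previous snapshot were backed up
    earlier, so only the difference needs to be copied.
    """
    return sorted(set(current_snapshot) - set(previous_snapshot))


# Hypothetical example data: hfile3 appeared after the previous snapshot.
previous = ["table1/region1/hfile1", "table1/region1/hfile2"]
current = ["table1/region1/hfile1", "table1/region1/hfile2",
           "table1/region1/hfile3"]
to_copy = incremental_files(previous, current)
```

Under these assumptions only `table1/region1/hfile3` would be copied to the remote storage node, which is why the backup window shrinks as the share of unchanged data grows.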
Optionally, the backup node creating a snapshot for the backup object comprises:
the backup node creating, at the incremental backup time interval indicated by the control node, a snapshot for the backup object indicated by the control node, wherein backup objects that have an association relation share the same incremental backup time interval.
In a specific implementation, the control node may generate a backup policy according to the backup instruction information input by the user and instruct the backup node to perform the backup. The backup policy may include the backup objects, the relations between the backup objects, the incremental backup time interval, the backup mode and so on. A backup object may be the name of a data table to be backed up; here, the data tables that are backed up may be raw data tables, and other data that can be derived by processing the raw data tables need not be backed up. The relation between backup objects is one of two kinds: associated or not associated. In a specific implementation, if multiple data tables are associated and must be backed up together, the relation between them can be expressed with AND; if the data tables are not associated, it can be expressed with OR. The incremental backup time interval is, for a given backup object, the time difference between two consecutive snapshot creations. To keep the logical relation between data tables accurate, data tables in an AND relation share the same incremental backup time interval, while data tables in an OR relation may have different intervals. In addition, the user may select the backup mode, for example full backup or incremental backup; to improve backup efficiency, incremental backup is preferred in the embodiments of the present invention.
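The policy contents described above can be sketched as a small Python data structure. This is only an illustration under assumptions: the class name `BackupPolicy`, its field names and the `validate` method are hypothetical and do not come from the patent. The sketch enforces the one rule the text states explicitly, namely that tables in an AND (associated) relation must share the same incremental backup time interval.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import Dict, List


@dataclass
class BackupPolicy:
    # Table name -> incremental backup time interval (hypothetical layout).
    intervals: Dict[str, timedelta]
    # Each inner list is a group of associated (AND) tables.
    and_groups: List[List[str]] = field(default_factory=list)
    # "incremental" or "full"; the text prefers incremental backup.
    mode: str = "incremental"

    def validate(self) -> None:
        """Raise ValueError if associated tables have different intervals."""
        for group in self.and_groups:
            distinct = {self.intervals[name] for name in group}
            if len(distinct) > 1:
                raise ValueError(
                    f"associated tables {group} must share one interval")


# table1 AND table2 are associated; table3 is independent (OR).
policy = BackupPolicy(
    intervals={
        "table1": timedelta(days=1),
        "table2": timedelta(days=1),
        "table3": timedelta(weeks=1),
    },
    and_groups=[["table1", "table2"]],
)
policy.validate()
```

Tables in an OR relation are simply left out of `and_groups`, so they are free to use different intervals.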
Corresponding to the data backup flow of Embodiment 1 above, the embodiments of the present invention also provide the following backup method from the control node side. Its implementation is similar to that of the preceding embodiment, and the parts that are the same are not repeated.
As shown in Fig. 2, the data backup method provided by Embodiment 2 of the present invention comprises:
S201: a control node generates a backup policy according to backup instruction information input by a user;
Here, the backup instruction information is in effect user-readable backup information that the user (an administrator) customizes through the graphical interface of the control node. From this backup instruction information the control node generates a concrete, machine-readable backup policy, which is used to instruct the backup nodes to perform backup tasks. In a specific implementation, some default backup information can be set, so that the user need not enter detailed backup information when entering the backup instruction information. For example, the user may specify that data table 1 is to be backed up once a week; in the backup policy that the control node generates from this instruction, data table 1 is by default backed up at 0:00 every Sunday.
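How defaults might be filled in when a backup policy is generated can be sketched as follows. The function `build_policy_entry`, its parameters and the dictionary layout are hypothetical; only the default itself (a weekly backup runs at 0:00 on Sunday unless the user says otherwise) is taken from the text.

```python
DEFAULT_HOUR = 0            # 0:00, as in the example in the text
DEFAULT_WEEKDAY = "Sunday"  # default day for weekly backups


def build_policy_entry(table, frequency, weekday=None, hour=None):
    """Expand a short user instruction into a complete policy entry.

    Any detail the user leaves out is replaced by the default value.
    """
    entry = {
        "table": table,
        "frequency": frequency,
        "hour": DEFAULT_HOUR if hour is None else hour,
    }
    if frequency == "weekly":
        entry["weekday"] = weekday or DEFAULT_WEEKDAY
    return entry


# The user only says "back up table1 once a week".
entry = build_policy_entry("table1", "weekly")
```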
S202: the control node instructs, according to the backup policy, multiple backup nodes to execute data backup tasks in parallel, the data backup task comprising: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
In the embodiments of the present invention, the control node instructs multiple backup nodes to execute backup tasks in parallel. The backup nodes cooperate on the backup and run on a parallel computing framework, which effectively improves backup efficiency.
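The parallel dispatch of backup tasks can be imitated on a single machine with a thread pool. This is a stand-in only: the patent runs backup nodes on a parallel computing framework, whereas the sketch below uses Python threads, and `backup_table` is a hypothetical placeholder that returns a status string instead of creating a snapshot and copying data.

```python
from concurrent.futures import ThreadPoolExecutor


def backup_table(table):
    # Hypothetical placeholder for one backup node's task: a real task
    # would create a snapshot and copy its data through HDFS.
    return f"{table}: snapshot created and copied"


def run_backup_tasks(tables, max_workers=3):
    """Run one backup task per table in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(backup_table, tables))


results = run_backup_tasks(["table1", "table2", "table3"])
```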
Optionally, the backup policy comprises: the backup objects, the relations between the backup objects, and the incremental backup time intervals, wherein backup objects that have an association relation share the same incremental backup time interval.
It should be noted that, in Embodiments 1 and 2 of the present invention, the control node and the backup node may be located on different hardware devices, such as computers, or on the same hardware device, in which case they are two functional modules of that device.
After data has been backed up, obtaining the backed-up data when needed raises the problem of data recovery. For this reason, the embodiments of the present invention also provide the following data recovery method.
As shown in Fig. 3, the data recovery method provided by Embodiment 1 of the present invention comprises:
S301: a recovery node reads, according to an instruction of a control node, the data stored in a remote storage node;
S302: the recovery node organizes the read data into snapshot-format data, and writes the organized data, through an HDFS interface, into a data recovery system that provides a data access service.
Optionally, the recovery node organizing the read data into snapshot-format data comprises:
the recovery node creating, according to the snapshot directory structure that existed before the backup, the snapshot directory structure of the read data in the HDFS.
In the embodiments of the present invention, the recovery node reads the data stored in the remote storage node according to the instruction of the control node, organizes the read data into snapshot-format data, and writes the organized data through the HDFS interface into the data recovery system that provides a data access service. As with the data backup process described above, using the snapshot approach for data recovery likewise greatly shortens the recovery window.
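Rebuilding the snapshot directory structure amounts to mapping each backed-up file to the path it occupied before the backup. The sketch below only computes target paths and performs no I/O. It assumes that each backed-up file carries the relative path it had inside the snapshot directory; the function `restore_targets` and the root directory `/hbase/.snapshots/restored` are hypothetical names chosen for illustration.

```python
from pathlib import PurePosixPath


def restore_targets(backup_paths, snapshot_root):
    """Map each backed-up relative path to its target path under the root.

    Assumption: the relative path recorded at backup time mirrors the
    snapshot directory layout, so joining it to the root recreates it.
    """
    root = PurePosixPath(snapshot_root)
    return {path: str(root / path) for path in backup_paths}


targets = restore_targets(
    ["table3/region1/cf/hfile1", "table3/region1/cf/hfile2"],
    "/hbase/.snapshots/restored",  # hypothetical target directory
)
```

A recovery node would then write each file to its target path through the HDFS interface of the data recovery system.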
In a specific implementation, the control node may instruct the recovery node to recover data according to the generated recovery policy. The recovery policy may include the recovery target, the time period to be recovered, the destination address of the recovery and so on. The recovery target may be the name of the data table to be recovered. The time period to be recovered is the time period of the data to be recovered, that is, the data added or modified within this period is what needs to be recovered. The destination address of the recovery indicates the system to which the data is to be restored. In the embodiments of the present invention, the system that uses the recovered data to provide services externally is called the data recovery system.
Corresponding to the data recovery flow above, the embodiments of the present invention also provide the following data recovery method from the control node side.
As shown in Fig. 4, the data recovery method provided by Embodiment 2 of the present invention comprises:
S401: a control node generates a recovery policy according to recovery instruction information input by a user;
S402: the control node instructs, according to the recovery policy, multiple recovery nodes to execute data recovery tasks in parallel, the data recovery task comprising: reading the data stored in a remote storage node, organizing the read data into snapshot-format data, and writing the organized data, through a distributed file system HDFS interface, into a data recovery system that provides a data access service.
In the embodiments of the present invention, the control node instructs multiple recovery nodes to execute recovery tasks in parallel. The recovery nodes run on a parallel computing framework, which effectively improves data recovery efficiency.
In step S401, the recovery instruction information is in effect user-readable recovery information that the user (an administrator) customizes through the graphical interface of the control node. From this recovery instruction information the control node generates a concrete, machine-readable recovery policy, which is used to instruct the recovery nodes to perform recovery tasks. In a specific implementation, some default recovery information can be set, so that the user need not enter detailed recovery information when entering the recovery instruction information. For example, the user may omit the destination address of the data recovery; after receiving the recovery instruction information input by the user, the control node then directly designates the data recovery system as the destination address in the generated recovery policy.
Optionally, the recovery policy comprises: the recovery object and the time period to be recovered.
It should be noted that, in the embodiments of the present invention, the control node and the recovery node may be located on different hardware devices, such as computers, or on the same hardware device, in which case they are two functional modules of that device. Moreover, this embodiment can be combined with the data backup method embodiments above: the control node, the backup node and the recovery node may be located on different hardware devices, such as computers, or on the same hardware device, in which case they are different functional modules of that device.
The data backup and recovery methods provided by the embodiments of the present invention can greatly shorten the backup and recovery windows and reduce the impact on production tasks. Specifically, in the embodiments of the present invention the backup and recovery operations are carried out mainly at the HDFS layer, so they have little impact on the Hbase that production tasks rely on. Compared with the Export and Import tools, the snapshot approach can back up compressed data directly, or restore compressed data directly to the data recovery system, which substantially reduces the amount of data backed up and recovered without affecting data integrity. In addition, because the read bandwidth of the HDFS layer is 6 to 8 times that of the Hbase layer, the backup and recovery methods of the embodiments of the present invention can greatly improve the efficiency of data backup and recovery.
To better explain the data backup and recovery flows of the embodiments of the present invention, specific embodiments are described in detail below.
Fig. 5 is a schematic diagram of the system architecture for data backup and recovery provided by an embodiment of the present invention. The system that implements the data backup and recovery functions mainly comprises control nodes, backup nodes, recovery nodes and storage nodes. The control node can provide a graphical interface through which backup administrators customize backup policies and recovery policies, and it can display a backup progress record in which the progress of backup tasks is recorded. The control node controls the work of the backup nodes and recovery nodes, for example the start and end of backup and recovery. A backup node performs the concrete backup tasks; in a specific implementation, multiple backup nodes run on a parallel computing framework and can achieve faster backup based on the data nodes (DataNode) where the backup files reside. Correspondingly, a recovery node performs the concrete recovery tasks; in a specific implementation, multiple recovery nodes run on a parallel computing framework, which increases the parallelism of data recovery and improves recovery efficiency. There may be multiple storage nodes, which store the massive backup data. An index node provides an index for the backed-up data, so that the backup data can be located quickly during data recovery. The production system and the data recovery system in the figure both provide data access services externally: the production system serves the data as it was before backup, and the data recovery system serves the data recovered from the storage nodes.
As shown in Fig. 6, the data backup method provided by Embodiment 3 of the present invention comprises:
S601: the control node generates a backup policy;
This backup policy comprises: a) backup objects: table 1 (table1), table 2 (table2), table 3 (table3); b) the relation between the backup objects: table1 AND table2 OR table3; c) incremental backup time intervals: table1 and table2 are incrementally backed up once a day, and table3 is incrementally backed up once a week.
S602: the control node judges whether the current time is 0:00; if so, go to step S603; otherwise, return to step S602;
S603: the control node judges whether it is 0:00 on Sunday; if so, go to step S604; otherwise, go to step S605;
S604: the backup node creates, according to the instruction of the control node, snapshots for table1, table2 and table3;
S605: the backup node creates, according to the instruction of the control node, snapshots for table1 and table2;
S606: the backup node backs up the data to the storage node through HDFS, and generates index data and stores it in the index node;
S607: the backup node records the backup progress and then deletes the snapshot.
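The decisions in steps S602 to S605 can be sketched as a single Python function that, given the current time, returns the tables that need a snapshot. The function name `tables_due` and the two constant lists are hypothetical; the schedule itself (table1 and table2 at 0:00 every day, table3 additionally at 0:00 on Sunday) follows the policy of this embodiment.

```python
from datetime import datetime

DAILY_TABLES = ["table1", "table2"]   # backed up at 0:00 every day
WEEKLY_TABLES = ["table3"]            # backed up at 0:00 on Sunday only


def tables_due(now):
    """Return the tables to snapshot at the given time (S602 to S605)."""
    # S602: nothing to do unless the current time is 0:00.
    if now.hour != 0 or now.minute != 0:
        return []
    due = list(DAILY_TABLES)
    # S603: datetime.weekday() returns 6 for Sunday.
    if now.weekday() == 6:
        due += WEEKLY_TABLES          # S604: all three tables
    return due                        # S605 otherwise: table1 and table2


sunday_midnight = tables_due(datetime(2013, 8, 11, 0, 0))
friday_midnight = tables_due(datetime(2013, 8, 9, 0, 0))
```

In this sketch the control node would call `tables_due` periodically and send the returned table names to the backup nodes.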
Step S606 above is described further below. After the snapshot is created, the directory structure on HDFS is as follows:
As shown in Fig. 6a, the method by which a backup node performs data backup comprises:
S6a: the backup node first obtains, from /hbase/.snapshots/completed/regionname/[columnfamily name]/[hfile name], the Hfile files involved in this increment;
S6b: the backup node analyzes these files and obtains the data nodes on which all of them reside, forming the following relation list:
table1/region1/hfile1 10G datanode1,datanode2,datanode3
table1/region1/hfile2 50G datanode4,datanode2,datanode3
table1/region1/hfile3 80G datanode4,datanode5,datanode6
...
table2/region1/hfile1 30G datanode7,datanode8,datanode9
table2/region1/hfile2 100G datanode1,datanode3,datanode9
table2/region3/hfile1 56G datanode2,datanode8,datanode4
table3/region1/hfile1 38G datanode5,datanode8,datanode9
table3/region1/hfile2 29G datanode1,datanode3,datanode10
table3/region3/hfile1 55G datanode2,datanode8,datanode3
S6c: the backup node uses the location and size of the files as task allocation factors and generates MapReduce tasks. In this way, each backup node reads only local files as far as possible while the parallelism of the system increases, so that the data backup completes quickly and the backup window is reduced.
Here, MapReduce is a programming model for parallel computation over large-scale data sets (larger than 1 TB), in which Map denotes mapping and Reduce denotes reduction.
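One way to use file location and size as allocation factors is a greedy assignment: handle the largest files first and give each file to the least-loaded data node that already holds a replica of it, so reads stay local. This heuristic is an assumption made for illustration, not the allocation rule of the patent, and the function `assign_files` is a hypothetical name. The sample entries are taken from the relation list above.

```python
def assign_files(files):
    """Greedily assign each file to one of the data nodes that hold it.

    files: list of (path, size_in_gb, [data nodes holding a replica]).
    Returns (plan, load): plan maps node -> assigned paths, and load
    maps node -> total assigned size in GB.
    """
    load = {}
    plan = {}
    # Largest files first, so big files are spread before small ones.
    for path, size_gb, replicas in sorted(files, key=lambda f: -f[1]):
        # Pick the replica holder with the least load; ties go by name.
        node = min(replicas, key=lambda n: (load.get(n, 0), n))
        load[node] = load.get(node, 0) + size_gb
        plan.setdefault(node, []).append(path)
    return plan, load


files = [
    ("table1/region1/hfile1", 10, ["datanode1", "datanode2", "datanode3"]),
    ("table1/region1/hfile3", 80, ["datanode4", "datanode5", "datanode6"]),
    ("table2/region1/hfile2", 100, ["datanode1", "datanode3", "datanode9"]),
]
plan, load = assign_files(files)
```

Each file is read on a node that stores it locally, and the per-node totals in `load` show how evenly the work is spread.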
As shown in Fig. 7, the data recovery method provided by Embodiment 3 of the present invention comprises:
S701: the control node generates a recovery policy;
This recovery policy comprises: a) recovery object: table3; b) time period to be recovered: the data in the snapshot created on August 9, 2013; c) destination address of the recovery: the data recovery system.
S702: the control node sends a data recovery command to the recovery node according to the recovery policy;
S703: after receiving the recovery command, the recovery node reads the corresponding data from the storage node according to the index, organizes the data into snapshot format, and writes it into the data recovery system through the HDFS interface.
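The index lookup in step S703 can be sketched as a filter over index entries. The entry layout (keys `table`, `snapshot_date` and `location`), the function `select_backups` and the sample locations are all hypothetical, since the patent does not describe the index format. The sketch only shows how the recovery policy of this embodiment (table3, snapshot of August 9, 2013) would select the data to read.

```python
from datetime import date


def select_backups(index, table, snapshot_date):
    """Return storage locations of the backups matching the policy."""
    return [
        entry["location"]
        for entry in index
        if entry["table"] == table
        and entry["snapshot_date"] == snapshot_date
    ]


# Hypothetical index entries kept by the index node.
index = [
    {"table": "table3", "snapshot_date": date(2013, 8, 9),
     "location": "storage1/table3/region1/hfile1"},
    {"table": "table3", "snapshot_date": date(2013, 8, 2),
     "location": "storage2/table3/region1/hfile0"},
    {"table": "table1", "snapshot_date": date(2013, 8, 9),
     "location": "storage1/table1/region1/hfile1"},
]
locations = select_backups(index, "table3", date(2013, 8, 9))
```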
Based on same inventive concept, the data backup device corresponding with above-mentioned data back up method is additionally provided in the embodiment of the present invention, the Data Recapture Unit corresponding with above-mentioned data reconstruction method, the principle of dealing with problems due to these devices and above-mentioned data back up method, data reconstruction method are similar, therefore in the embodiment of the present invention, the enforcement of device see the enforcement of method, can repeat part and repeats no more.
As shown in Figure 8, be a kind of data backup device structural representation that the embodiment of the present invention one provides, this device comprises:
Creation module 81, for the instruction according to Controlling vertex, by distributed data-storage system Hbase, for backup object creates snapshot;
Backup module 82, backup in remote storage node for the data in described snapshot that creation module 8 is created by distributed file system HDFS, wherein, data in described snapshot are after this snapshot of establishment, be before described backup object creates snapshot next time, the data increasing in described backup object or revise.
Optionally, the creation module 81 is specifically configured to:
Create a snapshot for the backup object indicated by the control node according to the incremental backup time interval indicated by the control node, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
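The requirement that associated backup objects share one incremental backup time interval can be sketched as below. The patent does not say how the common interval is chosen; taking the shortest requested interval within each group of associated tables is an assumption of this example, as are the function name and the use of hours as the unit. Every table named in `associations` is assumed to appear in `requested`.

```python
def unify_intervals(requested, associations):
    """requested: {table: interval_hours}; associations: [(table_a, table_b), ...].
    Returns {table: interval_hours} in which associated tables share one interval."""
    parent = {table: table for table in requested}

    def find(table):
        # Follow parent links to the group's representative (with path halving).
        while parent[table] != table:
            parent[table] = parent[parent[table]]
            table = parent[table]
        return table

    # Merge the groups of every associated pair.
    for a, b in associations:
        parent[find(a)] = find(b)

    # Assumed rule: each group adopts the shortest interval requested in it.
    shortest = {}
    for table, interval in requested.items():
        root = find(table)
        shortest[root] = min(interval, shortest.get(root, interval))
    return {table: shortest[find(table)] for table in requested}
```

Snapshotting associated tables on the same schedule means that their snapshots can be recovered to the same point in time.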
Figure 9 is a schematic structural diagram of a data backup device provided by Embodiment 2 of the present invention. The device comprises:
Generation module 91, configured to generate a backup policy according to backup indication information input by a user;
Indicating module 92, configured to instruct, according to the backup policy, multiple backup nodes to execute data backup tasks in parallel, wherein a data backup task comprises: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
Optionally, the backup policy comprises: the backup objects, the relationships between the backup objects, and the incremental backup time interval, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
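A rough Python sketch of the generation and indicating modules follows. It is only an illustration under stated assumptions: backup nodes are stood in for by worker threads, `run_backup_task` is a placeholder that performs no real snapshot or copy, and the names `BackupPolicy` and `dispatch` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class BackupPolicy:
    backup_objects: list                                 # tables to back up
    associations: list = field(default_factory=list)     # pairs of related tables
    interval_hours: dict = field(default_factory=dict)   # table -> incremental interval


def run_backup_task(table):
    """Placeholder for one backup node's task: create a snapshot of the table,
    then copy the snapshot data to remote storage. Nothing is actually copied."""
    snapshot = f"{table}-snapshot"
    return table, f"backed up {snapshot}"


def dispatch(policy, max_nodes=4):
    """Control node: hand one task per backup object to the workers in parallel."""
    with ThreadPoolExecutor(max_workers=max_nodes) as pool:
        return dict(pool.map(run_backup_task, policy.backup_objects))
```

Running the tasks concurrently rather than one table at a time is what shortens the backup window in this arrangement.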
Figure 10 is a schematic structural diagram of a data recovery device provided by Embodiment 1 of the present invention. The device comprises:
Reading module 101, configured to read the data stored in a remote storage node according to an instruction of the control node;
Writing module 102, configured to organize the data read by the reading module 101 into the snapshot format, and to write the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
Optionally, the writing module is specifically configured to:
Create the snapshot directory structure of the read data in the HDFS according to the snapshot directory structure before backup.
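Recreating the pre-backup directory structure can be sketched as below. The example uses a local directory as a stand-in for HDFS and assumes that each backed-up file is stored together with the relative path it had before backup; the function name and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def restore_snapshot_tree(files, target_root):
    """files: {relative_path_before_backup: bytes}.
    Writes every file under target_root at the same relative path it had
    before backup, creating intermediate directories as needed."""
    target_root = Path(target_root)
    written = []
    for relative_path, content in files.items():
        destination = target_root / relative_path
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_bytes(content)
        written.append(destination.relative_to(target_root).as_posix())
    return sorted(written)
```

For example, `restore_snapshot_tree({"snap1/regionA/cf/file1": b"x"}, "/tmp/restore")` creates `snap1/regionA/cf/` under the target root before writing `file1`, so the restored layout matches the layout that existed when the snapshot was backed up.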
Figure 11 is a schematic structural diagram of a data recovery device provided by Embodiment 2 of the present invention. The device comprises:
Generation module 111, configured to generate a recovery policy according to recovery indication information input by a user;
Indicating module 112, configured to instruct, according to the recovery policy, multiple recovery nodes to execute data recovery tasks in parallel, wherein a data recovery task comprises: reading the data stored in a remote storage node, organizing the read data into the snapshot format, and writing the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
Optionally, the recovery policy comprises: the recovery object and the recovery time period.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make other changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (16)

1. A data backup method, characterized in that the method comprises:
A backup node creating a snapshot for a backup object through the distributed data storage system HBase according to an instruction of a control node;
The backup node backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS, wherein the data in the snapshot is the data added to or modified in the backup object after that snapshot is created and before the next snapshot is created for the backup object.
2. The method of claim 1, characterized in that the backup node creating a snapshot for the backup object comprises:
The backup node creating a snapshot for the backup object indicated by the control node according to the incremental backup time interval indicated by the control node, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
3. A data backup method, characterized in that the method comprises:
A control node generating a backup policy according to backup indication information input by a user;
The control node instructing, according to the backup policy, multiple backup nodes to execute data backup tasks in parallel, wherein a data backup task comprises: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
4. The method of claim 3, characterized in that the backup policy comprises: the backup objects, the relationships between the backup objects, and the incremental backup time interval, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
5. A data recovery method, characterized in that the method comprises:
A recovery node reading the data stored in a remote storage node according to an instruction of a control node;
The recovery node organizing the read data into the snapshot format, and writing the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
6. The method of claim 5, characterized in that the recovery node organizing the read data into the snapshot format comprises:
The recovery node creating the snapshot directory structure of the read data in the HDFS according to the snapshot directory structure before backup.
7. A data recovery method, characterized in that the method comprises:
A control node generating a recovery policy according to recovery indication information input by a user;
The control node instructing, according to the recovery policy, multiple recovery nodes to execute data recovery tasks in parallel, wherein a data recovery task comprises: reading the data stored in a remote storage node, organizing the read data into the snapshot format, and writing the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
8. The method of claim 7, characterized in that the recovery policy comprises: the recovery object and the recovery time period.
9. A data backup device, characterized in that the device comprises:
A creation module, configured to create a snapshot for a backup object through the distributed data storage system HBase according to an instruction of a control node;
A backup module, configured to back up the data in the snapshot created by the creation module to a remote storage node through the distributed file system HDFS, wherein the data in the snapshot is the data added to or modified in the backup object after that snapshot is created and before the next snapshot is created for the backup object.
10. The device of claim 9, characterized in that the creation module is specifically configured to:
Create a snapshot for the backup object indicated by the control node according to the incremental backup time interval indicated by the control node, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
11. A data backup device, characterized in that the device comprises:
A generation module, configured to generate a backup policy according to backup indication information input by a user;
An indicating module, configured to instruct, according to the backup policy generated by the generation module, multiple backup nodes to execute data backup tasks in parallel, wherein a data backup task comprises: creating a snapshot for a backup object, and backing up the data in the created snapshot to a remote storage node through the distributed file system HDFS.
12. The device of claim 11, characterized in that the backup policy comprises: the backup objects, the relationships between the backup objects, and the incremental backup time interval, wherein backup objects that have an association relationship with each other have the same incremental backup time interval.
13. A data recovery device, characterized in that the device comprises:
A reading module, configured to read the data stored in a remote storage node according to an instruction of a control node;
A writing module, configured to organize the data read by the reading module into the snapshot format, and to write the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
14. The device of claim 13, characterized in that the writing module is specifically configured to:
Create the snapshot directory structure of the read data in the HDFS according to the snapshot directory structure before backup.
15. A data recovery device, characterized in that the device comprises:
A generation module, configured to generate a recovery policy according to recovery indication information input by a user;
An indicating module, configured to instruct, according to the recovery policy generated by the generation module, multiple recovery nodes to execute data recovery tasks in parallel, wherein a data recovery task comprises: reading the data stored in a remote storage node, organizing the read data into the snapshot format, and writing the processed data, through the distributed file system HDFS interface, into the data recovery system that provides data access service.
16. The device of claim 15, characterized in that the recovery policy comprises: the recovery object and the recovery time period.
CN201310685278.2A 2013-12-13 2013-12-13 Data backup method, data recovery method and device Pending CN104714858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310685278.2A CN104714858A (en) 2013-12-13 2013-12-13 Data backup method, data recovery method and device

Publications (1)

Publication Number Publication Date
CN104714858A true CN104714858A (en) 2015-06-17

Family

ID=53414221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310685278.2A Pending CN104714858A (en) 2013-12-13 2013-12-13 Data backup method, data recovery method and device

Country Status (1)

Country Link
CN (1) CN104714858A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1524222A (en) * 2001-07-06 2004-08-25 计算机联合思想公司 Systems and methods of information backup
CN102096669A (en) * 2009-12-14 2011-06-15 深圳速浪数字技术有限公司 Data backup method and data backup device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
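Su Yansen (苏艳森): "Design and Implementation of a File Backup and Recovery System for a Distributed File Storage Platform" (分布式文件存储平台文件备份与恢复系统设计与实现), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) *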

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104765651A (en) * 2014-01-06 2015-07-08 中国移动通信集团福建有限公司 Data processing method and device
CN105159945A (en) * 2015-08-10 2015-12-16 北京思特奇信息技术股份有限公司 Method and system for extracting and converting data between Hbase and Hdfs
CN105260271A (en) * 2015-11-18 2016-01-20 浪潮(北京)电子信息产业有限公司 HDFS snapshot implementation method and system
CN105740101A (en) * 2016-01-29 2016-07-06 青岛海尔智能家电科技有限公司 Automatic backup and automatic restoration method and apparatus for MySQL database
CN105843704A (en) * 2016-03-15 2016-08-10 上海爱数信息技术股份有限公司 Data protection method and system capable of combining with snapshot function based on distributed block storage
CN105843704B (en) * 2016-03-15 2018-10-19 上海爱数信息技术股份有限公司 A kind of data guard method and system of the snapshot functions of combination distributed block storage
CN105938489A (en) * 2016-04-14 2016-09-14 北京思特奇信息技术股份有限公司 Storage and display method and system of compressed detailed lists
CN106569911A (en) * 2016-10-14 2017-04-19 深圳前海微众银行股份有限公司 Data backup method and device
CN107122260A (en) * 2017-04-18 2017-09-01 北京思特奇信息技术股份有限公司 A kind of data back up method and device
CN107330003A (en) * 2017-06-12 2017-11-07 上海藤榕网络科技有限公司 Method of data synchronization, system, memory and data syn-chronization equipment
CN107391303A (en) * 2017-06-30 2017-11-24 北京奇虎科技有限公司 Data processing method, device, system, server and computer-readable storage medium
CN107391303B (en) * 2017-06-30 2021-02-23 北京奇虎科技有限公司 Data processing method, device, system, server and computer storage medium
CN107493330A (en) * 2017-08-16 2017-12-19 北京新网数码信息技术有限公司 A kind of cloud service method and Cloud Server
CN107656992A (en) * 2017-09-14 2018-02-02 上海交通大学 Towards the snapshot method for edition management in more insertion sources
CN107656992B (en) * 2017-09-14 2021-09-21 上海交通大学 Multi-insertion-source-oriented snapshot version management method
CN109753379B (en) * 2017-11-08 2022-12-02 阿里巴巴集团控股有限公司 Snapshot data backup and deletion method, device and system
CN109753379A (en) * 2017-11-08 2019-05-14 阿里巴巴集团控股有限公司 Snapshot data backup, delet method, apparatus and system
CN107943617A (en) * 2017-11-17 2018-04-20 北京联想超融合科技有限公司 Restorative procedure, device and the server cluster of data
CN107943617B (en) * 2017-11-17 2021-06-29 北京联想超融合科技有限公司 Data restoration method and device and server cluster
CN109976942B (en) * 2017-12-28 2021-02-19 中移(杭州)信息技术有限公司 Data backup and recovery method, backup server and source server
CN109976942A (en) * 2017-12-28 2019-07-05 中移(杭州)信息技术有限公司 A kind of data backup and resume method, backup server and source server
CN108573049A (en) * 2018-04-20 2018-09-25 联想(北京)有限公司 Data processing method and distributed storage devices
CN111143129A (en) * 2019-12-24 2020-05-12 维沃移动通信有限公司 Information backup method and electronic equipment
CN111324485A (en) * 2020-01-20 2020-06-23 杭州安恒信息技术股份有限公司 Data information backup method, device, equipment and storage medium of data table
CN112800019A (en) * 2021-03-03 2021-05-14 国网甘肃省电力公司 Data backup method and system based on Hadoop distributed file system
CN116382974A (en) * 2023-03-21 2023-07-04 安芯网盾(北京)科技有限公司 Customized data protection processing method
CN117520056A (en) * 2024-01-08 2024-02-06 南京云信达科技有限公司 Hbase data backup method, hbase data backup system, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150617