CN108804253A - A concurrent-job backup method for mass data backup - Google Patents
A concurrent-job backup method for mass data backup
- Publication number
- CN108804253A
- Authority
- CN
- China
- Prior art keywords
- backup
- file
- node
- copy
- directory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1461—Backup scheduling policy
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a concurrent-job backup method for mass data backup. The method is: 1) several backup nodes are selected to form a backup cluster, and every backup node has a unified configuration; 2) a terminal selects one backup node as the backup management server and starts the backup policy of the object to be backed up; 3) the backup management server selects one backup node as the job scheduler and obtains the directory structure of the backup object level by level, generating one scan job for each directory obtained; 4) the backup management server submits each scan job and its working path to the job scheduler, which sends the job to a backup node that scans the target directory named in the scan job; 5) the backup management server selects the files to be backed up and generates several file sublists, then generates one copy job per sublist and sends it to the job scheduler; 6) the job scheduler sends different copy jobs to different backup nodes, which copy the files to be backed up to the corresponding locations.
Description
Technical field
The present invention relates to a data backup method, and in particular to a concurrent-job backup method for mass data backup.
Background
Data are vital to an enterprise, department, organization, or individual. Data can be lost or damaged for many reasons, such as equipment failure, hackers and viruses, or operator error, and the resulting loss can be incalculable, which makes data backup extremely important. Data backup is a data-security technique: a copy of critical data is made so that, when a failure occurs, the data can be restored through the backup software and the loss caused by missing data is avoided.
With the continuous development of information technology, new developments such as cloud computing, the Internet of Things, and social networks have caused the variety and scale of data worldwide to grow explosively. By the end of 2012, data volumes had risen from the TB level (1 TB = 1024 GB) to the PB (1 PB = 1024 TB), EB (1 EB = 1024 PB), and even ZB (1 ZB = 1024 EB) level. The arrival of the big-data era has also driven rapid growth in backup demand, and data sets of a TB or more pose new challenges for data backup.
In addition, data storage has become more diverse: there are structured traditional relational databases, unstructured non-relational databases, and distributed file systems such as GFS and HDFS. As data volumes and data types increase, backing up these data becomes increasingly complex and time-consuming.
Faced with mass data, the main goal in designing and researching a backup system is to make full use of software and hardware resources, meet different backup requirements, and complete data backup and restore quickly and effectively. Existing backup software has several problems:
1. It is not designed for backing up mass data. The most important part of a backup is copying the backup object to another machine. Many backup packages copy and transfer the data as a single data stream, so they are limited by server or network bandwidth and cannot raise backup speed or capacity. They perform well when backing up thousands or tens of thousands of files, but mass data comprising tens of millions or even hundreds of millions of files takes days or even weeks, and the backup task cannot be completed within an acceptable time.
2. The backup system may have a single point of failure. Some backup systems deploy several backup servers, but different servers are responsible for different backup services. Once a server fails, the backup and restore services defined on it can no longer run.
3. For security reasons, backup software uses a custom storage format, so the backup files depend on the backup software. When the software fails, the backup files cannot be used, and having a backup is no better than having none.
Summary of the invention
In view of the shortcomings of the prior art, the object of the present invention is to provide a concurrent-job backup method for mass data backup. A backup cluster is built, the list of mass data to be backed up is obtained by submitting concurrent jobs, parallel backup jobs are then submitted according to the customized backup policy, and the backup files are saved in a standard Linux file format.
The invention comprises the following layers:
1. Device layer: all hardware resources, including the backup nodes (a backup cluster made up of multiple computers), the storage resources (a distributed file system built on several disk arrays), network resources, and so on. To remove the possibility of a single point of failure, every node in the backup cluster has the same operating system, backup system, and job scheduling and job checking software installed, uses the same configuration parameters, and holds the same definitions of all backup policies and the same server list. Every backup node can act as the backup management server or as the job scheduling server. The disks distributed over the disk arrays are virtualized by the distributed file system into one logical storage space, which is mounted as a shared directory on every backup node. Each backup node in the cluster uses the underlying disk arrays by accessing this shared directory. All data that must be kept during backup, including the MySQL database, jobs and job results, backup files, and logs, are stored in the shared directory. One backup node acts as the backup management server and is responsible for the whole backup management task; when it fails, another backup node is immediately enabled as the management server. Another backup node acts as the job scheduling server, which receives and dispatches jobs. Because the information on all backup nodes in the cluster is identical and the resources are shared, the failure of a single machine does not affect the normal operation of the whole system.
2. Management layer: this layer contains all software and applications, covering backup policies, database management, job scheduling, job checking, access authorization, and so on. Every backup node includes this software.
1) Customized backup policies. Different data have different backup requirements. Backup capacity, backup frequency, backup mode (full, incremental, or differential backup), retention time, access level, and so on can be customized as required.
2) A MySQL database service is installed and dedicated MySQL tables are created for backup. They record the directory information and file information of each backup object during the backup and can also support file queries and restores.
3) For mass data, backup speed is improved by running the extraction of the file list and the copying of the backup object concurrently in the backup cluster in the form of jobs. Jobs with the same function are scripts that contain the same program and different objects; they are generated in a fixed format and sent to the job scheduler. The job scheduler selects the backup node that executes each job according to the job execution time and the state of the backup nodes.
4) A job-checking script runs periodically on the backup nodes; failed jobs must re-enter the job scheduler and be executed again.
5) In the backup system a backup object is defined by directories, and one backup object can consist of several directories. The list of authorized read-only machines is defined on the backup node in the format: directory + name or IP address of the machine where the backup object resides. An authorized machine can query its own backup file information on the backup node, but it cannot delete backup files on the backup node, and it cannot query or restore other directories.
3. Service layer: all services and API interfaces that the backup system provides externally, including requests for data backup and data restore and the viewing and searching of backed-up data. Both a Linux command-line form and an API are available for querying the directory information and file information that have been backed up.
To achieve the above object, the invention adopts the following technical solution:
A parallel backup method for handling mass data comprises the following steps (as shown in Fig. 1):
1. Check the backup policy. A backup policy can be defined for each backup object before backup. After the backup plan starts, the definitions that apply to this backup run are read from the backup policy and checked: whether the backup object exists and read permission has been granted; whether the directories involved in the backup are in a normal state; which form of backup (full, incremental, or differential) is used for this run; which directory or medium the backup files are stored in; whether the backup management server and the job scheduling server are running normally and have not gone down; which threshold (data volume or number of files) is used to split the file list; and which logs and information must be recorded and submitted after the backup ends. The MySQL database is checked for the directory information table and file information table of the backup object; if the object is being backed up for the first time, these two tables are created first.
2. Generate the concurrent directory-scan jobs (as shown in Fig. 2). Because each backup object is defined as a directory, its directory structure is obtained level by level with a depth-first traversal. For every directory obtained, one directory record is inserted into the directory information table of the MySQL database and one scan job is generated. A scan job is an executable script whose content is scan + target directory name. The scan job and its working path are then submitted to the job scheduler.
3. Schedule and run the scan jobs (as shown in Fig. 3). The job scheduler evaluates the overall state of each node in the backup cluster, including free memory, load value, total CPU usage, and number of running job processes, selects an idle backup node as the execution node, and sends the scan job name and working path to that node. The node executes the scan job under the working path: it scans the target directory defined in the job script, records the information of all files in that directory, and inserts one file record per file into the MySQL data table. After all scan jobs have completed, the complete directory information and file information of the backup object are stored in the MySQL database.
4. Copy the files as concurrent jobs (as shown in Fig. 4). Depending on whether this run is a full, incremental, or differential backup, the files that need to be backed up are extracted from the directory information table and the file information table in the MySQL database and recorded in a text file. For a full backup, all files in the file information table are written one by one to the file list, each with its full path name and file size. For an incremental or differential backup, the required files are selected by modification time and likewise written one by one to the file list with full path name and file size. The file list is split according to the splitting threshold (data volume or number of files) and multiple copy jobs are generated, each consisting of tar + the file names with paths + the backup file name with path. The copy jobs are submitted to the job scheduler, which assigns them to idle backup nodes. Each backup node runs the tar program and generates the backup file at the path defined in the job (the backup file name with path). Different jobs are responsible for copying different files, generate different backup files, and run in parallel on different backup nodes.
5. Check and summarize the backup task. Job checking: every scan job or copy job has a time threshold; once a job times out, failure information is returned to the job scheduler. The job-checking program checks all failed jobs, generates jobs with the same content, and resubmits them to the job scheduler. A job with the same content is resubmitted only a limited number of times; once the limit is exceeded, an error message is generated and written to the backup log. Resource checking: the load value, memory usage, CPU usage, disk-array usage ratio, and so on of every backup node in the cluster are checked and recorded. When a machine is found to have failed or gone down, the job scheduler is told to delete that machine name; when a new machine is added, the job scheduler is told to add the machine name. When part of the storage is found to have no remaining space or to have input/output faults, a warning message is generated and written to the backup log. Log summary: as defined by the backup policy, the useful information about this backup run is recorded, including the backup server name, backup object, total backup capacity, backup time, backup mode, and the errors and alarms that occurred during the backup.
6. Query and restore services for backups. If a user needs to restore lost data, the backup system first verifies that the machine is an authorized machine and checks the corresponding directory name. The restore service is implemented mainly as an interactive script: the user starts it with the recovery command and, following the prompts of the text interface, enters the directory name or file name to restore, the point in time of the data to restore, the destination path, and so on. The system uses these parameters to find the backup file that contains the requested files, unpacks the tar archive, and copies the required files to the path specified by the user. The user can also call the API to view backup file information in command + parameter form.
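The file selection in step 4 can be illustrated with a minimal sketch. It assumes a simplified file information table with only `path`, `size`, and `mtime` columns, and it uses Python's built-in `sqlite3` module as a stand-in for the MySQL database named in the text; the table name `file_info` and the function names are hypothetical.

```python
import sqlite3


def create_file_table(conn):
    # Simplified stand-in for the file information table described in step 3.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_info ("
        "id INTEGER PRIMARY KEY, path TEXT, size INTEGER, mtime REAL)"
    )


def select_files(conn, mode, since=None):
    """Return (path, size) rows for the files that this run must back up.

    'full' selects every file; 'incremental' and 'differential' select
    files modified after `since` (the caller passes the time of the last
    backup or of the last full backup, respectively).
    """
    if mode == "full":
        cur = conn.execute("SELECT path, size FROM file_info ORDER BY path")
    elif mode in ("incremental", "differential"):
        if since is None:
            raise ValueError("a reference time is required for this mode")
        cur = conn.execute(
            "SELECT path, size FROM file_info WHERE mtime > ? ORDER BY path",
            (since,),
        )
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
create_file_table(conn)
conn.executemany(
    "INSERT INTO file_info (path, size, mtime) VALUES (?, ?, ?)",
    [("/home/a/old.txt", 10, 100.0), ("/home/a/new.txt", 20, 300.0)],
)
full_list = select_files(conn, "full")
incremental_list = select_files(conn, "incremental", since=200.0)
```

The only difference between the incremental and differential cases in this sketch is which reference time the caller supplies; the query itself is the same.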
Beneficial effects of the invention:
The invention uses concurrent jobs to scan and copy mass data, making full use of the performance of the computer cluster so that backups of very large data volumes complete quickly. The backup result is stored in multiple backup files, and only the incremental backup files need to be extracted when restoring data, which speeds up file recovery. Every backup node in the backup system is configured identically and resources are shared, which prevents a single machine failure from bringing down the system. Backup and data copying are performed with the standard Linux tar command, and the resulting backup files are stored in tar format, which can be read by the tar programs that ship with Linux or Windows operating systems; data can therefore be restored without relying on the backup system, which improves the availability of the backup system. The compression and encryption options chosen for the tar command also reduce network consumption and the risk of data theft. After each backup, the task is checked automatically and the log is submitted automatically, which improves the robustness of the backup system and reduces the administrative burden.
Brief description of the drawings
Fig. 1 is a flow chart of the parallel backup method of the invention;
Fig. 2 is a flow chart of the method for generating scan jobs;
Fig. 3 is a flow chart of the method for scheduling and running scan jobs;
Fig. 4 is a flow chart of the method for generating copy jobs.
Detailed description
The invention is further described below with reference to a specific example.
The example is the backup of the /home directory on a machine named login. The machine named bak01 is the server responsible for the backup. It first starts the selfserver service to run a self-check, confirming that the machine's state is normal, that the configuration for /home can be obtained, and that each backup process exists. It then starts backup_agent, the backup daemon for /home. backup_agent starts the checkconf script, which reads the predefined backup policy of the /home directory and returns the required parameters:
1. Backup source directory: login:/home;
2. Backup frequency: one backup run per day;
3. Backup level: 0 (this run is a full backup);
4. Access level: private (not public; the root user on the login machine can restore data);
5. Storage directory: bak01:/gluster/daily/login_home;
6. Log directory: bak01:/gluster/$date;
7. Task run server: bak01;
8. Scheduling server: bak06;
9. Splitting threshold: default (by default one copy job is split off for every 20000 files);
10. Encryption: yes;
11. Retention time: one month;
12. Log mail recipient address: heguans@hotmail.com;
13. Log record selection: all information, including the backup summary, job errors, and warning messages;
It then checks that the source directory exists and is readable, that the shared storage directory exists and is writable, that the log directory exists and is writable, and that the job scheduling server is available.
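The policy parameters and the pre-flight checks above could be modeled roughly as follows. This is only a sketch: the `BackupPolicy` class, its field names, and the `precheck` function are hypothetical, only a subset of the listed parameters is represented, and the availability check for the job scheduling server is omitted.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    # Hypothetical field names covering part of the parameter list above.
    source_dir: str
    storage_dir: str
    log_dir: str
    level: int = 0                # 0 means a full backup in the example
    split_threshold: int = 20000  # files per copy job (the stated default)
    encrypt: bool = True


def precheck(policy):
    """Return a list of problems; an empty list means the run may proceed."""
    problems = []
    src = policy.source_dir
    if not (os.path.isdir(src) and os.access(src, os.R_OK)):
        problems.append("source directory missing or unreadable")
    for label, path in (("storage", policy.storage_dir), ("log", policy.log_dir)):
        if not (os.path.isdir(path) and os.access(path, os.W_OK)):
            problems.append(f"{label} directory missing or not writable")
    return problems


# Small demonstration on a temporary directory standing in for the share.
with tempfile.TemporaryDirectory() as shared:
    good = BackupPolicy(source_dir=shared, storage_dir=shared, log_dir=shared)
    bad = BackupPolicy(
        source_dir=os.path.join(shared, "missing"),
        storage_dir=shared,
        log_dir=shared,
    )
    good_problems = precheck(good)
    bad_problems = precheck(bad)
```

Returning a list of problems rather than raising on the first one lets every failed check be written to the backup log in a single pass.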
Next, the backup_agent process starts the baklog process and the finddir process on bak01. baklog creates a log file named /gluster/20170218/log_20170218000300 to hold the information already known, such as the backup source directory, the run server, the backup level, and the storage directory, and to record continuously the output that the subsequent processes produce: job check results, job resubmission results, the data copy process, alarms, and so on. The finddir process filters out files and records only the information for each subdirectory under /home: the unique ID of the directory, its absolute path, its depth relative to /home (for example, /home/a is recorded with depth 2 and /home/a/b with depth 3), the name of the parent directory, and the directory creation time. A record for each directory is inserted into the MySQL database. For the /home/a directory, for example, the ID in the database is 46821131, the absolute path is /home/a, the depth is 2, the parent directory is /home, and the creation time is 2015/08/10. At the same time a scan job for scanning the /home/a directory is generated, named scanjob + a random number, for example scanjob021. The job is written to the shared storage directory /gluster/tmp/job/20170218/login_home and sent to the job scheduler bak06.
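The directory traversal performed by finddir can be sketched as a depth-first walk that emits one record and one scan job per directory. The sketch makes several assumptions: the root directory itself is included with depth 1 (consistent with /home/a having depth 2), jobs are numbered sequentially instead of with a random number, and `scan` is used as the hypothetical scanner program name. Database inserts and submission to the scheduler are left out.

```python
import os
import tempfile


def find_dirs(root):
    """Depth-first walk yielding one record per directory; root has depth 1."""
    root = os.path.abspath(root)
    stack = [(root, 1)]
    while stack:
        path, depth = stack.pop()
        yield {"path": path, "depth": depth, "parent": os.path.dirname(path)}
        children = sorted(
            entry.path
            for entry in os.scandir(path)
            if entry.is_dir(follow_symlinks=False)  # files are filtered out
        )
        # Push in reverse so the alphabetically first child is visited first.
        stack.extend((child, depth + 1) for child in reversed(children))


def make_scan_jobs(root):
    """Build one scan job (program name + target directory) per directory."""
    jobs = []
    for number, record in enumerate(find_dirs(root), start=1):
        jobs.append(
            {
                "name": f"scanjob{number:03d}",
                "command": ["scan", record["path"]],
                "depth": record["depth"],
            }
        )
    return jobs


# Demonstration on a temporary tree: root/a/b and root/c.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "c"))
    jobs = make_scan_jobs(root)
    relative = [os.path.relpath(job["command"][1], root) for job in jobs]
    depths = [job["depth"] for job in jobs]
```

Because every directory becomes its own job, a scan job only needs to list the files directly inside its target directory, which is what keeps the jobs independent of one another.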
After bak06 receives the job request, it starts the job_handle process, which launches jobs and checks them periodically. Using its algorithm, it obtains the state evaluation value of every machine currently in the backup cluster and selects bak03 to execute the script /gluster/tmp/job/20170218/login_home/scanjob21, which scans the files in /home/a and filters out the subdirectories of /home/a. Each time a file is scanned, a file record is inserted into the MySQL database, containing the file's unique ID, file name, absolute path, owner user name, group name, file size, last modification time, and last status change time. When the scanjob21 job on bak03 completes, it notifies the job scheduler bak06 and at the same time generates the files scanjob21.e and scanjob21.o in the /gluster/tmp/job/20170218/login_home directory. If the size of scanjob21.e is greater than 0, the scanjob21 job ran into a problem and did not return a correct value.
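The text does not give the algorithm that produces the state evaluation value. As a purely illustrative sketch, a scheduler could combine the four quantities named earlier (free memory, load, CPU usage, running job processes) into one score and pick the node with the lowest value. The `NodeState` class, the weights in `score`, and the sample figures are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    free_mem_gb: float
    load: float
    cpu_pct: float
    running_jobs: int


def score(node):
    # Hypothetical weighting: a lower score means a more idle node.
    return (
        node.load
        + node.cpu_pct / 100.0
        + node.running_jobs
        - 0.1 * node.free_mem_gb
    )


def pick_node(nodes):
    """Return the node with the lowest (most idle) score."""
    return min(nodes, key=score)


cluster = [
    NodeState("bak03", free_mem_gb=32.0, load=0.2, cpu_pct=5.0, running_jobs=0),
    NodeState("bak19", free_mem_gb=8.0, load=4.0, cpu_pct=90.0, running_jobs=3),
]
chosen = pick_node(cluster)
```

With these sample figures the idle machine bak03 is selected, matching the choice in the example above; a real scheduler would refresh the node states before each dispatch.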
The periodically executed job_handle process checks the .e result files of all jobs, resubmits the failed jobs, and records the number of times each job has been resubmitted, until all job checks and resubmissions have finished. Once the number of jobs in the job scheduler is 0, the job_handle process stops and returns a value to the backup_agent process on bak01.
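One checking pass of this kind can be sketched as below. The rule that a non-empty .e file marks a failed job comes from the text; the function names, the `resubmit` callback, and the `max_retries` parameter are hypothetical, and the retry limit is left to the caller because the text does not state its value.

```python
import os
import tempfile


def failed_jobs(job_dir):
    """Return the names of jobs whose .e (error output) file is non-empty."""
    failed = []
    for entry in sorted(os.listdir(job_dir)):
        full = os.path.join(job_dir, entry)
        if entry.endswith(".e") and os.path.getsize(full) > 0:
            failed.append(entry[:-2])  # strip the ".e" suffix
    return failed


def check_and_resubmit(job_dir, resubmit, attempts, max_retries):
    """Run one checking pass.

    `resubmit` is called with each job name that is retried, `attempts`
    counts resubmissions per job, and the returned list names the jobs
    that have reached the retry limit and should be logged as errors.
    """
    gave_up = []
    for job in failed_jobs(job_dir):
        if attempts.get(job, 0) >= max_retries:
            gave_up.append(job)
            continue
        attempts[job] = attempts.get(job, 0) + 1
        resubmit(job)
    return gave_up


# Demonstration: scanjob21 failed, scanjob22 succeeded (empty .e file).
with tempfile.TemporaryDirectory() as job_dir:
    with open(os.path.join(job_dir, "scanjob21.e"), "w") as f:
        f.write("scan: permission denied\n")
    open(os.path.join(job_dir, "scanjob22.e"), "w").close()
    open(os.path.join(job_dir, "scanjob21.o"), "w").close()
    attempts, resubmitted = {}, []
    first = check_and_resubmit(job_dir, resubmitted.append, attempts, max_retries=1)
    second = check_and_resubmit(job_dir, resubmitted.append, attempts, max_retries=1)
```

In the demonstration the failed job is resubmitted on the first pass and reported as given up on the second, since its .e file is still non-empty and the limit of one retry has been reached.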
Based on the backup level 0 obtained in the first step, the backup_agent process extracts the list of all files under the /home directory from the database and, according to the splitting threshold, generates one copy job for every 20,000 files. All copy jobs, for example dumpjob155, are submitted to the job scheduler bak06. bak06 starts the job_handle process a second time and, based on the current state evaluation values, selects bak19 to execute /gluster/tmp/job/20170218/login_home/dumpjob155. dumpjob155 copies the files with the tar program that ships with Linux and saves the generated backup file in the /gluster/daily/login_home/20170218 directory. After the copy jobs finish executing, .e and .o files such as dumpjob155.e and dumpjob155.o are likewise generated so that the job_handle process can check them. In addition, an index155 file is generated in /gluster/daily/login_home/20170218/index to record the names and sizes of these 20,000 files. After the job_handle process stops, it likewise notifies the backup_agent process on bak01.
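Splitting the file list and running one copy job can be sketched with Python's standard `tarfile` module in place of the tar command line. The index layout used here (one `size<TAB>path` line per file) is an assumption, since the text only says that the index records file names and sizes, and the demonstration uses a threshold of 2 instead of 20,000.

```python
import os
import tarfile
import tempfile


def split_list(files, threshold):
    """Cut the file list into sublists of at most `threshold` entries."""
    return [files[i:i + threshold] for i in range(0, len(files), threshold)]


def run_copy_job(files, archive_path, index_path):
    """Pack one sublist into a tar archive and write a matching index file."""
    with tarfile.open(archive_path, "w") as tar, open(index_path, "w") as index:
        for path in files:
            tar.add(path)  # absolute paths are stored without the leading "/"
            index.write(f"{os.path.getsize(path)}\t{path}\n")


# Demonstration: five small files, split into copy jobs of two files each.
with tempfile.TemporaryDirectory() as work:
    files = []
    for i in range(5):
        path = os.path.join(work, f"file{i}.txt")
        with open(path, "w") as f:
            f.write("x" * i)
        files.append(path)
    sublists = split_list(files, threshold=2)
    archive = os.path.join(work, "dumpjob1.tar")
    index_file = os.path.join(work, "index1")
    run_copy_job(sublists[0], archive, index_file)
    with tarfile.open(archive) as tar:
        member_count = len(tar.getnames())
    with open(index_file) as f:
        index_lines = f.read().splitlines()
```

Each sublist maps to exactly one archive and one index file, so the copy jobs share no state and can run on different backup nodes at the same time.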
Finally, the backup_agent process starts the mailman process, which calls the Linux sendmail program to send the content of /gluster/20170218/log_20170218000300 by e-mail to the responsible administrator, heguans@hotmail.com. The backup_agent process also starts the garbage process, which cleans up all directories and files under /gluster/tmp/job so that no junk files are left behind. At this point the backup of the /home directory on login is complete and the backup_agent process stops.
If a user needs to restore certain files under /home, the user must log in to the login machine as root and run the recovery command, then follow the prompts step by step to submit the directory name or file name to restore, the date of the files to restore (for example 20170218), and the location to restore to (for example /tmp). bak01 searches the index files under /gluster/daily/login_home/20170218/index, extracts the required files, and copies them to the login:/tmp directory.
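The index lookup and extraction during a restore might look like the following sketch. It assumes index files with one `size<TAB>path` line per file and a tar archive whose member names are the original paths without the leading slash; how an index file is mapped to its archive is not stated in the text, so the demonstration passes the archive path explicitly. All function names are hypothetical.

```python
import os
import shutil
import tarfile
import tempfile


def find_index(index_dir, wanted):
    """Return the name of the first index file that lists `wanted`, or None."""
    for name in sorted(os.listdir(index_dir)):
        with open(os.path.join(index_dir, name)) as index:
            for line in index:
                _size, _, path = line.rstrip("\n").partition("\t")
                if path == wanted:
                    return name
    return None


def restore_file(archive_path, wanted, dest_dir):
    """Copy one file out of the tar archive into `dest_dir`."""
    member_name = wanted.lstrip("/")  # member names carry no leading slash
    target = os.path.join(dest_dir, os.path.basename(wanted))
    with tarfile.open(archive_path) as tar:
        source = tar.extractfile(member_name)  # KeyError if not in the archive
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
    return target


# Demonstration with a one-file archive and a one-line index.
with tempfile.TemporaryDirectory() as work:
    original = os.path.join(work, "report.txt")
    with open(original, "w") as f:
        f.write("hello")
    archive = os.path.join(work, "dumpjob155.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(original, arcname="home/a/report.txt")
    index_dir = os.path.join(work, "index")
    os.mkdir(index_dir)
    with open(os.path.join(index_dir, "index155"), "w") as f:
        f.write("5\t/home/a/report.txt\n")
    found = find_index(index_dir, "/home/a/report.txt")
    missing = find_index(index_dir, "/home/a/absent.txt")
    out_dir = os.path.join(work, "restore")
    os.mkdir(out_dir)
    restored = restore_file(archive, "/home/a/report.txt", out_dir)
    with open(restored) as f:
        restored_text = f.read()
```

Searching the small index files first means that only the one archive containing the requested file has to be opened, rather than every archive from that backup date.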
Claims (7)
1. A concurrent-job backup method for mass data backup, comprising the steps of:
1) selecting multiple computers as backup nodes to form a backup cluster, every backup node having a unified configuration; connecting each disk array to each backup node as a logical volume, and building a backup database on the logical volume;
2) a terminal that needs a backup selecting one backup node as the backup management server and starting, on the backup management server, the backup policy of the object to be backed up; wherein the backup object is defined in directory form, that is, each backup object corresponds to one directory;
3) the backup management server selecting, according to the backup policy, one backup node as the job scheduling server, and checking whether the directory information table and file information table of the backup object exist in the backup database and, if not, creating the directory information table and file information table of the backup object; then obtaining the directory structure of the backup object level by level, and for every directory obtained, inserting one directory record into the directory information table and generating one scan job; the scan job comprising the scanner program name and the target directory name;
4) the backup management server submitting each scan job and its working path to the job scheduling server; the job scheduling server selecting several backup nodes as execution nodes and sending each scan job and its working path to one execution node; each execution node scanning the target directory in the scan job it receives, recording the information of all files in the target directory, and inserting one file record into the file information table for each file scanned;
5) the backup management server selecting the files to be backed up according to the backup policy, the directory information table, and the file information table to generate a file list, and splitting the file list according to a splitting threshold to obtain several sublists;
6) the backup management server generating one copy job from each sublist and sending each copy job to the job scheduling server; the copy job comprising the file copy program name, the file names with paths, and the backup file name with path;
7) the job scheduling server sending different copy jobs to different backup nodes, each backup node copying, according to the copy job it receives, the corresponding files to be backed up to the corresponding location on the logical volume.
2. The method of claim 1, wherein the information in the backup policy includes the read permission of the backup object, the backup form, the directory or medium in which the backup files are stored, the backup node acting as the job scheduling server, the splitting threshold, and the logs and information to be recorded and submitted after the backup ends.
3. The method of claim 2, wherein, when the backup cluster receives a request from a terminal to restore a backup object, the backup cluster first verifies whether the terminal is a terminal authorized in the backup policy; if it is an authorized terminal, the terminal is prompted to enter the directory name or file name to restore, the point in time of the data to restore, and the destination path; the backup file containing the files to be restored is then located from the entered information, and the backup file is copied to the specified path.
4. The method of claim 1, 2, or 3, wherein a time threshold is set for the scan job and the copy job; if the execution time exceeds the time threshold, failure information is returned to the job scheduling server; the job scheduling server selects a backup node to re-execute the failed scan job or copy job; if the number of executions of the same scan job or copy job exceeds a set threshold, execution of that job is stopped and an error message is generated and written to the backup log.
5. The method of claim 1, 2, or 3, wherein the job scheduling server selects the backup node that executes a job according to the execution time of the job and the state of the backup nodes; wherein the job is the scan job or the copy job.
6. The method of claim 1, 2, or 3, wherein the disk arrays are mounted as a distributed file system and provided with a unified name.
7. The method of claim 1, 2, or 3, wherein the disks distributed over multiple disk arrays are virtualized by the distributed file system into one logical storage space, and the logical storage space is mounted as a shared directory on every backup node; each backup node in the backup cluster uses the underlying disk arrays by accessing the shared directory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710301054.5A CN108804253B (en) | 2017-05-02 | 2017-05-02 | Parallel operation backup method for mass data backup |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710301054.5A CN108804253B (en) | 2017-05-02 | 2017-05-02 | Parallel operation backup method for mass data backup |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108804253A true CN108804253A (en) | 2018-11-13 |
CN108804253B CN108804253B (en) | 2021-08-06 |
Family
ID=64053876
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710301054.5A Active CN108804253B (en) | 2017-05-02 | 2017-05-02 | Parallel operation backup method for mass data backup |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108804253B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109522160A (en) * | 2018-11-29 | 2019-03-26 | 上海英方软件股份有限公司 | Compare backup method and system by saving the file information abstract progress file directory |
CN109558215A (en) * | 2018-12-10 | 2019-04-02 | 深圳市木浪云数据有限公司 | Backup method, restoration methods, device and the backup server cluster of virtual machine |
CN109901951A (en) * | 2019-03-05 | 2019-06-18 | 山东浪潮云信息技术有限公司 | A kind of storage system and method for ceph company-data |
CN109976945A (en) * | 2019-02-26 | 2019-07-05 | 深圳市买买提信息科技有限公司 | A kind of method and device of Log backup |
CN110618898A (en) * | 2019-09-11 | 2019-12-27 | 厦门鑫朗软件有限公司 | Method for forced saving file to appointed directory synchronous backup according to process |
CN110688430A (en) * | 2019-08-22 | 2020-01-14 | 阿里巴巴集团控股有限公司 | Method and device for obtaining data bypass and electronic equipment |
CN110795404A (en) * | 2019-10-31 | 2020-02-14 | 京东方科技集团股份有限公司 | Hadoop distributed file system and operation method and repair method thereof |
CN110968463A (en) * | 2019-12-19 | 2020-04-07 | 北京五八信息技术有限公司 | Method and device for determining types of data nodes in group |
CN111159313A (en) * | 2019-12-31 | 2020-05-15 | 广州鼎甲计算机科技有限公司 | Method, system, device and storage medium for database rapid synthesis backup |
CN111339037A (en) * | 2020-02-14 | 2020-06-26 | 西安奥卡云数据科技有限公司 | Efficient parallel replication method for parallel distributed file system |
CN113157645A (en) * | 2021-04-21 | 2021-07-23 | 平安科技(深圳)有限公司 | Cluster data migration method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1567198A (en) * | 2003-06-30 | 2005-01-19 | 联想(北京)有限公司 | Method for mirror backup of cluster platform cross parallel system |
US20150199243A1 (en) * | 2014-01-11 | 2015-07-16 | Research Institute Of Tsinghua University In Shenzhen | Data backup method of distributed file system |
CN105302667A (en) * | 2015-10-12 | 2016-02-03 | 国家计算机网络与信息安全管理中心 | Cluster architecture based high-reliability data backup and recovery method |
US9600487B1 (en) * | 2014-06-30 | 2017-03-21 | EMC IP Holding Company LLC | Self healing and restartable multi-steam data backup |
CN106648967A (en) * | 2016-10-14 | 2017-05-10 | 曙光信息产业(北京)有限公司 | File scanning method and system |
- 2017-05-02: application CN201710301054.5A, patent CN108804253B (en), status Active
Non-Patent Citations (3)
Title |
---|
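JAMES DA SILVA et al.: "The Amanda Network Backup Manager", LISA * |
ZHANG Yuan (张媛): "Application of enterprise-level open-source backup software in a library data center" (企业级开源备份软件在图书馆数据中心的应用), Journal of Library Science (《图书馆学刊》) * |
SHI Jingyan (石京燕) et al.: "Design and implementation of a database-based file system management tool" (基于数据库的文件系统管理工具设计与实现), Computer Engineering (《计算机工程》) * |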
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109522160B (en) * | 2018-11-29 | 2020-05-05 | 上海英方软件股份有限公司 | Method and system for comparing and backing up file directory by saving file information abstract |
CN109522160A (en) * | 2018-11-29 | 2019-03-26 | 上海英方软件股份有限公司 | Method and system for comparing and backing up file directory by saving file information abstract |
CN109558215B (en) * | 2018-12-10 | 2021-09-07 | 深圳市木浪云数据有限公司 | Backup method, recovery method and device of virtual machine and backup server cluster |
CN109558215A (en) * | 2018-12-10 | 2019-04-02 | 深圳市木浪云数据有限公司 | Backup method, recovery method and device of virtual machine and backup server cluster |
CN109976945A (en) * | 2019-02-26 | 2019-07-05 | 深圳市买买提信息科技有限公司 | Method and device for log backup |
CN109901951A (en) * | 2019-03-05 | 2019-06-18 | 山东浪潮云信息技术有限公司 | Storage system and method for Ceph cluster data |
CN110688430A (en) * | 2019-08-22 | 2020-01-14 | 阿里巴巴集团控股有限公司 | Method and device for obtaining data bypass and electronic equipment |
CN110688430B (en) * | 2019-08-22 | 2023-01-10 | 创新先进技术有限公司 | Method and device for obtaining data bypass and electronic equipment |
CN110618898A (en) * | 2019-09-11 | 2019-12-27 | 厦门鑫朗软件有限公司 | Method for forced saving file to appointed directory synchronous backup according to process |
CN110795404A (en) * | 2019-10-31 | 2020-02-14 | 京东方科技集团股份有限公司 | Hadoop distributed file system and operation method and repair method thereof |
CN110795404B (en) * | 2019-10-31 | 2023-04-07 | 京东方科技集团股份有限公司 | Hadoop distributed file system and operation method and repair method thereof |
CN110968463A (en) * | 2019-12-19 | 2020-04-07 | 北京五八信息技术有限公司 | Method and device for determining types of data nodes in group |
CN111159313A (en) * | 2019-12-31 | 2020-05-15 | 广州鼎甲计算机科技有限公司 | Method, system, device and storage medium for database rapid synthesis backup |
CN111339037A (en) * | 2020-02-14 | 2020-06-26 | 西安奥卡云数据科技有限公司 | Efficient parallel replication method for parallel distributed file system |
CN111339037B (en) * | 2020-02-14 | 2023-06-09 | 西安奥卡云数据科技有限公司 | Efficient parallel replication method for parallel distributed file system |
CN113157645A (en) * | 2021-04-21 | 2021-07-23 | 平安科技(深圳)有限公司 | Cluster data migration method, device, equipment and storage medium |
CN113157645B (en) * | 2021-04-21 | 2023-12-19 | 平安科技(深圳)有限公司 | Cluster data migration method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108804253B (en) | 2021-08-06 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108804253A (en) | A kind of concurrent job backup method for mass data backup | |
US20240126656A1 (en) | Computerized Methods and Apparatus for Data Cloning | |
US11886406B2 (en) | Systems and methods for scalable delocalized information governance | |
JP5731000B2 (en) | Method and system for performing individual restore of a database from a differential backup | |
US7933872B2 (en) | Database backup, refresh and cloning system and method | |
US10942814B2 (en) | Method for discovering database backups for a centralized backup system | |
KR101658964B1 (en) | System and method for datacenter workflow automation scenarios using virtual databases | |
US20130339319A1 (en) | System and method for caching hashes for co-located data in a deduplication data store | |
Dwivedi et al. | Analytical review on Hadoop Distributed file system | |
US11809281B2 (en) | Metadata management for scaled and high density backup environments | |
Sinnamohideen et al. | A Transparently-Scalable Metadata Service for the Ursa Minor Storage System | |
US10635635B2 (en) | Metering data in distributed storage environments | |
CN115563073A (en) | Method and device for data processing of distributed metadata and electronic equipment | |
JP2004252957A (en) | Method and device for file replication in distributed file system | |
Tavares et al. | An efficient and reliable scientific workflow system | |
US20240134836A1 (en) | Systems and methods for scalable delocalized information governance | |
Ji et al. | Design and implementation of an island-based file system | |
Wagner | A virtualization approach to scalable enterprise content management | |
CN117495288A (en) | Data asset full life cycle management system and method | |
Padhy et al. | Hadoop File Management System | |
Horninger | The Real MCTS SQL Server 2008 Exam 70-432 Prep Kit: Database Implementation and Maintenance | |
Abu-Libdeh | New applications of data redundancy schemes in cloud and datacenter systems | |
Curtis | GoldenGate | |
Threats | In Cooperation: EuroSys 2012 | |
Honwadkar et al. | An integrated high-performance distributed file system implementation on existing local network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||