CN115033425A - Method for improving success rate of data backup - Google Patents

Method for improving success rate of data backup

Info

Publication number
CN115033425A
Authority
CN
China
Prior art keywords
data
snapshot
backup
backed
volume
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210604310.9A
Other languages
Chinese (zh)
Inventor
苏莉莉
陈世亮
丁涛
朱庭俊
杨梅
魏小进
卓祖金
刘畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Digital Intelligence Technology Co Ltd
Original Assignee
China Telecom Digital Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-05-31
Filing date
2022-05-31
Publication date
2022-09-09
Application filed by China Telecom Digital Intelligence Technology Co Ltd
Priority to CN202210604310.9A
Publication of CN115033425A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for improving the success rate of data backup. Before the backup starts, a snapshot operation is performed on the source data volume to record the data at the snapshot time point. After the backup starts, if a write request occurs on the source data volume, the snapshot undergoes a copy-on-write operation: whether the copy-on-write operation is then carried out is decided according to whether the changed data block has already been backed up and how it relates to the snapshot; the original data that still needs to be backed up is stored in a cache file, and the data in the source data volume is then updated, so that the source data volume can be used normally and written to during the backup. During the backup, the data to be backed up is read through the snapshot, and the data at the snapshot time point is backed up according to whether the data is in the cache file. The invention can reduce the space used by the snapshot and improve the success rate of data backup when the machine load is heavy.

Description

Method for improving data backup success rate
Technical Field
The invention belongs to the technical field of computer applications, and particularly relates to a method for improving the success rate of data backup.
Background
Data backup is the basis of disaster recovery. To prevent data loss caused by system failure and other causes, performing full or incremental backups of the data on a host is indispensable. At present, data backup mainly comprises file-level backup, block-level backup and object backup.
When the data on the original disk changes, the corresponding snapshot device records the change using a redirect-on-write technique. The snapshot device creates a new block, copies the data at the snapshot time point to the new block, then writes the new data to the disk, and updates the data pointer table in the snapshot to point to the new data block. As a result, when the data of the source data volume changes heavily during the backup, the space occupied by the snapshot device keeps growing, and if the machine load is heavy at that moment, the data backup operation may fail because of the space the snapshot device occupies.
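The space growth described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class ConventionalSnapshot, the 4096-byte block size and the in-memory lists are hypothetical and do not represent any real snapshot driver. It shows that a conventional snapshot device holds one preserved block for every distinct block written during the backup, so its footprint grows with the write load.

```python
BLOCK_SIZE = 4096  # bytes per block; an assumed value for illustration


class ConventionalSnapshot:
    """Toy model of the snapshot device described in the Background."""

    def __init__(self, volume):
        self.volume = volume   # live source data volume: a list of blocks
        self.preserved = {}    # block index -> data as of the snapshot time point

    def write(self, index, data):
        # First write to a block since the snapshot: keep the old data in a
        # new block owned by the snapshot device, then update the volume.
        if index not in self.preserved:
            self.preserved[index] = self.volume[index]
        self.volume[index] = data

    def read_snapshot(self, index):
        # Snapshot view: the preserved copy if the block changed, else the live block.
        return self.preserved.get(index, self.volume[index])

    def space_used(self):
        # Preserved blocks are never released while the snapshot exists.
        return len(self.preserved) * BLOCK_SIZE


volume = [bytes([i]) * BLOCK_SIZE for i in range(8)]
snap = ConventionalSnapshot(volume)
for i in range(6):                      # heavy write load during the backup
    snap.write(i, b"\xff" * BLOCK_SIZE)
print(snap.space_used())                # 6 preserved blocks -> 24576 bytes
```

In this model nothing is released until the snapshot itself is deleted, which is the behaviour the invention sets out to avoid.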
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the defects of the prior art, a method for improving the success rate of data backup that can reduce the space used by snapshots and improve the success rate of data backup when the machine load is heavy.
To achieve this technical purpose, the invention adopts the following technical scheme:
A method for improving the success rate of data backup comprises the following steps:
S1, before the backup starts, performing a snapshot operation on the source data volume to record the data at the snapshot time point;
S2, after the backup starts, if a write request occurs on the source data volume, the snapshot undergoes a copy-on-write operation; whether the copy-on-write operation is then carried out is decided according to whether the changed data block has been backed up and its relation to the snapshot; the original data to be backed up is stored in the cache file, and the data in the source data volume is then updated, so that the source data volume is used normally and written to during the backup;
S3, during the backup, reading the data to be backed up through the snapshot, and backing up the data at the snapshot time point according to whether the data is in the cache file.
To further optimize the technical scheme, the following specific measures are also adopted:
The above S1 includes:
S11, before the backup starts, performing a snapshot operation on the source data volume and recording the data information at the backup start time point;
S12, starting the backup, and reading the data at the backup time point by accessing the snapshot device;
S13, when the data of the source data volume changes, the snapshot device performs a copy-on-write operation to process the changed data;
S14, backing up the changed data and storing it in the corresponding storage medium.
The above S2 includes:
S21, when a write request occurs on the source data volume during the backup, the snapshot undergoes a copy-on-write operation;
S22, checking whether the block data subject to the write operation has been backed up; if so, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed; otherwise, S23 is executed;
S23, checking whether the data block subject to the write operation exists in the snapshot; if so, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed; otherwise, S24 is executed;
S24, the snapshot undergoes copy-on-write: the snapshot device stores the original data to be backed up in a cache file, and S25 is then executed;
S25, writing the new data to the source data volume and updating the data in the source data volume.
The above S3 includes:
S31, starting the backup, and reading the data to be backed up through the snapshot;
S32, determining from the offset of the data to be backed up whether that data is stored in the cache file; if so, S33 is executed; otherwise, the data is read directly from the volume;
S33, reading the data from the cache file according to the offset of the data to be backed up;
S34, deleting the node that holds the data to be backed up;
S35, rebalancing after the node change;
S36, storing the data to be backed up in the storage medium.
In S33 above, the data is read from the cache file by binary search according to the offset of the data to be backed up.
In S35 above, the binary tree is rebalanced after the node change.
The invention has the following beneficial effects:
When the machine load is heavy and writes to the source data volume occur frequently during the backup, the space occupied by the snapshot device grows and the backup program fails. To address this problem, the invention abandons the conventional snapshot handling in which the snapshot device allocates new data blocks for the original data, keeps the idea of copy-on-write, and records in a file the original data that has not yet been backed up and is about to be changed, which reduces the space occupied by the snapshot device. After the data stored in the file has been backed up, it is deleted, which reduces the size of the file. The space occupied by the snapshot is thereby reduced, and the success rate of the data backup operation under high machine load is improved.
Drawings
FIG. 1 is a schematic diagram of the method of the present invention;
FIG. 2 is a schematic diagram of a data backup process;
FIG. 3 is a schematic diagram of a copy-on-write operation for processing snapshots;
FIG. 4 is a diagram illustrating a process of reading data.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the method for improving the success rate of data backup of the present invention includes:
S1, before the backup starts, performing a snapshot operation on the source data volume to record the data at the snapshot (backup start) time point. Specifically, as shown in FIG. 2, S1 includes:
S11, before the backup starts, performing a snapshot operation on the source data volume and recording the data information at the backup start time point;
S12, starting the backup, and reading the data at the backup time point by accessing the snapshot device;
S13, when the data of the source data volume changes, the snapshot device performs a copy-on-write operation to process the changed data;
S14, backing up the changed data and storing it in the corresponding storage medium.
S2, after the backup starts, if a write request occurs on the source data volume, the snapshot undergoes a copy-on-write operation; whether the copy-on-write operation is then carried out is decided according to whether the changed data block has been backed up and its relation to the snapshot; the original data to be backed up is stored in the cache file, and the data in the source data volume is then updated, so that the source data volume is used normally and written to during the backup.
In the embodiment, after the backup starts, the data read through the snapshot device is backed up. When a write request occurs on the source data volume, the snapshot undergoes a copy-on-write operation. First, it is checked whether the block data subject to the write operation has been backed up; if it has not, the data block subject to the write operation is checked to determine whether it exists in the snapshot. When the changed data block has not yet been backed up and exists in the snapshot, the original data to be backed up is stored in a cache file, and once it has been stored, the new data is written to the source data volume.
Specifically, as shown in FIG. 3, S2 includes:
S21, when a write request occurs on the source data volume during the backup, the snapshot undergoes a copy-on-write operation;
S22, checking whether the block data subject to the write operation has been backed up; if so, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed, that is, the data in the source data volume is updated; otherwise, S23 is executed;
S23, checking whether the data block subject to the write operation exists in the snapshot, that is, whether the data of the corresponding data block in the snapshot needs to be changed; if not, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed, that is, the data in the source data volume is updated; otherwise, S24 is executed;
S24, when the changed data block has not yet been backed up and exists in the snapshot, the change to the data block triggers copy-on-write for the snapshot; here the snapshot device does not create a new data block to hold the original data to be backed up, but stores that original data in a cache file, which reduces the size of the snapshot device; S25 is then executed;
S25, writing the new data to the source data volume and updating the data in the source data volume, which ensures that normal use of the source data volume is not affected during the backup.
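The write path S21 to S25 can be sketched as follows. This is a minimal Python sketch, not the driver of the invention: the class name WritePath, the sets backed_up and snapshot_blocks, and the dictionary standing in for the cache file are all hypothetical. It follows the detailed description, in which a block is copied only when it has not yet been backed up and exists in the snapshot. The extra guard that skips a block already held in the cache file is an assumption added here so that a second write to the same block does not overwrite the preserved original.

```python
class WritePath:
    """Hypothetical model of the S21-S25 write handling."""

    def __init__(self, volume, snapshot_blocks):
        self.volume = volume                         # live source data volume
        self.snapshot_blocks = set(snapshot_blocks)  # blocks covered by the snapshot
        self.backed_up = set()                       # blocks already in backup storage
        self.cache_file = {}                         # stand-in for the cache file

    def handle_write(self, index, new_data):
        """Apply one write request; return True if original data was cached."""
        needs_copy = (
            index not in self.backed_up           # S22: not yet backed up
            and index in self.snapshot_blocks     # S23: block exists in the snapshot
            and index not in self.cache_file      # assumption: original not cached yet
        )
        if needs_copy:
            # S24: keep the snapshot-time data in the cache file, not in a new
            # block on the snapshot device.
            self.cache_file[index] = self.volume[index]
        self.volume[index] = new_data             # S25: update the source volume
        return needs_copy


path = WritePath([b"A", b"B", b"C", b"D"], snapshot_blocks={0, 1, 2})
path.backed_up.add(0)                       # block 0 has already been backed up
results = [
    path.handle_write(0, b"a"),             # already backed up: no copy
    path.handle_write(1, b"b"),             # pending and in snapshot: copy
    path.handle_write(1, b"bb"),            # original already cached: no copy
    path.handle_write(3, b"d"),             # not in snapshot: no copy
]
print(results, path.cache_file)
```

Only one of the four writes adds an entry to the cache file, which is how the method keeps the extra space small under a heavy write load.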
S3, during the backup, reading the data to be backed up through the snapshot, and backing up the data at the snapshot time point according to whether the data is in the cache file.
In the embodiment, during the backup, it is first determined from the offset of the data being read whether that data is stored in the cache file. If the data to be backed up has undergone a copy-on-write operation, it is read from the cache file according to its offset. The data is stored in the cache file using a balanced binary tree data structure and is read by binary search; once the data has been read, its node is never accessed again, so the node is deleted.
In addition, when a node changes, the binary tree must be rebalanced. If the data to be backed up is not stored in the cache file, it is read directly from the volume.
Specifically, as shown in FIG. 4, S3 includes:
S31, starting the backup, and reading the data to be backed up through the snapshot;
S32, determining from the offset of the data to be backed up whether that data is stored in the cache file, that is, whether it has undergone copy-on-write since the snapshot was taken; if so, S33 is executed; otherwise, the data is read directly from the volume;
S33, reading the data from the cache file by binary search according to the offset of the data to be backed up. The data is stored in the cache file using a balanced binary tree data structure and is read by binary search;
S34, once the data has been read, its node is never accessed again, so the node that holds the data to be backed up is deleted;
S35, rebalancing the binary tree after the node change;
S36, storing the data to be backed up in the storage medium.
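The read path S31 to S36 can be sketched with a small balanced binary tree keyed by offset. The description only specifies "a balanced binary tree"; the AVL variant used here, the function names, and the dictionary standing in for the volume are assumptions made for illustration.

```python
class Node:
    def __init__(self, offset, data):
        self.offset, self.data = offset, data
        self.left = self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))
    return node

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    return update(x)

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    return update(y)

def rebalance(node):
    # S35: restore the AVL height invariant at this node.
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node

def insert(node, offset, data):
    if node is None:
        return Node(offset, data)
    if offset < node.offset:
        node.left = insert(node.left, offset, data)
    elif offset > node.offset:
        node.right = insert(node.right, offset, data)
    # Equal offset: keep the first copy, which is the snapshot-time data.
    return rebalance(node)

def search(node, offset):
    # S33: binary search by offset; returns the node or None.
    while node is not None and node.offset != offset:
        node = node.left if offset < node.offset else node.right
    return node

def delete(node, offset):
    # S34: remove the node for `offset`, rebalancing on the way back up.
    if node is None:
        return None
    if offset < node.offset:
        node.left = delete(node.left, offset)
    elif offset > node.offset:
        node.right = delete(node.right, offset)
    else:
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        successor = node.right
        while successor.left is not None:
            successor = successor.left
        node.offset, node.data = successor.offset, successor.data
        node.right = delete(node.right, successor.offset)
    return rebalance(node)

def read_for_backup(root, volume, offset):
    # S32 to S35: return (snapshot-time data, root of the updated tree).
    node = search(root, offset)
    if node is None:                     # not cached: read the volume directly
        return volume[offset], root
    data = node.data
    return data, delete(root, offset)

volume = {0: b"new0", 4096: b"same", 8192: b"new2"}   # live volume contents
root = insert(None, 0, b"old0")          # originals saved by copy-on-write
root = insert(root, 8192, b"old2")
backup = {}
for offset in sorted(volume):
    backup[offset], root = read_for_backup(root, volume, offset)   # S36
```

After the loop, backup holds the snapshot-time contents of all three offsets and the tree is empty, since every cached entry is deleted as soon as it has been read.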
In this way, a snapshot of the source data volume is taken at the start of the backup; during the backup, the source data volume is used normally and its data is changed by writes; and the data at the snapshot time point is backed up completely.
Example 1
1. Run the backup software backup on the host whose data is to be backed up;
2. Load the backup driver driver.ko;
3. The backup software uses the mount command to mount the storage device \\desktop-ciku69b\SDPSHAREFORDER at mount point 1;
4. The backup software backup obtains the information of the volume /dev/sda1 on the host;
5. The backup software backup creates the snapshot device /dev/snapshotdev1 according to the volume information; (S1)
6. The backup software reads the snapshot device /dev/snapshotdev1 to obtain the backup data; (S2)
7. The backup software stores the read backup data in the storage device through mount point 1; (S4)
8. During the backup, the file testfile on the volume /dev/sda1 is modified; (S5)
9. After the background driver driver.ko detects the block data change, it stores the data to be cached in the file /tmp/cachefile; (S8)
10. When the backup software accesses the snapshot device /dev/snapshotdev1 to read block data, any part of the data that has been cached is read from the file /tmp/cachefile; (S101)
11. After all data has been read and stored in the storage device, the snapshot device /dev/snapshotdev1 is deleted, the cache file /tmp/cachefile is deleted, and the backup software backup exits.
At this point the data backup procedure is complete.
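The sequence of Example 1 can be simulated end to end in a few lines. This is a sketch under stated assumptions rather than the backup software of the example: the function run_backup is hypothetical, a list stands in for the volume /dev/sda1, a dictionary stands in for /tmp/cachefile, every block is assumed to be covered by the snapshot, and blocks are backed up in index order.

```python
def run_backup(volume, writes_during_backup):
    """Back up `volume` block by block while writes arrive.

    `writes_during_backup` maps a step number to (block index, new data);
    the write is applied just before the block at that step is backed up.
    """
    backed_up = set()
    cache = {}        # stand-in for the cache file: block index -> original data
    image = []        # stand-in for the storage device receiving the backup
    peak_cached = 0

    for step in range(len(volume)):
        if step in writes_during_backup:
            index, new_data = writes_during_backup[step]
            if index not in backed_up and index not in cache:
                cache[index] = volume[index]    # preserve snapshot-time data
            volume[index] = new_data            # the volume stays writable
            peak_cached = max(peak_cached, len(cache))
        # Read block `step` through the snapshot view; a cached entry is
        # removed as soon as it has been read.
        image.append(cache.pop(step, volume[step]))
        backed_up.add(step)
    return image, cache, peak_cached


volume = [b"A", b"B", b"C", b"D"]
writes = {1: (3, b"d"), 2: (0, b"a"), 3: (3, b"dd")}
image, cache, peak_cached = run_backup(volume, writes)
print(image, volume, cache, peak_cached)
```

In this run the backup image equals the contents at the snapshot time point even though three writes reached the volume, the cache never holds more than one block, and it is empty when the backup finishes.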
The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment, and all technical solutions falling under the idea of the present invention belong to its protection scope. It should be noted that those skilled in the art may make modifications and refinements within the scope of the invention without departing from its principle.

Claims (6)

1. A method for improving the success rate of data backup, characterized by comprising the following steps:
S1, before the backup starts, performing a snapshot operation on the source data volume to record the data at the snapshot time point;
S2, after the backup starts, if a write request occurs on the source data volume, the snapshot undergoes a copy-on-write operation; whether the copy-on-write operation is then carried out is decided according to whether the changed data block has been backed up and its relation to the snapshot; the original data to be backed up is stored in the cache file, and the data in the source data volume is then updated, so that the source data volume is used normally and written to during the backup;
S3, during the backup, reading the data to be backed up through the snapshot, and backing up the data at the snapshot time point according to whether the data is in the cache file.
2. The method as claimed in claim 1, wherein S1 includes:
S11, before the backup starts, performing a snapshot operation on the source data volume and recording the data information at the backup start time point;
S12, starting the backup, and reading the data at the backup time point by accessing the snapshot device;
S13, when the data of the source data volume changes, the snapshot device performs a copy-on-write operation to process the changed data;
S14, backing up the changed data and storing it in the corresponding storage medium.
3. The method as claimed in claim 1, wherein S2 includes:
S21, when a write request occurs on the source data volume during the backup, the snapshot undergoes a copy-on-write operation;
S22, checking whether the block data subject to the write operation has been backed up; if so, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed; otherwise, S23 is executed;
S23, checking whether the data block subject to the write operation exists in the snapshot; if so, the copy-on-write operation is not executed, the write request is issued directly, and S25 is executed; otherwise, S24 is executed;
S24, the snapshot undergoes copy-on-write: the snapshot device stores the original data to be backed up in a cache file, and S25 is then executed;
S25, writing the new data to the source data volume and updating the data in the source data volume.
4. The method of claim 1, wherein S3 comprises:
S31, starting the backup, and reading the data to be backed up through the snapshot;
S32, determining from the offset of the data to be backed up whether that data is stored in the cache file; if so, S33 is executed; otherwise, the data is read directly from the volume;
S33, reading the data from the cache file according to the offset of the data to be backed up;
S34, deleting the node that holds the data to be backed up;
S35, rebalancing after the node change;
S36, storing the data to be backed up in the storage medium.
5. The method as claimed in claim 4, wherein in S33 the data is read from the cache file by binary search according to the offset of the data to be backed up.
6. The method of claim 4, wherein in S35 the binary tree is rebalanced after the node change.
CN202210604310.9A 2022-05-31 2022-05-31 Method for improving success rate of data backup Pending CN115033425A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210604310.9A CN115033425A (en) 2022-05-31 2022-05-31 Method for improving success rate of data backup

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210604310.9A CN115033425A (en) 2022-05-31 2022-05-31 Method for improving success rate of data backup

Publications (1)

Publication Number Publication Date
CN115033425A true CN115033425A (en) 2022-09-09

Family

ID=83122589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210604310.9A Pending CN115033425A (en) 2022-05-31 2022-05-31 Method for improving success rate of data backup

Country Status (1)

Country Link
CN (1) CN115033425A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115509808A (en) * 2022-09-19 2022-12-23 安徽鼎甲计算机科技有限公司 Data backup method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination