CN112860486A - Method and device for creating data copy - Google Patents


Publication number
CN112860486A
Authority
CN
China
Prior art keywords
data
copy
file
data block
creating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110149931.8A
Other languages
Chinese (zh)
Other versions
CN112860486B (en)
Inventor
陈元强
吴健辉
李文祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Mulangyun Technology Co ltd
Original Assignee
Shenzhen Mulang Cloud Data Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Mulang Cloud Data Co ltd
Priority to CN202110149931.8A
Publication of CN112860486A
Application granted
Publication of CN112860486B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for creating a data copy. The method comprises the following steps: receiving a data copy creation request; in response to the data copy creation request, determining the type of copy mode to be employed for creating the data copy, wherein the type of copy mode comprises at least one of a fast copy mode and an incremental copy mode; and creating the data copy based on the type of copy mode. The invention solves the technical problem of excessive storage space consumption in the prior art and has the beneficial effect of saving storage space.

Description

Method and device for creating data copy
Technical Field
The invention relates to the field of data storage, in particular to a method and a device for creating a data copy.
Background
When a data copy is created with a conventional file system such as XFS, the file system must completely clone all the data of the file and write it out again. When the file is large, this is very slow and requires as much data space as the original file. Such an approach is unacceptable in terms of both time and space.
To solve the problems of creating data copies with a conventional file system, a scheme that creates copies through deduplication storage has also been proposed. The deduplication module of the deduplication storage records the relationship between data fragments and data fingerprints. A file is described by metadata and data blocks. The metadata consists of an inode (file index node) and a bitmap (the set of data block indexes of the current file). When a copy of a file is created, the simplest way is to make a copy of the metadata.
In actual business, for storage that serves as the base of a backup system, the virtual machine disks of many customers are at the TB (terabyte) level or above. When a file reaches 1 TB, its metadata as a whole also occupies a large amount of space. Many upper-layer services rely on the quick copy technology of this base; incremental backup in particular needs to copy a piece of data as a reference. Frequently copying such metadata puts great pressure on hard disk I/O and consumes a great deal of storage space. Such a design is therefore not acceptable as the base of a backup system.
To solve the above problems, a scheme that creates copies through the volume cloning function of the Ceph RBD module has also been proposed. The Ceph RBD module provides cloning based on volume snapshots: it takes a snapshot of the specified volume and then quickly generates a new volume based on that snapshot. However, the original snapshot cannot be deleted. When it has to be deleted, the cloned volume must first be detached from the original volume. Furthermore, none of these operations is completed automatically by the storage.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for creating a data copy, which at least solve the technical problem of excessive storage space consumption in the prior art.
According to an aspect of the embodiments of the present invention, there is provided a method for creating a data copy, including: receiving a data copy creation request; in response to the data copy creation request, determining the type of copy mode to be employed for creating the data copy, wherein the type of copy mode comprises at least one of a fast copy mode and an incremental copy mode; and creating the data copy based on the type of copy mode.
According to another aspect of the embodiments of the present invention, there is also provided a method for creating a data copy, the method including: in the case where an original data copy of an original file exists, creating a data block reference set as the data block reference set of the data copy; pointing the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file, and to the data blocks newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy; and creating the data copy based on the data block reference set.
According to another aspect of the embodiments of the present invention, there is also provided a method for creating a data copy, the method including: in the case where an original data copy of an original file exists, copying the data block reference set in the latest data copy of the original file as the data block reference set of the data copy; pointing the data block reference set to the data block reference set of the original file, and to the data blocks newly added to the file corresponding to the data copy relative to the original file; and creating the data copy based on the data block reference set.
According to another aspect of the embodiments of the present invention, there is also provided a method for backing up data, including: receiving a data backup request; in response to the data backup request, performing a method of creating a data copy as described in any of the above.
According to another aspect of the embodiments of the present invention, there is also provided a data recovery method, including: receiving a data recovery request; and in response to the data recovery request, executing the method for creating a data copy described in any of the above to create a data copy for data recovery.
In the embodiments of the invention, creating data copies in a fast copy mode or an incremental copy mode solves the technical problem of excessive storage space consumption in the prior art and has the beneficial effect of saving data storage space.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1A is a schematic diagram of a file data structure according to an embodiment of the present disclosure;
FIG. 1B is a schematic diagram of writing and reading data based on octopus according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of a method of data backup according to an embodiment of the present disclosure;
FIG. 3 is a flow diagram of a method of data recovery according to an embodiment of the present disclosure;
FIG. 4 is a schematic flow diagram of a method for creating a data copy based on an original file, according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a data structure in a process of creating a data copy based on an original file according to an embodiment of the present disclosure;
FIG. 6 is a schematic flow diagram of a method for creating a data copy in a fast copy mode according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a data structure in the process of creating a data copy in a fast copy mode according to an embodiment of the present disclosure;
FIG. 8 is a schematic flow diagram of a method for creating a data copy in incremental copy mode according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a data structure in the process of creating a data copy in the incremental copy mode according to an embodiment of the present disclosure;
FIG. 10 is a schematic flow chart diagram of a data structure in a process of creating a data copy in an incremental copy mode according to an embodiment of the present disclosure;
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention, and it is apparent that the described embodiments are only partial embodiments of the present invention, but not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment mode 1
When external data is written into a specified directory, it is written into octopus storage via the VFS (Virtual File System), where octopus is a data storage system developed by Mulang Cloud.
In octopus, as shown in FIG. 1A, a file is composed of three parts: an inode (file index node), a bitMap (data block reference set), and dataBlocks (real data blocks). The bitMap is a set of pointers to real data blocks, and a dataBlock is a data block in which data is actually stored.
FIG. 1B is a schematic diagram of writing and reading data based on octopus according to an embodiment of the present disclosure. As shown in FIG. 1B, after receiving a request to read or write data, the file system calls the VFS to read data from or write data to octopus based on that request.
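The three-part file layout described in this embodiment can be sketched as follows. This is a minimal illustration in Python, not the octopus implementation: the class names `Inode` and `BitMap`, the `block_store` dictionary, the block IDs, and the `read_file` helper are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BitMap:
    """Data block reference set: maps a block index to the ID of a real data block."""
    refs: Dict[int, str] = field(default_factory=dict)


@dataclass
class Inode:
    """File index node: file metadata plus a pointer to the file's bitMap."""
    name: str
    size: int
    bitmap: BitMap


# Hypothetical block store standing in for the area that holds real dataBlocks.
block_store: Dict[str, bytes] = {"blk-0": b"hello ", "blk-1": b"world"}

bitmap_a = BitMap(refs={0: "blk-0", 1: "blk-1"})
inode_a = Inode(name="a", size=11, bitmap=bitmap_a)


def read_file(inode: Inode) -> bytes:
    """Follow inode -> bitMap -> dataBlock and join the blocks in index order."""
    refs = inode.bitmap.refs
    return b"".join(block_store[refs[index]] for index in sorted(refs))
```

Reading goes through two levels of indirection, so a copy of the file only has to duplicate or reference the small `Inode` and `BitMap` objects, never the blocks themselves.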
Embodiment mode 2
FIG. 2 is a flow chart of a method of data backup according to an embodiment of the present disclosure. As shown in FIG. 2, the method includes the following steps:
Step S201, receiving a data backup request.
Step S202, judging the backup type.
The backup types include full backup and incremental backup. A full backup backs up the entire defined set of data objects, regardless of whether they have been modified since the last backup. An incremental backup, performed after a full backup or a previous incremental backup, backs up only the files that have been added or modified since that previous backup.
Step S203 is executed if the backup type is full backup, and step S205 is executed if the backup type is incremental backup.
Step S203, full backup.
Step S204, generating a disk file.
After the disk file is generated, the flow ends.
Step S205, incremental backup.
If the backup type is incremental backup, the data is backed up in an incremental manner.
Step S206, creating a disk copy for the incremental backup.
When an incremental backup is performed, the data backed up last time and the incremental data of this backup need to be merged to form a new backup record. A data copy is therefore created based on the last successfully backed up disk. The specific backup program generates a fast copy or an incremental copy as needed.
Step S207, determining the copy mode.
The copy mode includes a fast copy mode and an incremental copy mode. If the copy mode is the fast copy mode, step S208 is executed, and if the copy mode is the incremental copy mode, step S209 is executed.
Step S208, performing incremental backup in the fast copy mode.
Step S209, performing incremental backup in the incremental copy mode.
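The branching in steps S201 to S209 can be traced with a small sketch. The enum and function names below (`BackupType`, `CopyMode`, `plan_backup`) are hypothetical and only mirror the step labels of FIG. 2; no real backup work is performed.

```python
from enum import Enum
from typing import List, Optional


class BackupType(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


class CopyMode(Enum):
    FAST = "fast"
    INCREMENTAL = "incremental"


def plan_backup(backup_type: BackupType,
                copy_mode: Optional[CopyMode] = None) -> List[str]:
    """Return the labels of the steps visited for one backup request."""
    steps = ["S201", "S202"]  # receive the request, judge the backup type
    if backup_type is BackupType.FULL:
        steps += ["S203", "S204"]  # full backup, then generate the disk file
        return steps
    # Incremental backup: create a disk copy, then judge the copy mode.
    steps += ["S205", "S206", "S207"]
    if copy_mode is CopyMode.FAST:
        steps.append("S208")
    elif copy_mode is CopyMode.INCREMENTAL:
        steps.append("S209")
    else:
        raise ValueError("an incremental backup needs a copy mode")
    return steps
```

For example, `plan_backup(BackupType.INCREMENTAL, CopyMode.FAST)` ends at step S208, while a full backup never reaches the copy mode decision.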
Embodiment 3
FIG. 3 is a flow chart of a method of data recovery according to an embodiment of the present disclosure. As shown in FIG. 3, the method includes the following steps:
Step S302, data recovery.
A data recovery request is received.
Step S304, creating a data copy in the NFS server export directory.
For security reasons, during data recovery the NFS service of the backup system exposes only a specified path. Thus only one fast copy needs to be created in the NFS server export directory, which also prevents the backed-up data content from being modified.
Step S305, judging the recovery form.
Step S306 is executed if the recovery form is data write-back, and step S307 is executed if the recovery form is mount migration.
Step S306, data write-back.
A blank disk of the same size as the data copy is created in the user's production storage, and all data in the data copy is copied through NFS to the newly created blank disk. After the data has been written back successfully, the new disk is attached to the virtual machine and the virtual machine is powered on.
Step S307, mount migration.
The virtualization platform accesses the copy data through NFS and then starts the virtual machine. While the virtual machine is running, the user triggers the migration, and the virtualization platform migrates the copy data back to the user's production storage without shutting down the virtual machine.
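The data write-back path of step S306 amounts to allocating a blank target of the same size and copying the data copy into it piece by piece. The sketch below models both disks as in-memory byte buffers; the function names and the chunk size are assumptions, and a real system would read the copy over NFS rather than from memory.

```python
from typing import Iterator, Tuple


def read_chunks(copy_data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs covering the whole data copy."""
    for offset in range(0, len(copy_data), chunk_size):
        yield offset, copy_data[offset:offset + chunk_size]


def write_back(copy_data: bytes, chunk_size: int = 4) -> bytearray:
    """Create a blank disk as large as the data copy and fill it chunk by chunk."""
    disk = bytearray(len(copy_data))  # blank disk of the same size
    for offset, chunk in read_chunks(copy_data, chunk_size):
        disk[offset:offset + len(chunk)] = chunk
    return disk
```

Because the source is only read, the backed-up content stays unmodified, which matches the reason given for recovering from a copy in the export directory.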
Embodiment 4
FIG. 4 is a schematic flow chart diagram of a method for creating a data copy based on an original file according to an embodiment of the disclosure.
The original file a, as shown in (a) of FIG. 5, is composed of inode a, bitMap A, and the dataBlocks associated with bitMap A.
As shown in FIG. 4, the method for creating a data copy b based on the original file a includes the following steps:
Step S401, copying inode a and modifying the file-name-related information to form inode b.
Step S402, creating bitMap B pointing to bitMap A and pointing inode b to bitMap B, thereby forming data copy b. The structure of data copy b is shown in (c) of FIG. 5.
Step S403, creating bitMap AA pointing to bitMap A, and modifying the record in inode a that points to bitMap A so that it points to bitMap AA. The resulting data structure is shown in (b) of FIG. 5. Both bitMap AA and the dataBlocks pointed to by bitMap AA are empty. With this structure, bitMap A is no longer directly referenced by any inode, which facilitates subsequent processing of data copies.
The data copy b created directly on the basis of the original file a is also referred to as the original data copy.
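Steps S401 to S403 can be sketched as pointer manipulation on a small in-memory model. The classes, the `parent` field, and the `resolve` helper are assumptions made for illustration; in particular, the rule that a newer reference set overrides an older one for the same block index is an assumed reading of "newly added data blocks", not something the text states.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BitMap:
    name: str
    refs: Dict[int, str] = field(default_factory=dict)  # blocks added at this level
    parent: Optional["BitMap"] = None                    # previous reference set


@dataclass
class Inode:
    name: str
    bitmap: BitMap


def create_first_copy(inode_a: Inode, copy_name: str) -> Inode:
    base = inode_a.bitmap  # bitMap A
    # S401 + S402: copy the inode and give it an empty bitMap B that points to A.
    inode_b = Inode(copy_name, BitMap(f"bitMap {copy_name.upper()}", parent=base))
    # S403: re-point inode a to an empty bitMap AA that points to A.
    inode_a.bitmap = BitMap(f"bitMap {inode_a.name.upper() * 2}", parent=base)
    return inode_b


def resolve(bitmap: BitMap) -> Dict[int, str]:
    """Flatten a chain of reference sets; newer sets win (assumed semantics)."""
    chain: List[BitMap] = []
    node: Optional[BitMap] = bitmap
    while node is not None:
        chain.append(node)
        node = node.parent
    merged: Dict[int, str] = {}
    for level in reversed(chain):  # oldest first
        merged.update(level.refs)
    return merged


bitmap_a = BitMap("bitMap A", {0: "blk-0", 1: "blk-1"})
inode_a = Inode("a", bitmap_a)
inode_b = create_first_copy(inode_a, "b")
```

After the call, neither inode references `bitmap_a` directly, so later writes to either file land in its own empty reference set and leave the shared base untouched.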
Embodiment 5
Fast copy mode
FIG. 6 is a flowchart illustrating a method for creating a data copy in a fast copy mode according to an embodiment of the present disclosure. In this embodiment, a fast copy mode is employed to create a data copy.
With the fast copy mode, a data copy can be created within seconds. To achieve second-level data copy creation, the amount of file metadata that is copied should be reduced as much as possible. To create a data copy of a file, the file index node (inode) of the file is copied directly and a new incremental data block reference set (bitMap) is generated. This bitMap points directly to the bitMap of the original file. The total amount of data involved is constant, at only tens of KB, so a copy can be created within seconds.
Whether in the fast copy mode or the incremental copy mode, the method for creating a first data copy of an original file is performed according to the steps described in the embodiment in fig. 4, and therefore, the creation of a data copy in the fast copy mode described herein is performed on the basis of the data copy b of the original file.
As shown in FIG. 6, the method for creating a data copy in the fast copy mode comprises the following steps:
Step S601, copying inode b and modifying related information such as the file name to form inode c.
Step S602, creating bitMap C pointing to bitMap B and pointing inode c to bitMap C to form data copy c, as shown in (b) of FIG. 7.
Step S603, creating bitMap BB pointing to bitMap B, and modifying the record in inode b that points to bitMap B so that it points to bitMap BB; bitMap B is then no longer directly referenced by inode b, as shown in (a) of FIG. 7. Such a structure facilitates subsequent copy processing.
Through the above steps, a copy chain is formed, for example, the copy chain of data copy c: bitMap C -> bitMap B -> bitMap A. The copy chain may be shortened as circumstances require.
In the present embodiment, when creating the data copy c, the data copy b is both the original data copy and the latest data copy of the original file. After data copy c is created, data copy c becomes the most recent data copy the next time a new data copy is created.
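The fast copy steps S601 to S603 and the resulting copy chain can be sketched as follows. The classes and the helpers `fast_copy` and `chain_names` are hypothetical; the sketch starts from the state in which data copy b already exists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BitMap:
    name: str
    parent: Optional["BitMap"] = None  # previous reference set in the chain


@dataclass
class Inode:
    name: str
    bitmap: BitMap


def fast_copy(latest: Inode, copy_name: str) -> Inode:
    tail = latest.bitmap  # bitMap B, the set the latest copy currently uses
    # S601 + S602: copy the inode and point it to a new, empty bitMap C -> bitMap B.
    new_inode = Inode(copy_name, BitMap(f"bitMap {copy_name.upper()}", parent=tail))
    # S603: re-point inode b to a new, empty bitMap BB -> bitMap B.
    latest.bitmap = BitMap(f"bitMap {latest.name.upper() * 2}", parent=tail)
    return new_inode


def chain_names(bitmap: BitMap) -> List[str]:
    """List the reference sets from the given one back to the base."""
    names: List[str] = []
    node: Optional[BitMap] = bitmap
    while node is not None:
        names.append(node.name)
        node = node.parent
    return names


bitmap_a = BitMap("bitMap A")
inode_b = Inode("b", BitMap("bitMap B", parent=bitmap_a))
inode_c = fast_copy(inode_b, "c")
```

Only one inode and two empty reference sets are created, independent of file size, which is consistent with the constant, small cost claimed for this mode. Each further fast copy adds one more link to the chain.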
Embodiment 6
Incremental copy mode
Fast copying is quick and takes up little space, but the references among the data become correspondingly complicated; the complexity is positively correlated with the length of the copy chain and the number of times the file has been copied. Such a structure has some impact on read performance. When better read performance is required, incremental copies can be created instead. An incremental copy is created at a moderate speed, its metadata occupies little capacity, and the reference relationships among the metadata are very clear.
Whether in the fast copy mode or the incremental copy mode, the method for creating a first data copy of an original file is performed according to the steps described in the embodiment in fig. 4, and therefore, the creation of a data copy in the incremental copy mode described herein is performed on the basis of the data copy b of the original file.
When an incremental copy is created based on the original file, the principle is no different from that of a fast copy, and the steps are performed as described in the embodiment of FIG. 4. Incremental copies differ from fast copies when a new data copy is created on the basis of an existing data copy. Described here is therefore the process of creating copy file c in the incremental copy mode after data copy b has been created by the method in FIG. 4. As shown in FIG. 8, the method comprises the following steps:
Step S802, copying inode b and modifying the file-name-related information to form inode c.
Step S804, copying bitMap B under the name bitMap C and pointing inode c to bitMap C, thereby forming data copy c.
Data copy file b and data copy file c are then both based directly on bitMap A, and the length of the copy chain is fixed at 2. The evolution of the data structure in this process is shown in FIG. 9.
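Steps S802 and S804 can be sketched in the same style. Here the new reference set is a duplicate of bitMap B rather than a new link behind it, so the chain stays at length 2. The classes and helper names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BitMap:
    name: str
    refs: Dict[int, str] = field(default_factory=dict)  # incremental references
    parent: Optional["BitMap"] = None                    # base reference set


@dataclass
class Inode:
    name: str
    bitmap: BitMap


def incremental_copy(latest: Inode, copy_name: str) -> Inode:
    source = latest.bitmap  # bitMap B
    # S804: duplicate bitMap B as bitMap C; both keep bitMap A as their parent.
    copied = BitMap(f"bitMap {copy_name.upper()}",
                    refs=dict(source.refs), parent=source.parent)
    # S802: the new inode points to the duplicated reference set.
    return Inode(copy_name, copied)


def chain_length(bitmap: BitMap) -> int:
    length = 0
    node: Optional[BitMap] = bitmap
    while node is not None:
        length += 1
        node = node.parent
    return length


bitmap_a = BitMap("bitMap A", {0: "blk-0", 1: "blk-1"})
inode_b = Inode("b", BitMap("bitMap B", {1: "blk-7"}, parent=bitmap_a))
inode_c = incremental_copy(inode_b, "c")
```

The cost of this mode grows with the size of the duplicated incremental reference set, which is the trade-off against the fast copy mode: more data copied per copy, but a fixed lookup depth on reads.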
Embodiment 7
Merging chains of data replicas
In the incremental copy mode, as the incremental data grows, bitMap B itself may become large, so copying bitMap B is not cheap either.
The advantage of creating copies in the fast copy mode is that only the inode is copied and a new incremental reference bitMap is created; the amount of data is only tens of KB, so creating a data copy takes very little time and very little space. However, when copies are created many times, the reference relationships of fast copies can become very complex, and read performance suffers as the copy chain grows longer. Such complex reference relationships are also unsuitable for the remote replication and cloud upload functions of the service.
Therefore, the present disclosure further proposes a technical solution for merging chains of data copies, and the specific data structure refers to fig. 10.
When a fast copy needs to be uploaded to the cloud or replicated to a remote site, or when the length of its snapshot chain reaches a set value, the fast copy can be converted into the incremental copy mode.
When the size of the incremental reference bitMap directly pointed to by an incremental copy exceeds half the size of the base bitMap, merging of the incremental copy into a full copy is triggered, in consideration of the memory usage and read performance impact of the subsequent copy chain. When the merge operation is triggered, the chain of data block reference sets recorded for the data copy is merged into a new bitMap, so that the copy file is converted into a non-data-copy state; that is, a full copy mode is adopted when a new copy is created.
As a further optimization, the sum of the sizes of the incremental reference bitMaps in the copy chain can be calculated directly. If the sum reaches half the size of the base bitMap, the conversion to the incremental copy state is skipped and the copy is converted directly to the non-data-copy state; that is, a full copy mode is adopted when a new copy is created.
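The conversion and merge triggers described in this embodiment can be written as simple predicates. The sketch below makes several assumptions that the text does not fix: bitMap size is measured as the number of references, `MAX_CHAIN_LENGTH` is an arbitrary stand-in for the "set value", newer references override older ones when a chain is merged, and all function names are hypothetical.

```python
from typing import Dict, List

MAX_CHAIN_LENGTH = 8  # hypothetical; the text only says "a set value"


def merge_chain(chain: List[Dict[int, str]]) -> Dict[int, str]:
    """Merge reference sets (oldest first) into one new bitMap; newer sets win."""
    merged: Dict[int, str] = {}
    for refs in chain:
        merged.update(refs)
    return merged


def fast_copy_needs_conversion(chain_length: int,
                               to_cloud: bool = False,
                               to_remote: bool = False) -> bool:
    """A fast copy is converted to an incremental copy in any of these cases."""
    return to_cloud or to_remote or chain_length >= MAX_CHAIN_LENGTH


def incremental_needs_merge(increment: Dict[int, str], base: Dict[int, str]) -> bool:
    """Trigger a merge when the increment exceeds half of the base bitMap."""
    return len(increment) > len(base) / 2


def chain_skips_to_full(increments: List[Dict[int, str]], base: Dict[int, str]) -> bool:
    """Skip the incremental state when the summed increments reach half the base."""
    return sum(len(refs) for refs in increments) >= len(base) / 2


base = {0: "blk-0", 1: "blk-1", 2: "blk-2", 3: "blk-3"}
increment = {1: "blk-7", 2: "blk-8", 3: "blk-9"}
full = merge_chain([base, increment])
```

Once merged, the result no longer depends on any parent reference set, which corresponds to the non-data-copy state from which new copies are created in the full copy mode.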
Embodiment 8
The present embodiment provides a data backup method, which includes: receiving a data backup request; and responding to the data backup request, and executing the method for creating the data copy in any embodiment.
The present embodiment also provides a data recovery method, including: receiving a data recovery request; in response to the data recovery request, the method for creating a data copy in any of the above embodiments is executed to create a data copy for data recovery.
For example, when a backup system performs virtual machine data recovery, two schemes are generally adopted, namely, firstly, a data copy is created, then data is read from the data copy and written into a production environment of a user, and finally a virtual machine is created. Both of these schemes employ the method of creating a copy of the data as described above.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Embodiment 9
This embodiment provides an apparatus for creating a data copy, the apparatus including: a receiving module configured to receive a data copy creation request; a determining module configured to determine, in response to the data copy creation request, the type of copy mode to be employed for creating the data copy, wherein the type of copy mode includes at least one of a fast copy mode and an incremental copy mode; and a creation module configured to create the data copy based on the type of copy mode.
In addition, the device for creating the data copy can also realize all the methods for creating the data copy.
Embodiment 10
The present embodiment provides another apparatus for creating a data copy. For the case where an original data copy of an original file exists, the apparatus includes: a bitMap creation module configured to create a data block reference set as the data block reference set of the data copy; a pointing module configured to point the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy; and a fast copy creation module configured to create the data copy based on the data block reference set.
In addition, the device for creating the data copy can also realize all the methods for creating the data copy in the fast copy mode.
Embodiment 11
This embodiment provides another apparatus for creating a copy, where, when an original data copy exists in an original file, the apparatus includes: the copying module is configured to copy a data block reference set in the latest data copy of the original file as a data block reference set of the data copy; the pointer module is configured to point the data block reference set to a data block reference set of the original file and point to a data block which is newly added to the file corresponding to the data copy relative to the original file; an incremental copy creation module configured to create the data copy based on the set of data block references.
In addition, the device for creating the data copy can also realize all the methods for creating the data copy in the incremental copy mode.
Embodiment 12
This embodiment provides a data backup apparatus, including: a write request receiving module configured to receive a data backup request; and the data backup module is configured to respond to the data backup request and execute any method for creating the data copy.
Embodiment 13
This embodiment provides an apparatus for data recovery, including: a restoration request receiving module configured to receive a data restoration request; and the data recovery module is configured to respond to the data recovery request, execute any method for creating the data copy to create the data copy, and create the data copy to the export directory.
Embodiment 14
This embodiment provides a computer device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement any one of the methods for creating a data copy described above.
Embodiment 15
This embodiment provides a readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements any one of the above methods for creating a data copy.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
Embodiments of the present disclosure also provide the following configurations:
1. A method of creating a data copy, comprising:
receiving a data copy creation request;
in response to the data copy creation request, determining the type of copy mode to be employed for creating the data copy, wherein the type of copy mode comprises at least one of: a fast copy mode and an incremental copy mode;
creating the data copy based on the type of copy mode.
2. The method of item 1, wherein creating the data copy based on the type of copy mode comprises:
in a case where the copy mode is the fast copy mode and the original file has an original data copy, creating a data block reference set for the data copy, and pointing the created data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy, so as to create the data copy;
in a case where the copy mode is the incremental copy mode and the original file has an original data copy, copying the data block reference set in the latest data copy of the original file to serve as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file, so as to create the data copy;
and in a case where the copy mode is the fast copy mode or the incremental copy mode and only the original file exists, with no original data copy of the original file, creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file.
3. The method of item 2, wherein the copy chain comprises a plurality of data block reference sets linked as a chain, the data block reference set at the head of the chain being that of the original file, and each other data block reference set pointing to the previous data block reference set in the chain and also to the data blocks that are newly added relative to the file corresponding to the previous data block reference set.
4. The method of item 2, wherein creating the data copy based on the type of copy mode further comprises:
under the condition that an original data copy exists in the original file, copying a file index node in the latest data copy, and modifying a file name and a file path of the file index node to be used as a file index node of the data copy;
under the condition that the original file does not have an original data copy, copying a file index node in the original file, and modifying a file name and a file path of the file index node to be used as a file index node of the data copy.
5. The method of item 2, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
creating bitMap B, setting bitMap B to point to bitMap A and to the newly added incremental data blocks, and setting inode b to point to bitMap B;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the file corresponding to the data copy.
6. The method of item 5, wherein after inode b is set to point to bitMap B, the method further comprises:
creating bitMap AA, pointing bitMap AA to bitMap A, and setting inode a to point to bitMap AA.
7. The method of item 2, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy comprises:
creating bitMap C, setting bitMap C to point to bitMap B and to the newly added incremental data blocks, and setting inode c to point to bitMap C;
wherein inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
8. The method of item 7, wherein after inode c is set to point to bitMap C, the method further comprises: creating bitMap BB, pointing bitMap BB to bitMap B, and setting inode b to point to bitMap BB.
9. The method of item 2, wherein copying the data block reference set in the latest data copy of the original file as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
copying bitMap B and renaming the copy as bitMap C;
pointing inode c to bitMap C, and pointing bitMap C to bitMap A and to the newly added data blocks;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
10. The method of any of items 1 to 9, wherein, in a case where the copy mode is the fast copy mode, after creating the data copy, the method further comprises:
closing the created data copy;
and when the length of the copy chain of the created data copy is greater than or equal to a first preset value, adopting the incremental copy mode instead of the fast copy mode to create a new data copy.
11. The method of any of items 1 to 9, wherein, in a case where the copy mode is the incremental copy mode, after creating the data copy, the method further comprises:
when the size of the data block reference set in the latest data copy is greater than or equal to a second preset value, merging the data block reference sets in the copy chain into a new data block reference set, and creating a new data copy in a full copy mode.
12. The method of any of items 1 to 9, wherein, in a case where the copy mode is the fast copy mode, after creating the data copy, the method further comprises: calculating the sum of the sizes of all data block reference sets in the copy chain, and if the sum is greater than or equal to a third preset value, creating a new data copy in a full copy mode.
13. The method of item 11 or 12, wherein the second preset value and the third preset value are each half the size of the most basic data block reference set.
14. A method of creating a data copy, wherein the method comprises: in a case where an original file has an original data copy:
creating a data block reference set as the data block reference set of the data copy;
pointing the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy;
creating the data copy based on the set of data block references.
15. The method of item 14, wherein the copy chain comprises a plurality of data block reference sets linked as a chain, the data block reference set at the head of the chain being that of the original file, and each other data block reference set pointing to the previous data block reference set in the chain and also to the data blocks that are newly added relative to the file corresponding to the previous data block reference set.
16. The method of item 14, wherein the method further comprises: and under the condition that the original file does not have an original data copy, creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to a data block of a file corresponding to the data copy, which is newly added relative to the original file.
17. The method of item 14, wherein the method further comprises: and copying a file index node in the latest data copy, and modifying the file name and the file path of the file index node to be used as the file index node of the data copy.
18. The method of item 14, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
creating bitMap B, setting bitMap B to point to bitMap A and to the newly added incremental data blocks, and setting inode b to point to bitMap B;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the file corresponding to the data copy.
19. The method of item 18, wherein after inode b is set to point to bitMap B, the method further comprises:
creating bitMap AA, pointing bitMap AA to bitMap A, and setting inode a to point to bitMap AA.
20. The method of item 14, wherein creating a data block reference set as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy comprises:
creating bitMap C, setting bitMap C to point to bitMap B and to the newly added incremental data blocks, and setting inode c to point to bitMap C;
wherein inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
21. The method of item 20, wherein after inode c is set to point to bitMap C, the method further comprises: creating bitMap BB, pointing bitMap BB to bitMap B, and setting inode b to point to bitMap BB.
22. The method of any of items 14 to 21, wherein after creating the data copy, the method further comprises:
closing the created data copy;
and when the length of the copy chain of the created data copy is greater than or equal to a first preset value, adopting an incremental copy mode instead of the fast copy mode to create a new data copy.
23. The method of any of items 14 to 21, wherein after creating the data copy, the method further comprises: calculating the sum of the sizes of all data block reference sets in the copy chain, and if the sum is greater than or equal to a third preset value, creating a new data copy in a full copy mode.
24. The method of item 23, wherein the third preset value is half the size of the most basic data block reference set.
25. A method of creating a data copy, wherein the method comprises: in a case where an original file has an original data copy:
copying a data block reference set in the latest data copy of an original file as a data block reference set of the data copy;
pointing the data block reference set to the data block reference set of the original file and pointing to a data block which is newly added to the file corresponding to the data copy relative to the original file;
creating the data copy based on the set of data block references.
26. The method of item 25, wherein the method further comprises: and under the condition that the original file does not have an original data copy, creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to a data block of a file corresponding to the data copy, which is newly added relative to the original file.
27. The method of item 25, wherein the method further comprises: and copying a file index node in the latest data copy, and modifying the file name and the file path of the file index node to be used as the file index node of the data copy.
28. The method of item 25, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
creating bitMap B, setting bitMap B to point to bitMap A and to the newly added incremental data blocks, and setting inode b to point to bitMap B;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the file corresponding to the data copy.
29. The method of item 28, wherein after inode b is set to point to bitMap B, the method further comprises: creating bitMap AA, pointing bitMap AA to bitMap A, and setting inode a to point to bitMap AA.
30. The method of item 25, wherein copying the data block reference set in the latest data copy of the original file as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
copying bitMap B and renaming the copy as bitMap C;
pointing inode c to bitMap C, and pointing bitMap C to bitMap A and to the newly added data blocks;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
31. The method of any of items 25 to 30, wherein after creating the data copy, the method further comprises:
when the size of the data block reference set in the latest data copy is greater than or equal to a second preset value, merging the data block reference sets in the copy chain into a new data block reference set, and creating a new data copy in a full copy mode.
32. The method of item 31, wherein the second preset value is half the size of the most basic data block reference set.
33. A method of data backup, comprising:
receiving a data backup request;
in response to the data backup request, the method of creating a data copy as described in any of items 1 to 32 is performed to create a data copy for data backup.
34. A method of data recovery, comprising:
receiving a data recovery request;
in response to the data recovery request, performing the method for creating a data copy according to any one of items 1 to 32 to create a data copy for data recovery.
35. An apparatus for creating a copy of data, comprising:
a receiving module configured to receive a data copy creation request;
a determining module configured to determine, in response to the data copy creation request, a type of copy mode to be employed for creating the data copy, wherein the type of copy mode includes at least one of: a fast copy mode and an incremental copy mode;
a creation module configured to create the copy of the data based on a type of the copy mode.
36. An apparatus for creating a data copy, wherein, in a case where an original file has an original data copy, the apparatus comprises:
a bitmap creation module configured to create a data block reference set as the data block reference set of the data copy;
a pointing module configured to point the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy;
a fast replica creation module configured to create the data replica based on the set of data block references.
37. An apparatus for creating a data copy, wherein, in a case where an original file has an original data copy, the apparatus comprises:
a copying module configured to copy the data block reference set in the latest data copy of the original file as the data block reference set of the data copy;
a pointer module configured to point the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file;
an incremental copy creation module configured to create the data copy based on the set of data block references.
38. An apparatus for data backup, comprising:
a write request receiving module configured to receive a data backup request;
a data backup module configured to perform the method of creating a data copy as described in any of items 1 to 32 to create a data copy for data backup in response to the data backup request.
39. An apparatus for data recovery, comprising:
a recovery request receiving module configured to receive a data recovery request;
a data recovery module configured to, in response to the data recovery request, perform the method of creating a data copy according to any one of items 1 to 32 to create a data copy for data recovery.
40. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor executing the computer program for implementing the method of any of items 1-32.
41. A readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method of any of items 1-32.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
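The fast copy mode and the threshold checks of items 10, 12 and 13 can be illustrated with a short Python sketch. Several points are assumptions made only for this sketch and are not stated in the disclosure: the size of a reference set is taken as its number of block references, the chain length and the size sum count only the copies and exclude the set at the chain head, the full-copy check is applied before the chain-length check, and the names `RefSet`, `fast_copy`, `chain` and `next_mode` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RefSet:
    """Hypothetical data block reference set ("bitMap")."""
    name: str
    parent: Optional["RefSet"] = None
    blocks: Dict[int, str] = field(default_factory=dict)


def fast_copy(tail: RefSet, new_blocks: Dict[int, str], name: str) -> RefSet:
    """Fast copy: a fresh set pointing at the chain tail plus the new blocks."""
    return RefSet(name=name, parent=tail, blocks=dict(new_blocks))


def chain(tail: RefSet) -> List[RefSet]:
    """Return the copy chain ordered from head (original file) to tail."""
    sets: List[RefSet] = []
    node: Optional[RefSet] = tail
    while node is not None:
        sets.append(node)
        node = node.parent
    return sets[::-1]


def next_mode(tail: RefSet, first_preset: int) -> str:
    """Pick the mode for the next copy from the current state of the chain."""
    head, *copies = chain(tail)
    third_preset = len(head.blocks) / 2  # half the size of the most basic set
    if sum(len(s.blocks) for s in copies) >= third_preset:
        return "full"         # accumulated increments are large: full copy
    if len(copies) >= first_preset:
        return "incremental"  # chain is long: stop extending it
    return "fast"
```

With an original file of eight blocks and `first_preset=3`, two single-block fast copies keep the mode at fast, a third moves it to incremental, and a fourth reaches the half-size threshold and moves it to full.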

Claims (10)

1. A method of creating a copy of data, comprising:
receiving a data copy creation request;
in response to the data copy creation request, determining a type of copy mode to be employed for creating the data copy, wherein the type of copy mode comprises at least one of: a fast copy mode and an incremental copy mode;
creating the data copy based on the type of copy mode.
2. The method of claim 1, wherein creating the data copy based on the type of copy mode comprises:
in a case where the copy mode is the fast copy mode and the original file has an original data copy, creating a data block reference set for the data copy, and pointing the created data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy, so as to create the data copy;
in a case where the copy mode is the incremental copy mode and the original file has an original data copy, copying the data block reference set in the latest data copy of the original file to serve as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file, so as to create the data copy;
and in a case where the copy mode is the fast copy mode or the incremental copy mode and only the original file exists, with no original data copy of the original file, creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file.
3. The method of claim 2, wherein the copy chain comprises a plurality of data block reference sets linked as a chain, the data block reference set at the head of the chain being that of the original file, and each other data block reference set pointing to the previous data block reference set in the chain and also to the data blocks that are newly added relative to the file corresponding to the previous data block reference set.
4. The method of claim 2, wherein creating the data copy based on the type of copy mode further comprises:
under the condition that an original data copy exists in the original file, copying a file index node in the latest data copy, and modifying a file name and a file path of the file index node to be used as a file index node of the data copy;
under the condition that the original file does not have an original data copy, copying a file index node in the original file, and modifying a file name and a file path of the file index node to be used as a file index node of the data copy.
5. The method of claim 2, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
creating bitMap B, setting bitMap B to point to bitMap A and to the newly added incremental data blocks, and setting inode b to point to bitMap B;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the file corresponding to the data copy.
6. The method of claim 5, wherein after inode b is set to point to bitMap B, the method further comprises:
creating bitMap AA, pointing bitMap AA to bitMap A, and setting inode a to point to bitMap AA.
7. The method of claim 2, wherein creating a data block reference set for the data copy, and pointing the data block reference set to the data block reference set located at the chain tail of the copy chain of the latest data copy of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the file corresponding to the latest data copy comprises:
creating bitMap C, setting bitMap C to point to bitMap B and to the newly added incremental data blocks, and setting inode c to point to bitMap C;
wherein inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
8. The method of claim 7, wherein after inode c is set to point to bitMap C, the method further comprises: creating bitMap BB, pointing bitMap BB to bitMap B, and setting inode b to point to bitMap BB.
9. The method of claim 2, wherein copying the data block reference set in the latest data copy of the original file as the data block reference set of the data copy, and pointing the data block reference set to the data block reference set of the original file and to the data blocks that are newly added to the file corresponding to the data copy relative to the original file comprises:
copying bitMap B and renaming the copy as bitMap C;
pointing inode c to bitMap C, and pointing bitMap C to bitMap A and to the newly added data blocks;
wherein:
the original file is composed of inode a, bitMap A, and the data blocks associated with bitMap A, where inode a and bitMap A are respectively the file index node and the data block reference set of the original file; and
inode b and bitMap B are respectively the file index node and the data block reference set of the latest data copy, and inode c and bitMap C are respectively the file index node and the data block reference set of the data copy.
10. An apparatus for creating a copy of data, comprising:
a receiving module configured to receive a data copy creation request;
a determining module configured to determine, in response to the data copy creation request, a type of copy mode to be employed for creating the data copy, wherein the type of copy mode includes at least one of: a fast copy mode and an incremental copy mode;
a creation module configured to create the copy of the data based on a type of the copy mode.
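The mode selection of claims 1 and 2, together with the file index node handling of claim 4, can be sketched in Python as below. The types `CopyMode`, `RefSet` and `Inode`, the function `create_copy`, and the sample paths are hypothetical illustrations; the sketch also leaves out the follow-up step of claims 6 and 8 (creating bitMap AA or bitMap BB for the source file).

```python
from dataclasses import dataclass, field, replace
from enum import Enum
from typing import Dict, Optional


class CopyMode(Enum):
    FAST = "fast"
    INCREMENTAL = "incremental"


@dataclass
class RefSet:
    """Hypothetical data block reference set ("bitMap")."""
    parent: Optional["RefSet"] = None
    blocks: Dict[int, str] = field(default_factory=dict)


@dataclass
class Inode:
    """Hypothetical file index node holding a name, a path and a reference set."""
    name: str
    path: str
    refs: RefSet


def create_copy(mode: CopyMode, original: Inode, latest: Optional[Inode],
                name: str, path: str, new_blocks: Dict[int, str]) -> Inode:
    """Create a data copy according to the requested copy mode."""
    if latest is None:
        # Only the original file exists: both modes point at its reference set.
        source = original
        refs = RefSet(parent=original.refs, blocks=dict(new_blocks))
    elif mode is CopyMode.FAST:
        # Fast copy: new set points at the chain tail (the latest copy's set).
        source = latest
        refs = RefSet(parent=latest.refs, blocks=dict(new_blocks))
    else:
        # Incremental copy: duplicate the latest set, point at the original's set.
        source = latest
        refs = RefSet(parent=original.refs,
                      blocks={**latest.refs.blocks, **new_blocks})
    # Copy the source inode, changing only its name, path and reference set.
    return replace(source, name=name, path=path, refs=refs)
```

A fast copy costs only a new, small reference set but lengthens the chain, whereas an incremental copy duplicates the latest set so that the chain stays two levels deep.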
CN202110149931.8A 2021-02-03 2021-02-03 Method and device for creating data copy Active CN112860486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110149931.8A CN112860486B (en) 2021-02-03 2021-02-03 Method and device for creating data copy

Publications (2)

Publication Number Publication Date
CN112860486A (en) 2021-05-28
CN112860486B (en) 2024-01-12

Family

ID=75987676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110149931.8A Active CN112860486B (en) 2021-02-03 2021-02-03 Method and device for creating data copy

Country Status (1)

Country Link
CN (1) CN112860486B (en)

Also Published As

Publication number Publication date
CN112860486B (en) 2024-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20211027
Address after: 518103 514, building 1, hengtaiyu building, Tangwei community, Fenghuang street, Guangming District, Shenzhen, Guangdong
Applicant after: Shenzhen mulangyun Technology Co.,Ltd.
Address before: No.405, Phoenix building, No.15, Keji North 1st Road, Nanshan District, Shenzhen, Guangdong 518052
Applicant before: SHENZHEN MULANG CLOUD DATA Co.,Ltd.
GR01 Patent grant