CN108845895B - Streaming incremental backup method and device based on virtual disk file - Google Patents

Streaming incremental backup method and device based on virtual disk file

Info

Publication number
CN108845895B
Authority
CN
China
Prior art keywords
metadata
virtual disk
disk file
file
offset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810736684.XA
Other languages
Chinese (zh)
Other versions
CN108845895A (en)
Inventor
罗亭
陶杰
钱振宇
马晓峰
许广彬
谭瑞忠
濮天晖
张银滨
郭晓
张欢
刘庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huayun data holding group Co., Ltd
WUXI METRO GROUP Co.,Ltd.
Original Assignee
Wuxi Metro Group Co ltd
Wuxi Huayun Data Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-07-06
Filing date
2018-07-06
Publication date
2022-02-08
Application filed by Wuxi Metro Group Co ltd, Wuxi Huayun Data Technology Service Co Ltd
Priority to CN201810736684.XA
Publication of CN108845895A
Application granted
Publication of CN108845895B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a streaming incremental backup method and device based on a virtual disk file. The metadata of the virtual disk files in an incremental file chain is scanned and the amount of difference data is calculated in advance, so as to determine the offset of the metadata within the complete output virtual disk file stream. A contiguous region is allocated in memory for storing the metadata. While difference data is being output, no metadata is output; only the metadata in memory is updated. After all scanning is complete and all difference data has been output, the metadata in memory is output. The invention uploads the file directly as a stream, without staging it through a file system, and thereby solves the problems of unreliability, high capacity usage, and low performance caused by file-system staging in the prior art.

Description

Streaming incremental backup method and device based on virtual disk file
Technical Field
The invention relates to the technical fields of cloud computing, virtualization, incremental backup, and disaster recovery, and in particular to a streaming incremental backup method and device based on a virtual disk file.
Background
A virtual disk file generally consists of metadata and a data portion. The metadata manages the data in units of blocks, and one metadata unit can generally manage multiple data blocks.
Incremental backup means that, after a full backup or the previous incremental backup, each backup only needs to back up the data that has been added or modified since the previous backup. Current incremental techniques typically output a metadata unit as soon as the first allocated difference data block it manages is identified. When a later data block turns out to be managed by the same metadata unit, that block is output and the metadata unit that was already output is modified. The output target must therefore support random writes, and it is usually a file system.
However, if the output target is a stream (for example, object storage accessed over the HTTP network protocol), it supports only appending writes, not random writes. The random-write requirement cannot be met, so the backup must first be output in full to a file system and only then uploaded as a stream.
The disadvantage of this prior-art scheme is that, because the backup is output in full to a file system before being streamed, the file system becomes a bottleneck for reliability, performance, and capacity. The problem is particularly severe under high concurrency.
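To make the random-write requirement concrete, the following is a minimal Python sketch of the prior-art pattern described above. The entry layout (`ENTRY`), the number of blocks per unit (`PER_UNIT`), the function `prior_art_backup`, and the class `AppendOnlySink` are illustrative assumptions and do not come from any real virtual disk format. The writer emits a metadata unit at the first changed block it manages and later seeks back to patch it, which an append-only target cannot support.

```python
import io
import struct

ENTRY = struct.Struct("<Q")  # assumed: one 8-byte stream offset per data block
PER_UNIT = 2                 # assumed: data blocks managed by one metadata unit

def prior_art_backup(out, changed_blocks):
    """Emit each metadata unit early, then seek back to patch it (needs random write)."""
    unit_pos = {}  # metadata unit index -> position where that unit was written
    for index, payload in changed_blocks:
        unit, slot = divmod(index, PER_UNIT)
        if unit not in unit_pos:
            unit_pos[unit] = out.tell()
            out.write(bytes(ENTRY.size * PER_UNIT))  # placeholder metadata unit
        data_pos = out.tell()
        out.write(payload)
        end = out.tell()
        out.seek(unit_pos[unit] + slot * ENTRY.size)  # jump back into earlier output
        out.write(ENTRY.pack(data_pos))
        out.seek(end)

class AppendOnlySink:
    """Hypothetical stand-in for a stream target such as an object-store upload."""
    def __init__(self):
        self.size = 0

    def write(self, data):
        self.size += len(data)
        return len(data)

    def tell(self):
        return self.size

    def seek(self, pos):
        raise io.UnsupportedOperation("append-only stream cannot seek")
```

With an `io.BytesIO` target the function completes, while the same call against `AppendOnlySink` fails at the first `seek`, which is the situation that forces the file-system detour.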
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a streaming incremental backup method and device based on a virtual disk file, with the aim of solving the problems of unreliability, high capacity usage, and low performance caused by staging through a file system.
The purpose of the invention is achieved by the following technical scheme:
A streaming incremental backup method based on a virtual disk file comprises the following steps:
a determining step: scanning the metadata of the virtual disk files in the incremental file chain to be backed up, calculating the amount of difference data, and determining the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage step: allocating a contiguous region in memory for storing the metadata;
an output step: scanning the metadata in memory; if the metadata indicates that data differs, reading the difference data according to the metadata information and outputting it to the virtual disk file stream; while the difference data is being output, the metadata is not output and only the metadata in memory is updated; after all difference data has been output, finally outputting the metadata in memory to the virtual disk file stream according to the metadata offset.
On the basis of the foregoing embodiment, preferably, after the determining step, the method further comprises:
a header step: outputting the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset.
On the basis of any of the above embodiments, preferably, the storage step further comprises:
initializing the allocated contiguous region to zero.
On the basis of any of the foregoing embodiments, preferably, in the determining step, the metadata offset is such that the metadata is located at the end of the finally output virtual disk file stream.
Alternatively, preferably, in the determining step, each piece of metadata corresponds to a plurality of data blocks, and the metadata offset is such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it.
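The two placement options for the metadata offset can be illustrated with a small calculation. This is a sketch under stated assumptions: the function name `metadata_offsets`, the layout labels `"tail"` and `"per_unit"`, fixed-size blocks and metadata units, and the choice to emit a metadata unit even when none of its blocks changed are all hypothetical and are not specified by the text.

```python
def metadata_offsets(header_size, changed_per_unit, block_size, unit_size, layout):
    """Return the stream offset of every metadata unit for the given layout.

    changed_per_unit[i] is the number of difference blocks managed by unit i.
    """
    if layout == "tail":
        # All metadata units sit together after all difference data.
        base = header_size + sum(changed_per_unit) * block_size
        return [base + i * unit_size for i in range(len(changed_per_unit))]
    if layout == "per_unit":
        # Each metadata unit directly follows the data blocks it manages.
        offsets, pos = [], header_size
        for count in changed_per_unit:
            pos += count * block_size
            offsets.append(pos)
            pos += unit_size
        return offsets
    raise ValueError(f"unknown layout: {layout}")
```

In both cases every offset depends only on the amount of difference data counted beforehand, which is why the scan has to precede any output.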
A streaming incremental backup device based on a virtual disk file comprises:
a determining module, configured to scan the metadata of the virtual disk files in the incremental file chain to be backed up, calculate the amount of difference data, and determine the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage module, configured to allocate a contiguous region in memory for storing the metadata;
an output module, configured to scan the metadata in memory; if the metadata indicates that data differs, read the difference data according to the metadata information and output it to the virtual disk file stream; while the difference data is being output, not output the metadata and only update the metadata in memory; and, after all difference data has been output, finally output the metadata in memory to the virtual disk file stream according to the metadata offset.
On the basis of the above embodiment, preferably, the device further comprises:
a header module, configured to output the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset.
On the basis of any of the above embodiments, preferably, the storage module is further configured to:
initialize the allocated contiguous region to zero.
On the basis of any of the above embodiments, preferably, the metadata offset is such that the metadata is located at the end of the finally output virtual disk file stream.
Alternatively, preferably, each piece of metadata corresponds to a plurality of data blocks, and the metadata offset is such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it.
Compared with the prior art, the invention has the following beneficial effects:
The invention discloses a streaming incremental backup method and device based on a virtual disk file. The metadata of the virtual disk files in an incremental file chain is scanned and the amount of difference data is calculated in advance, so as to determine the offset of the metadata within the complete output virtual disk file stream. A contiguous region is allocated in memory for storing the metadata. While difference data is being output, no metadata is output; only the metadata in memory is updated. After all scanning is complete and all difference data has been output, the metadata in memory is output. The invention uploads the file directly as a stream, without staging it through a file system, and thereby solves the problems of unreliability, high capacity usage, and low performance caused by file-system staging in the prior art.
Drawings
The invention is further illustrated with reference to the following figures and examples.
Fig. 1 is a schematic flowchart of a streaming incremental backup method based on a virtual disk file according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an application scenario of a streaming incremental backup method based on a virtual disk file according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a streaming incremental backup device based on a virtual disk file according to an embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments or technical features described below may be combined in any way to form new embodiments.
Embodiment 1
The embodiment of the invention provides a streaming incremental backup method based on a virtual disk file, which comprises the following steps:
a determining step: scanning the metadata of the virtual disk files in the incremental file chain to be backed up, calculating the amount of difference data, and determining the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage step: allocating a contiguous region in memory for storing the metadata;
an output step: scanning the metadata in memory; if the metadata indicates that data differs, reading the difference data according to the metadata information and outputting it to the virtual disk file stream; while the difference data is being output, the metadata is not output and only the metadata in memory is updated; after all difference data has been output, finally outputting the metadata in memory to the virtual disk file stream according to the metadata offset.
The embodiment of the invention first scans the metadata of the virtual disk files in the incremental file chain and calculates the amount of difference data in advance, so as to determine the offset of the metadata within the complete output virtual disk file stream. A contiguous region is allocated in memory for storing the metadata. While difference data is being output, no metadata is output; only the metadata in memory is updated. After all scanning is complete and all difference data has been output, the metadata in memory is output. The embodiment uploads the file directly as a stream, without staging it through a file system, and thereby solves the problems of unreliability, high capacity usage, and low performance caused by file-system staging in the prior art.
Preferably, after the determining step, the method may further include a header step: outputting the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset. The advantage is that the metadata offset is output in advance, together with the file header.
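As a minimal illustration of a header that carries the metadata offset, the Python sketch below packs and parses a fixed-size header. The field layout (`HEADER`), the magic value `MAGIC`, and the function names are hypothetical; they are not the header of Qcow2 or of any other real format.

```python
import struct

# Assumed layout: 4-byte magic, 4-byte version, 8-byte virtual size,
# 8-byte metadata offset, all big-endian.
HEADER = struct.Struct(">4sIQQ")
MAGIC = b"SINC"  # hypothetical magic value

def build_header(virtual_size, metadata_offset):
    """Serialize the header that is written before any difference data."""
    return HEADER.pack(MAGIC, 1, virtual_size, metadata_offset)

def parse_header(raw):
    """Recover the virtual size and metadata offset from the start of a stream."""
    magic, _version, virtual_size, metadata_offset = HEADER.unpack_from(raw)
    if magic != MAGIC:
        raise ValueError("not a stream produced by build_header")
    return virtual_size, metadata_offset
```

Because the offset is already in the header, a reader of the finished stream can locate the metadata without the writer ever going back to patch the header.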
Preferably, the storage step may further include initializing the allocated contiguous region to zero. The advantage is that this prevents other data from contaminating the incremental virtual disk file data.
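A sketch of the allocation, assuming the region holds one fixed-size entry per data block of the virtual disk (the description notes that the space size depends on the virtual size of the virtual disk file). The 8-byte entry size, the 64 KiB cluster size, and the function name are assumptions chosen for illustration.

```python
ENTRY_SIZE = 8            # assumed bytes per metadata entry
CLUSTER_SIZE = 64 * 1024  # assumed data block (cluster) size

def allocate_metadata_region(virtual_size, cluster_size=CLUSTER_SIZE):
    """Allocate one zeroed metadata entry per cluster of the virtual disk."""
    clusters = -(-virtual_size // cluster_size)  # ceiling division
    return bytearray(clusters * ENTRY_SIZE)      # bytearray(n) is zero-filled
```

Under these assumptions a 1 GiB virtual disk needs a 128 KiB region, and an entry that is never updated stays zero, which can serve as the marker for an unchanged block.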
The embodiment of the present invention does not limit the metadata offset. Preferably, in the determining step, the metadata offset may be such that the metadata is located at the end of the finally output virtual disk file stream.
Alternatively, each piece of metadata corresponds to a plurality of data blocks, and the metadata offset may be such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it. The advantage is that each piece of output metadata follows its corresponding data blocks in the virtual disk file stream.
As shown in Fig. 1, a preferred case of the embodiment of the present invention may be:
given a group of incremental virtual disk files with a chain relationship, determining the head and tail of the incremental file chain, namely virtual disk file increment 1 and virtual disk file increment N;
scanning the metadata of the virtual disk files in the incremental file chain and calculating the amount of incremental data, so as to determine the offset of the metadata within the complete output virtual disk file stream;
outputting to the stream the header of the incremental virtual disk file that is finally to be exported, the header containing the metadata offset;
pre-allocating a contiguous space in memory (its size depends on the virtual size of the virtual disk file) and initializing it to zero;
scanning the metadata from virtual disk file increment 1 through virtual disk file increment N (including the incremental files between them); if the metadata indicates that data differs, reading the difference data according to the metadata information and outputting it to the stream, while updating the corresponding metadata region in memory, until all scanning is finished;
outputting the metadata in memory to the stream, forming a complete incremental virtual disk file stream.
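The steps above can be sketched end to end in Python. This is a toy model under stated assumptions, not the patented implementation: the header layout, the 4-byte block size, the class `AppendOnlySink`, and the convention that a zero entry means "unchanged" are all hypothetical, and the difference blocks are assumed to be already collected in a dictionary.

```python
import struct

HEADER = struct.Struct("<4sQQ")  # assumed: magic, block count, metadata offset
ENTRY = struct.Struct("<Q")      # assumed: stream offset of a block, 0 = unchanged
BLOCK = 4                        # toy block size in bytes

class AppendOnlySink:
    """Hypothetical stream target that offers nothing but append."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))

def stream_incremental(changed, n_blocks, sink):
    """changed maps block index -> BLOCK bytes of difference data."""
    # Determining step: the amount of difference data fixes the metadata offset.
    metadata_offset = HEADER.size + len(changed) * BLOCK
    # Header step: the offset is known before any data is sent.
    sink.write(HEADER.pack(b"SINC", n_blocks, metadata_offset))
    # Storage step: contiguous, zero-initialized metadata region.
    table = bytearray(n_blocks * ENTRY.size)
    # Output step: append data and update the metadata in memory only.
    pos = HEADER.size
    for index in sorted(changed):
        sink.write(changed[index])
        ENTRY.pack_into(table, index * ENTRY.size, pos)
        pos += BLOCK
    assert pos == metadata_offset
    sink.write(table)  # metadata goes out last, at the precomputed offset

def read_block(stream, index):
    """Look up one block in the finished stream; None means unchanged."""
    _, _n_blocks, metadata_offset = HEADER.unpack_from(stream)
    (pos,) = ENTRY.unpack_from(stream, metadata_offset + index * ENTRY.size)
    return stream[pos:pos + BLOCK] if pos else None
```

The sink is only ever appended to, yet the finished stream is self-describing: `read_block` finds the metadata through the offset stored in the header and then follows the entry to the data.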
As shown in Fig. 2, one application scenario of the present invention may be as follows: on a cloud platform, virtual machine 0 has a local disk D that corresponds to a Qcow2 Active file on a computing node (Qcow2 is a virtual disk file format), and the chain of this virtual disk file contains a plurality of incremental files and the last backup point (a full file).
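For a chain like the one in this scenario, the difference data can be pictured as the union of the blocks allocated in each incremental file. The sketch below assumes that, when several increments allocate the same block, the newest one in the chain wins, as is usual for overlay chains; the text does not state this rule explicitly. The function names and the dictionary representation of an increment are hypothetical.

```python
def merge_chain(increments):
    """Collapse a chain of increments (oldest first) into one set of difference blocks.

    Each increment maps block index -> block bytes for the blocks it allocated.
    """
    merged = {}
    for increment in increments:  # increment 1 ... increment N
        merged.update(increment)  # assumed rule: a newer increment overrides an older one
    return merged

def difference_size(merged, block_size):
    """Amount of difference data the stream will carry, in bytes."""
    return len(merged) * block_size
```

Counting the merged blocks before writing anything is what makes the metadata offset computable ahead of the output.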
The foregoing embodiment provides a streaming incremental backup method based on a virtual disk file; correspondingly, the present application further provides a streaming incremental backup device based on a virtual disk file. Since the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment. The device embodiment described below is merely illustrative.
Embodiment 2
As shown in Fig. 3, an embodiment of the present invention provides a streaming incremental backup device based on a virtual disk file, including:
a determining module 201, configured to scan the metadata of the virtual disk files in the incremental file chain to be backed up, calculate the amount of difference data, and determine the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage module 202, configured to allocate a contiguous region in memory for storing the metadata;
an output module 203, configured to scan the metadata in memory; if the metadata indicates that data differs, read the difference data according to the metadata information and output it to the virtual disk file stream; while the difference data is being output, not output the metadata and only update the metadata in memory; and, after all difference data has been output, finally output the metadata in memory to the virtual disk file stream according to the metadata offset.
The embodiment of the invention first scans the metadata of the virtual disk files in the incremental file chain and calculates the amount of difference data in advance, so as to determine the offset of the metadata within the complete output virtual disk file stream. A contiguous region is allocated in memory for storing the metadata. While difference data is being output, no metadata is output; only the metadata in memory is updated. After all scanning is complete and all difference data has been output, the metadata in memory is output. The embodiment uploads the file directly as a stream, without staging it through a file system, and thereby solves the problems of unreliability, high capacity usage, and low performance caused by file-system staging in the prior art.
Preferably, the embodiment of the present invention may further include a header module, configured to output the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset.
Preferably, the storage module 202 is further configured to initialize the allocated contiguous region to zero.
The embodiment of the present invention does not limit the metadata offset. Preferably, the metadata offset is such that the metadata is located at the end of the finally output virtual disk file stream.
Alternatively, preferably, each piece of metadata corresponds to a plurality of data blocks, and the metadata offset is such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it.
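One way to picture the division of work among the modules is the Python sketch below, which uses the second placement option (each metadata unit follows the data blocks it manages). The class names mirror the module names in the text, but every constant, method signature, and the 8-byte `MAGIC` stand-in for a header is a hypothetical choice made for illustration.

```python
BLOCK = 4            # toy data block size in bytes
ENTRY = 8            # assumed bytes per metadata entry
PER_UNIT = 2         # assumed data blocks managed by one metadata unit
MAGIC = b"SINCDEMO"  # hypothetical 8-byte header; keeps offset 0 free to mean "unchanged"

def blocks_of(changed, unit):
    """Sorted indices of the changed blocks managed by one metadata unit."""
    return sorted(i for i in changed if i // PER_UNIT == unit)

class DeterminingModule:
    def offsets(self, changed, n_units):
        """Stream offset of each metadata unit, placed right after its data blocks."""
        result, pos = [], len(MAGIC)
        for unit in range(n_units):
            pos += len(blocks_of(changed, unit)) * BLOCK
            result.append(pos)
            pos += PER_UNIT * ENTRY
        return result

class StorageModule:
    def allocate(self, n_units):
        return bytearray(n_units * PER_UNIT * ENTRY)  # contiguous and zeroed

class OutputModule:
    def run(self, changed, n_units, table, offsets, sink):
        sink.append(MAGIC)
        pos = len(MAGIC)
        for unit in range(n_units):
            for index in blocks_of(changed, unit):
                sink.append(changed[index])  # data first, metadata stays in memory
                table[index * ENTRY:(index + 1) * ENTRY] = pos.to_bytes(ENTRY, "little")
                pos += BLOCK
            assert pos == offsets[unit]      # metadata lands at the precomputed offset
            start = unit * PER_UNIT * ENTRY
            sink.append(bytes(table[start:start + PER_UNIT * ENTRY]))
            pos += PER_UNIT * ENTRY

def backup(changed, n_units):
    sink = []  # list.append stands in for an append-only stream
    offsets = DeterminingModule().offsets(changed, n_units)
    table = StorageModule().allocate(n_units)
    OutputModule().run(changed, n_units, table, offsets, sink)
    return b"".join(sink), offsets
```

The assertion inside `OutputModule.run` checks the property the design relies on: the position reached after appending a unit's data blocks equals the offset the determining module computed before any output began.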
The present invention has been described in terms of its practical application, and it is to be understood that the above description and drawings are only illustrative of the presently preferred embodiments of the invention and are not to be considered as limiting, since all changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. Although the present invention has been described to a certain extent, it is apparent that appropriate changes in the respective conditions may be made without departing from the spirit and scope of the present invention. It is to be understood that the invention is not limited to the described embodiments, but is to be accorded the scope consistent with the claims, including equivalents of each element described. Various other modifications and changes may be made by those skilled in the art based on the above-described technical solutions and concepts, and all such modifications and changes should fall within the scope of the claims of the present invention.

Claims (10)

1. A streaming incremental backup method based on a virtual disk file, characterized by comprising the following steps:
a determining step: scanning the metadata of the virtual disk files in the incremental file chain to be backed up, calculating the amount of difference data, and determining the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage step: allocating a contiguous region in memory for storing the metadata;
an output step: scanning the metadata in memory; if the metadata indicates that data differs, reading the difference data according to the metadata information and outputting it to the virtual disk file stream; while the difference data is being output, the metadata is not output and only the metadata in memory is updated; after all difference data has been output, finally outputting the metadata in memory to the virtual disk file stream according to the metadata offset.
2. The streaming incremental backup method based on a virtual disk file according to claim 1, wherein after the determining step, the method further comprises:
a header step: outputting the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset.
3. The streaming incremental backup method based on a virtual disk file according to claim 1 or 2, wherein the storage step further comprises:
initializing the allocated contiguous region to zero.
4. The streaming incremental backup method based on a virtual disk file according to claim 1 or 2, wherein in the determining step, the metadata offset is such that the metadata is located at the end of the finally output virtual disk file stream.
5. The streaming incremental backup method based on a virtual disk file according to claim 1 or 2, wherein in the determining step, each piece of metadata corresponds to a plurality of data blocks, and the metadata offset is such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it.
6. A streaming incremental backup device based on a virtual disk file, characterized by comprising:
a determining module, configured to scan the metadata of the virtual disk files in the incremental file chain to be backed up, calculate the amount of difference data, and determine the metadata offset according to the amount of difference data; the incremental file chain consists of incremental virtual disk files having a chain relationship; the metadata offset comprises the offset of each piece of metadata in the finally output virtual disk file stream;
a storage module, configured to allocate a contiguous region in memory for storing the metadata;
an output module, configured to scan the metadata in memory; if the metadata indicates that data differs, read the difference data according to the metadata information and output it to the virtual disk file stream; while the difference data is being output, not output the metadata and only update the metadata in memory; and, after all difference data has been output, finally output the metadata in memory to the virtual disk file stream according to the metadata offset.
7. The streaming incremental backup device based on a virtual disk file according to claim 6, further comprising:
a header module, configured to output the header of the incremental file chain to the virtual disk file stream, the output header containing the metadata offset.
8. The streaming incremental backup device based on a virtual disk file according to claim 6 or 7, wherein the storage module is further configured to:
initialize the allocated contiguous region to zero.
9. The streaming incremental backup device based on a virtual disk file according to claim 6 or 7, wherein the metadata offset is such that the metadata is located at the end of the finally output virtual disk file stream.
10. The streaming incremental backup device based on a virtual disk file according to claim 6 or 7, wherein each piece of metadata corresponds to a plurality of data blocks, and the metadata offset is such that each piece of metadata is located, in the finally output virtual disk file stream, after the plurality of data blocks corresponding to it.
CN201810736684.XA 2018-07-06 2018-07-06 Streaming incremental backup method and device based on virtual disk file Active CN108845895B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810736684.XA CN108845895B (en) 2018-07-06 2018-07-06 Streaming incremental backup method and device based on virtual disk file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810736684.XA CN108845895B (en) 2018-07-06 2018-07-06 Streaming incremental backup method and device based on virtual disk file

Publications (2)

Publication Number Publication Date
CN108845895A 2018-11-20
CN108845895B 2022-02-08

Family

ID=64201255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810736684.XA Active CN108845895B (en) 2018-07-06 2018-07-06 Streaming incremental backup method and device based on virtual disk file

Country Status (1)

Country Link
CN (1) CN108845895B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110351386B (en) * 2019-07-23 2022-09-16 华云工业互联网有限公司 Increment synchronization method and device between different copies

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609419A (en) * 2009-06-29 2009-12-23 北京航空航天大学 The data back up method and the device of the migration of virtual machine consistently online
CN102609365A (en) * 2012-02-15 2012-07-25 合一网络技术(北京)有限公司 Virtual disk system and file storage method based on virtual disk system
CN104899071A (en) * 2015-04-29 2015-09-09 深圳市深信服电子科技有限公司 Recovery method and recovery system of virtual machine in cluster
CN106844095A (en) * 2016-12-27 2017-06-13 上海爱数信息技术股份有限公司 File backup method, system and the client with the system
CN107092538A (en) * 2017-03-14 2017-08-25 平安科技(深圳)有限公司 Virtual-machine data backup method and system
US9946603B1 (en) * 2015-04-14 2018-04-17 EMC IP Holding Company LLC Mountable container for incremental file backups

Also Published As

Publication number Publication date
CN108845895A (en) 2018-11-20

Similar Documents

Publication Publication Date Title
CN109543455B (en) Data archiving method and device for block chain
CN109508246A (en) Log recording method, system and computer readable storage medium
CN111444196B (en) Method, device and equipment for generating Hash of global state in block chain type account book
SG11201901608VA (en) Method for accessing distributed storage system, related apparatus, and related system
CN108255989B (en) Picture storage method and device, terminal equipment and computer storage medium
CN111309245B (en) Hierarchical storage writing method and device, reading method and device and system
CN112748877A (en) File integration uploading method and device and file downloading method and device
CN103473258A (en) Cloud storage file system
CN113495889A (en) Distributed object storage method and device, electronic equipment and storage medium
CN108845895B (en) Streaming incremental backup method and device based on virtual disk file
JP2016066285A (en) Storage system, control method for storage system, and virtual tape device control program
CN110381128B (en) Uploading method and cloud storage model suitable for streaming media file
CN108230487A (en) The method and apparatus of shared camera resource
CN113448946B (en) Data migration method and device and electronic equipment
CN107122140A (en) A kind of file intelligent storage method based on metadata information
CN102651674B (en) Data transmission method of reflective memory network
CN113127438B (en) Method, apparatus, server and medium for storing data
CN111435323B (en) Information transmission method, device, terminal, server and storage medium
CN114428764B (en) File writing method, system, electronic device and readable storage medium
CN111400056A (en) Message queue-based message transmission method, device and equipment
EP3731491A1 (en) Method and device for downloading resources
CN111552575A (en) Message queue-based message consumption method, device and equipment
CN113360095B (en) Hard disk data management method, device, equipment and medium
CN111625502B (en) Data reading method and device, storage medium and electronic device
CN112148220B (en) Method, device, computer storage medium and terminal for realizing data processing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 214000 Qingyang Road 228, Wuxi City, Jiangsu Province

Patentee after: WUXI METRO GROUP Co.,Ltd.

Patentee after: Huayun data holding group Co., Ltd

Address before: 214000 Qingyang Road 228, Wuxi City, Jiangsu Province

Patentee before: WUXI METRO GROUP Co.,Ltd.

Patentee before: Wuxi Huayun Data Technology Service Co., Ltd