CN116204356A - Data synthesis method, device, equipment and storage medium based on index redirection - Google Patents

Info

Publication number
CN116204356A
Authority
CN
China
Prior art keywords
data
backup data
full
backup
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310078328.4A
Other languages
Chinese (zh)
Inventor
朱箫鸣
冀国威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310078328.4A priority Critical patent/CN116204356A/en
Publication of CN116204356A publication Critical patent/CN116204356A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1453 Management of the data involved in backup or backup restore using de-duplication of the data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/244 Grouping and aggregation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data synthesis method, device, equipment and storage medium based on index redirection. The method comprises the following steps: performing a full backup of the currently stored data to obtain initial full backup data; performing an incremental backup of the updated stored data to obtain incremental backup data; identifying unchanged data in the initial full backup data according to bitmap information of the changed data, and integrating the unchanged data with the acquired incremental backup data to form new full backup data; and, when the stored data is updated again, acquiring the corresponding incremental backup data and bitmap information of the changed data, identifying unchanged data in the most recently generated full backup data according to that bitmap information, and integrating the unchanged data with the acquired incremental backup data to form new full backup data again. The method makes full use of previously unchanged data, ensures data security and saves storage space.

Description

Data synthesis method, device, equipment and storage medium based on index redirection
Technical Field
The present disclosure relates to the field of cloud computing technologies, and in particular, to a data synthesis method, apparatus, computer device, and storage medium based on index redirection.
Background
With the rise of cloud computing, more and more data belonging to governments, enterprises and institutions is migrating to the cloud. However, because ransomware and human operating errors are unavoidable, more and more customers are paying attention to data security, and data backup has become the last line of defense for data security.
To ensure that data can be rolled back, current data backup schemes generally store data copies in a full -> incremental -> full pattern. However, because the data has changed, part of the data must be copied or recalculated during the second full backup, so a large number of data blocks have to be moved on disk and a large amount of temporary storage space is occupied.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data synthesis method, apparatus, computer device and storage medium based on index redirection that can reduce disk movement of data blocks and occupation of temporary space during data synthesis backup by an index redirection technique according to … ….
In one aspect, a method for synthesizing data based on index redirection is provided, the method comprising:
performing a full backup of the currently stored data to obtain initial full backup data;
performing an incremental backup of the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data in the initial full backup data according to the bitmap information, and integrating the unchanged data with the acquired incremental backup data to form new full backup data;
when the stored data is updated again, acquiring the corresponding incremental backup data and bitmap information of the changed data, identifying unchanged data in the most recently generated full backup data according to the bitmap information, and integrating that unchanged data with the acquired incremental backup data to form new full backup data again.
In one embodiment, the method further comprises:
acquiring an initial directory index at the same time as the initial full backup data is acquired;
after the new full backup data is formed, updating the initial directory index according to the new full backup data to form a new directory index;
after new full backup data is formed again, updating the most recently generated directory index to form a new directory index.
In one embodiment, forming an initial or new directory index includes:
setting a plurality of consecutive data blocks as one interval block device;
performing disk space management on the interval block devices through an interval tree;
and setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
In one embodiment, the step of updating the initial directory index according to the new full backup data to form a new directory index includes:
dividing the incremental backup data in the new full backup data according to interval block devices;
inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes of the directory index, and adding the associated upper-layer nodes;
retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the incremental backup data in the new full backup data, and deleting the upper-layer nodes that are unrelated to the new full backup data;
adding or modifying the associated nodes layer by layer up to the topmost root, so as to generate a new root node;
and pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and at the same time taking a snapshot of the newly added node information.
In one embodiment, the step of identifying unchanged data in the initial full backup data according to bitmap information of the changed data includes:
identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
and removing the deleted data from the initial full backup data to form the unchanged data.
In one embodiment, the step of integrating the unchanged data with the acquired incremental backup data to form new full backup data includes:
identifying, in the initial full backup data, the data that is unchanged relative to the incremental backup data according to the bitmap information of the changed data;
integrating the data in the initial full backup data that is unchanged relative to the incremental backup data with the incremental backup data to form changed data;
and performing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
In one embodiment, each time new full backup data is formed, the method further comprises:
deleting the incremental backup data.
In another aspect, there is provided a data synthesis apparatus based on index redirection, the apparatus comprising:
an initial full backup data management module, configured to perform a full backup of the currently stored data to obtain initial full backup data;
an incremental backup data acquisition module, configured to perform an incremental backup of the updated stored data to obtain incremental backup data;
a new full backup data management module, configured to acquire bitmap information of the changed data, identify unchanged data in the initial full backup data according to the bitmap information, and integrate the unchanged data with the acquired incremental backup data to form new full backup data;
and an iterative data integration module, configured to, when the stored data is updated again, acquire the corresponding incremental backup data and bitmap information of the changed data, identify unchanged data in the most recently generated full backup data according to the bitmap information, and integrate that unchanged data with the acquired incremental backup data to form new full backup data again.
In yet another aspect, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of:
performing a full backup of the currently stored data to obtain initial full backup data;
performing an incremental backup of the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data in the initial full backup data according to the bitmap information, and integrating the unchanged data with the acquired incremental backup data to form new full backup data;
when the stored data is updated again, acquiring the corresponding incremental backup data and bitmap information of the changed data, identifying unchanged data in the most recently generated full backup data according to the bitmap information, and integrating that unchanged data with the acquired incremental backup data to form new full backup data again.
In yet another aspect, a computer readable storage medium is provided, having stored thereon a computer program which when executed by a processor performs the steps of:
performing a full backup of the currently stored data to obtain initial full backup data;
performing an incremental backup of the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data in the initial full backup data according to the bitmap information, and integrating the unchanged data with the acquired incremental backup data to form new full backup data;
when the stored data is updated again, acquiring the corresponding incremental backup data and bitmap information of the changed data, identifying unchanged data in the most recently generated full backup data according to the bitmap information, and integrating that unchanged data with the acquired incremental backup data to form new full backup data again.
According to the data synthesis method, device, computer device and storage medium based on index redirection described above, after each data update the unchanged data and the acquired incremental backup data are integrated to generate new full backup data. The previously unchanged data is thus fully reused, repeated backup of that data is reduced, data errors caused by the copying process are avoided, data security is ensured, and storage space is saved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a conventional data backup synthesis method;
FIG. 2 is a schematic diagram of a data synthesis method architecture based on index redirection in one embodiment of the present application;
FIG. 3 is a logic diagram of a data synthesis method based on index redirection in one embodiment of the present application;
FIG. 4 is a flow chart of a data synthesis method based on index redirection in one embodiment of the present application;
FIG. 5 is a diagram illustrating retrieval of data blocks using a tree-shaped directory index in one embodiment of the present application;
FIG. 6 is a diagram of forming a full back-up data directory index after a new synthesis in one embodiment of the present application;
FIG. 7 is an application environment diagram of a data synthesis method based on index redirection in one embodiment of the present application;
FIG. 8 is a flow chart of a data synthesis method based on index redirection according to another embodiment of the present application;
FIG. 9 is a flowchart of the steps of identifying unchanged data in the initial full backup data according to bitmap information of the changed data and integrating the unchanged data with the acquired incremental backup data to form new full backup data, in one embodiment of the present application;
FIG. 10 is a flowchart of the steps of forming an initial or new directory index in one embodiment of the present application;
FIG. 11 is a flowchart of the step of updating the initial directory index according to the new full backup data to form a new directory index, in one embodiment of the present application;
FIG. 12 is a block diagram of a data synthesis device based on index redirection in one embodiment of the present application;
FIG. 13 is an internal structural diagram of a computer device in one embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
As described in the background art, during data backup a full backup is generally performed first, and an incremental backup is then performed each day on the basis of the previous day's backup. An incremental backup backs up only the data that has changed between the previous backup time and the current backup time. When a full backup is performed again according to the backup policy, a synthetic backup is generally used to assemble the new full backup data; the conventional method is shown in fig. 1.
In the traditional data synthesis method, synthesizing the second full backup requires copying both the unchanged part of the first full backup data and the incremental backup data. The two full backups and the one incremental backup therefore occupy a large amount of storage space, and moving and copying the related data blocks also consumes load on a temporary server and temporary storage space.
Example 1
To solve the above problems, Example 1 of the present invention proposes a data synthesis method based on index redirection that effectively avoids the problems of the conventional method. The architecture of the method is shown in fig. 2.
The synthetic backup method based on index redirection does not actually copy or move data when producing the second full backup; instead, it modifies the index pointers of the file so that they point to the data making up a virtual second full backup. Each increment can therefore be synthesized into the most recently generated full backup data to become new full backup data. As the backup policy advances, the new full backup data replaces the original full backup data, so that only one up-to-date full backup needs to be kept in the backup space, which greatly reduces storage usage.
As shown in fig. 3 and fig. 4, the data synthesis method based on index redirection provided in the present application includes the following steps:
S1. Perform the first full backup: a full initial backup of the data forms the base full backup data.
S2. Perform the second, incremental backup: acquire the incremental backup data and at the same time read the bitmap information of the changed data. Read the unchanged data from the initial full backup data according to the change bitmap, and integrate it with the acquired incremental backup data to form new full backup data.
S3. After each subsequent incremental backup, acquire the incremental backup data and the bitmap information of the changed data; the background process automatically integrates them with the most recently generated full backup data into new full backup data.
S4. After the synthesis of the new full backup data is complete, delete the acquired incremental data to reduce storage usage.
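The synthesis in steps S1 to S3 can be illustrated with a minimal sketch. It is not the patented implementation: the `BlockRef` type, the labels such as "full-0" and "incr-1", and the assumption that an incremental backup stores its changed blocks in logical-block order are all hypothetical. The point it shows is that the new full backup is built purely by redirecting index entries, with no block data copied; step S4 is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRef:
    """Pointer to one stored block (hypothetical layout)."""
    source: str  # label of the backup file holding the block, e.g. "full-0"
    slot: int    # position of the block inside that file


def synthesize_index(prev_index, change_bitmap, incr_label):
    """Build the index of a new full backup without copying block data.

    prev_index    -- list of BlockRef, one per logical block of the last full backup
    change_bitmap -- list of 0/1 flags; 1 means the block changed since that backup
    incr_label    -- label of the incremental backup that holds the changed blocks
    """
    if len(prev_index) != len(change_bitmap):
        raise ValueError("bitmap must cover every logical block")
    new_index = []
    incr_slot = 0
    for old_ref, changed in zip(prev_index, change_bitmap):
        if changed:
            # Redirect the entry to the block stored by the incremental backup.
            new_index.append(BlockRef(incr_label, incr_slot))
            incr_slot += 1
        else:
            # Unchanged: keep pointing at the block that is already stored.
            new_index.append(old_ref)
    return new_index


full0 = [BlockRef("full-0", i) for i in range(6)]                      # S1
full1 = synthesize_index(full0, [0, 1, 0, 0, 1, 0], "incr-1")          # S2
full2 = synthesize_index(full1, [1, 0, 0, 0, 1, 0], "incr-2")          # S3
```

In this sketch `full2` still references block 1 through "incr-1" and blocks 2, 3 and 5 through "full-0", which is the sense in which each synthesis reuses the previously generated full backup.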
To better restore the backup data to any previous point in time, a file-level snapshot is taken of the synthesized full backup data after each index redirection. When data loss occurs, the backup data can be used for recovery. Because recovery mainly consists of reading data, an index is used for fast lookup to speed up the reads.
First, as shown in fig. 5, in the management of backup set data blocks, interval (Extent) block devices are used instead of single data blocks (Block) to manage the backup data. Each interval (Extent) block device is a run of contiguous data blocks of a certain length. An interval tree (Extent Tree) provides disk space management, and data blocks are retrieved through a tree-shaped directory index (BTree) stored in the super block (Super Block), which improves the retrieval speed of data blocks and reduces metadata overhead.
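The benefit of managing runs of contiguous blocks rather than single blocks can be sketched as follows. This is an illustration only: the function names are hypothetical, and a sorted list searched with `bisect` stands in for the tree-shaped index described in the text.

```python
from bisect import bisect_right


def blocks_to_extents(block_numbers):
    """Coalesce block numbers into sorted (start, length) extents."""
    extents = []
    for b in sorted(set(block_numbers)):
        if extents and extents[-1][0] + extents[-1][1] == b:
            # Block b directly follows the last extent, so extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((b, 1))
    return extents


def find_extent(extents, block):
    """Return the extent containing `block`, or None if no extent covers it."""
    starts = [start for start, _ in extents]
    i = bisect_right(starts, block) - 1
    if i >= 0:
        start, length = extents[i]
        if block < start + length:
            return extents[i]
    return None


extents = blocks_to_extents([0, 1, 2, 3, 10, 11, 20])
```

Seven blocks are described here by three metadata entries, which is the kind of metadata reduction the paragraph above refers to.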
Next, as shown in fig. 6, snapshots are generated using a copy-on-write (COW) transaction technique when incremental data is synthesized. The incremental data is divided according to interval block devices and inserted into the bottom-layer block device data nodes of the tree-shaped directory index (BTree), and the associated upper-layer nodes are added or modified. This triggers a chain reaction in which each layer adds or modifies its associated nodes until the topmost root (Root) is reached and a new root node is generated. When the whole transaction of merging the incremental data is complete, the super block (Super Block) is pointed to the newly added root node to form the directory index of the latest synthesized full backup data, and a snapshot is taken of the newly created node information.
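The copy-on-write chain reaction described above can be sketched with path copying. As a simplifying assumption, the sketch uses an immutable binary search tree rather than a B-tree, and the `Node`, `cow_insert` and `SuperBlock` names are hypothetical. Only the nodes on the path from the modified leaf to the root are re-created; untouched subtrees are shared with the previous root, which remains usable as a snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    key: int                        # e.g. start block of an extent
    value: str                      # hypothetical location of the extent's data
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(node, key, value):
    """Return the root of a new tree; only nodes on the search path are re-created."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, cow_insert(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, cow_insert(node.right, key, value))
    return Node(key, value, node.left, node.right)  # same key: replace the value


def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


class SuperBlock:
    """Holds the current root pointer and the roots of earlier snapshots."""

    def __init__(self):
        self.root = None
        self.snapshots = []

    def commit(self, changes):
        new_root = self.root
        for key, value in changes:
            new_root = cow_insert(new_root, key, value)
        self.root = new_root            # one pointer switch at the end of the transaction
        self.snapshots.append(new_root)
        return new_root


sb = SuperBlock()
root0 = sb.commit([(8, "full-0"), (0, "full-0"), (16, "full-0")])
root1 = sb.commit([(16, "incr-1")])
```

After the second commit, `root0` still resolves key 16 to "full-0" while `root1` resolves it to "incr-1", and both roots share the unchanged left subtree.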
After the incremental data has been synthesized, the node information that existed before the synthesis, including the block device information, is not deleted, so the backup data files from before the synthesis are fully retained. A snapshot is taken of the directory index after each synthesis, so the directory indexes of the full backup data at multiple points in time are preserved, and during recovery the user can choose which full backup data to restore according to the backup time point.
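Selecting a restore point from the saved directory indexes might look like the sketch below. The selection policy (the latest snapshot at or before the requested time), the integer timestamps and the root identifiers are assumptions made for illustration; the text only states that the user chooses by backup time point.

```python
from bisect import bisect_right


def pick_snapshot(snapshots, restore_time):
    """Return the latest (timestamp, root_id) pair at or before restore_time.

    snapshots -- list of (timestamp, root_id) pairs sorted by timestamp
    """
    times = [t for t, _ in snapshots]
    i = bisect_right(times, restore_time) - 1
    if i < 0:
        raise LookupError("no snapshot at or before the requested time")
    return snapshots[i]


# Hypothetical snapshot catalogue: one entry per completed synthesis.
catalogue = [(1, "root-a"), (2, "root-b"), (5, "root-c")]
chosen = pick_snapshot(catalogue, 3)
```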
Meanwhile, for storage of the backup data, the storage space retains only one initial full backup, the incremental data, and the snapshot information generated after each synthesis. Compared with traditional backup, which must store multiple full copies of the data, this greatly reduces storage usage.
The data synthesis backup method based on the index redirection technology provided by the invention can bring practical benefits in the following aspects:
1. By modifying the index pointers of the file, disk movement of a large number of data blocks is avoided during synthesis, which reduces the load on the backup server.
2. Virtual full backup data is used, which shortens the backup time and improves backup efficiency.
3. A large amount of temporary storage space does not need to be reserved in advance for file data synthesis, which improves the utilization of the backup space.
Example 2
The data synthesis method based on index redirection provided in embodiment 2 of the present application may be applied to an application environment as shown in fig. 7. Wherein the terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers. The terminal 102 stores data in the server 104.
The principle of data merging when storing data is as follows. A first full backup of the production data of the first day is performed to form the first full backup data. The production data of the second day has changed relative to that of the first day, so an incremental backup is performed. According to the change bitmap information, the data in the first full backup that is unchanged from the first day's production data is identified, and the data deleted from the first day's production data is excluded. The data that is new in the second day's production data relative to the first day's is identified in the incremental backup. The unchanged data in the first day's full backup and the new data in the second day's incremental backup are then integrated to form new full backup data. Thereafter, each time an incremental backup is performed, the incremental backup data and the bitmap information of the changed data are acquired, and the background process automatically integrates them with the most recently generated full backup data into new full backup data. The incremental data acquired each time is deleted once the synthesis of the new full backup data is complete, which reduces storage usage.
In the management of backup set data blocks, interval (Extent) block devices are used instead of single data blocks (Block) to manage the backup data. Each interval (Extent) block device is a run of contiguous data blocks of a certain length. An interval tree (Extent Tree) provides disk space management, and data blocks are retrieved through a tree-shaped directory index (BTree) stored in the super block (Super Block), which improves the retrieval speed of data blocks and reduces metadata overhead.
Snapshots are generated using a copy-on-write (COW) transaction technique when incremental data is synthesized. The incremental data is divided according to interval block devices and inserted into the bottom-layer block device data nodes of the tree-shaped directory index (BTree), and the associated upper-layer nodes are added or modified. This triggers a chain reaction in which each layer adds or modifies its associated nodes until the topmost root (Root) is reached and a new root node is generated. When the whole transaction of merging the incremental data is complete, the super block (Super Block) is pointed to the newly added root node to form the directory index of the latest synthesized full backup data, and a snapshot is taken of the newly created node information.
To better restore the backup data to any previous point in time, a file-level snapshot is taken of the synthesized full backup data after each index redirection. When data loss occurs, the backup data can be used for recovery. Because recovery mainly consists of reading data, an index is used for fast lookup to speed up the reads.
In one embodiment, as shown in fig. 8, a data synthesis method based on index redirection is provided, and the method is applied to the server 104 in fig. 7 for illustration, and includes the following steps:
Step S10: performing a full backup of the currently stored data to obtain initial full backup data;
Step S20: performing an incremental backup of the updated stored data to obtain incremental backup data;
Step S30: acquiring bitmap information of the changed data, identifying unchanged data in the initial full backup data according to the bitmap information, and integrating the unchanged data with the acquired incremental backup data to form new full backup data;
Step S40: when the stored data is updated again, acquiring the corresponding incremental backup data and bitmap information of the changed data, identifying unchanged data in the most recently generated full backup data according to the bitmap information, and integrating that unchanged data with the acquired incremental backup data to form new full backup data again.
As shown in fig. 9, in this embodiment, the step of identifying unchanged data in the initial full backup data according to bitmap information of the changed data includes:
Step S31: identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
Step S32: removing the deleted data from the initial full backup data to form the unchanged data.
As shown in fig. 9, in this embodiment, the step of integrating the unchanged data with the acquired incremental backup data to form new full backup data includes:
Step S33: identifying, in the initial full backup data, the data that is unchanged relative to the incremental backup data according to the bitmap information of the changed data;
Step S34: integrating the data in the initial full backup data that is unchanged relative to the incremental backup data with the incremental backup data to form changed data;
Step S35: performing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
Therefore, as shown in fig. 9, step S30 of the present application includes the above-described steps S31 to S35.
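One possible reading of steps S31 to S35 is sketched below. The representation is assumed for illustration: the previous full backup is a dictionary from block id to a data reference, the bitmap information is reduced to a list of deletion flags indexed by block id, and the incremental backup is a dictionary holding modified and newly added blocks. None of these names or structures comes from the source.

```python
def synthesize_full(prev_full, deleted_bitmap, incremental):
    """Combine the previous full backup with an incremental backup.

    prev_full      -- dict: block id -> data reference in the previous full backup
    deleted_bitmap -- list of 0/1 flags indexed by block id; 1 marks a deleted block
    incremental    -- dict: block id -> data reference for modified or new blocks
    """
    # S31: identify the deleted blocks from the bitmap information.
    deleted = {block_id for block_id, flag in enumerate(deleted_bitmap) if flag}
    # S32: drop them from the previous full backup.
    retained = {b: ref for b, ref in prev_full.items() if b not in deleted}
    # S33: keep only the retained blocks that the incremental backup does not supersede.
    untouched = {b: ref for b, ref in retained.items() if b not in incremental}
    # S34 and S35: merge the untouched references with the incremental references.
    return {**untouched, **incremental}


prev_full = {0: "f0", 1: "f1", 2: "f2", 3: "f3"}
new_full = synthesize_full(prev_full, [0, 0, 1, 0], {1: "i1", 4: "i4"})
```

In this example block 2 is deleted, block 1 is superseded by the incremental copy, block 4 is newly added, and blocks 0 and 3 are carried over by reference.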
As shown in fig. 8, in this embodiment, each time new full backup data is formed, the method further includes:
Step S50: deleting the incremental backup data.
The incremental backup data acquired each time is deleted once the synthesis of the new full backup data is complete, which reduces storage usage.
In this embodiment, the method further includes:
acquiring an initial directory index at the same time as the initial full backup data is acquired;
after the new full backup data is formed, updating the initial directory index according to the new full backup data to form a new directory index;
after new full backup data is formed again, updating the most recently generated directory index to form a new directory index.
In other words, as shown in fig. 8, there is provided a data synthesis method based on index redirection, including the steps of:
step S10, executing a full backup according to the currently stored data, acquiring initial full backup data, and synchronously acquiring an initial directory index;
step S20, executing an incremental backup according to the updated stored data to acquire incremental backup data;
step S30, acquiring bitmap information of the changed data, identifying unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data; updating the initial directory index according to the new full backup data to form a new directory index;
step S40, when the stored data is updated again, acquiring the corresponding incremental backup data and the bitmap information of the changed data, identifying unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, integrating the unchanged data and the acquired incremental backup data to form new full backup data again, and updating the most recently generated directory index to form a new directory index;
step S50, deleting the incremental backup data.
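The cycle of steps S10 to S50 can be sketched as a short Python loop. This is only an illustration under stated assumptions: each version of the stored data is modeled as a mapping from block number to contents, and the changed and deleted blocks are found by comparing versions, whereas a real system would more likely obtain them from change tracking on the source. The names `take_incremental` and `run_cycles` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Blocks = Dict[int, bytes]


def take_incremental(previous: Blocks, current: Blocks) -> Tuple[Blocks, Set[int]]:
    """S20: collect new or modified blocks and the numbers of deleted blocks."""
    incremental = {n: d for n, d in current.items() if previous.get(n) != d}
    deleted = set(previous) - set(current)
    return incremental, deleted


def run_cycles(versions: List[Blocks]) -> List[Blocks]:
    """Run S10 once, then S20 to S50 for every later version of the data."""
    full = dict(versions[0])  # S10: initial full backup
    fulls = [full]
    for current in versions[1:]:
        incremental, deleted = take_incremental(full, current)  # S20
        # S30/S40: keep blocks that were neither deleted nor replaced
        unchanged = {
            n: d for n, d in full.items()
            if n not in deleted and n not in incremental
        }
        full = {**unchanged, **incremental}  # new full backup data
        fulls.append(full)
        incremental.clear()  # S50: the incremental backup is discarded
    return fulls
```

Each full backup formed this way matches the corresponding version of the stored data, while only the changed blocks are transferred in each cycle.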
According to the method, the index pointers are redirected so that they are associated with the original data files, a new virtual full backup is constructed, and a snapshot of the full backup data is used to generate a data recovery index. After each data update, the method integrates the unchanged data with the acquired incremental backup data to generate new full backup data. This makes full use of the previously unchanged data, reduces repeated backup of that data, avoids data errors introduced by copying, safeguards the data, and saves storage space. The method generates the recovery index (namely the directory index) for subsequent data from the snapshot of the previous full backup data and takes a snapshot of the directory index after each synthesis, so the directory indexes of the full backup data at multiple points in time can be kept. When recovering data, a user can select which full backup data to restore according to the backup time point.
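The idea of redirecting index pointers and keeping one directory index snapshot per time point can be sketched as follows. The sketch assumes a flat dictionary as the index and a write-once block store keyed by a made-up `"<time point>/<block number>"` string; the class `BackupCatalog` and its method names are hypothetical and are not taken from the application.

```python
from typing import Dict, Iterable


class BackupCatalog:
    """Keeps one index snapshot per backup time point over a shared block store."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}               # storage key -> block contents
        self.snapshots: Dict[str, Dict[int, str]] = {}  # time point -> index

    def full_backup(self, time_point: str, blocks: Dict[int, bytes]) -> None:
        index: Dict[int, str] = {}
        for block_no, data in blocks.items():
            key = f"{time_point}/{block_no}"
            self.store[key] = data
            index[block_no] = key
        self.snapshots[time_point] = index

    def synthesize(
        self,
        time_point: str,
        base_time: str,
        incremental: Dict[int, bytes],
        deleted: Iterable[int],
    ) -> None:
        # Copy the pointers of the base index, not the data they point to.
        index = dict(self.snapshots[base_time])
        for block_no in deleted:
            index.pop(block_no, None)
        for block_no, data in incremental.items():
            key = f"{time_point}/{block_no}"
            self.store[key] = data
            index[block_no] = key  # redirect the pointer to the new block
        self.snapshots[time_point] = index

    def restore(self, time_point: str) -> Dict[int, bytes]:
        return {n: self.store[k] for n, k in self.snapshots[time_point].items()}
```

Because every snapshot is only a set of pointers, restoring an earlier time point remains possible after later syntheses, and unchanged blocks are stored once.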
As shown in fig. 10, in the present embodiment, when forming an initial directory index or a new directory index, it includes:
step S11, setting a plurality of consecutive data blocks as one interval block device;
step S12, performing disk space management on the interval block device through an interval tree;
and step S13, setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
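Steps S11 to S13 can be illustrated with a small Python sketch that groups consecutive block numbers into extents and looks a block up through a super block. A sorted list searched with `bisect` stands in for the interval tree here, which is a simplification; the `Extent` fields, the `location` label, and the class and function names are assumptions made for the example.

```python
import bisect
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Extent:
    start: int     # first logical block number of the run
    length: int    # number of consecutive blocks
    location: str  # hypothetical label for where the run is stored


def to_extents(block_numbers: Iterable[int], location: str) -> List[Extent]:
    """S11: merge runs of consecutive block numbers into extents."""
    extents: List[Extent] = []
    for n in sorted(set(block_numbers)):
        if extents and extents[-1].start + extents[-1].length == n:
            last = extents[-1]
            extents[-1] = Extent(last.start, last.length + 1, last.location)
        else:
            extents.append(Extent(n, 1, location))
    return extents


class Superblock:
    """S13: entry point that holds the index used for data retrieval."""

    def __init__(self, extents: Iterable[Extent]) -> None:
        # S12 stand-in: a sorted list instead of a real interval tree.
        self.extents = sorted(extents, key=lambda e: e.start)
        self._starts = [e.start for e in self.extents]

    def lookup(self, block_no: int) -> Optional[Extent]:
        i = bisect.bisect_right(self._starts, block_no) - 1
        if i >= 0:
            extent = self.extents[i]
            if block_no < extent.start + extent.length:
                return extent
        return None
```

Managing runs of blocks rather than single blocks keeps the index small when the data is mostly contiguous.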
As shown in fig. 11, in the present embodiment, the step of updating the initial directory index according to the new full backup data to form a new directory index includes:
step S21, dividing the incremental backup data in the new full backup data according to the interval block devices;
step S22, inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes in the directory index, and adding the associated upper-layer nodes;
step S23, retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the remaining data in the new full backup data other than the incremental backup data, and deleting from the directory index the upper-layer nodes unrelated to the new full backup data;
step S24, adding or modifying the associated nodes layer by layer up to the topmost layer, thereby generating a new root node;
step S25, pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and taking a snapshot of the newly added node information.
It is understood that, in step S30 and step S40, whenever a directory index is updated to form a new directory index, the above steps S21 to S25 are included.
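Steps S21 to S25 resemble a copy-on-write (path-copying) tree update, and the following Python sketch shows that technique under stated assumptions. It uses a fixed-depth tree with a fanout of 4 keyed by an extent number, which is chosen only to keep the example short; the application does not prescribe this layout, and `Node`, `insert`, `lookup`, and `Superblock.commit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

FANOUT = 4
DEPTH = 3  # keys must be in range(FANOUT ** DEPTH), i.e. 0..63


@dataclass(frozen=True)
class Node:
    children: Tuple[Optional[object], ...] = (None,) * FANOUT


def insert(node: Optional[Node], key: int, value: str, level: int = DEPTH - 1) -> Node:
    """Return a new node on the path to key; all other subtrees are shared."""
    if node is None:
        node = Node()
    slot = (key // FANOUT ** level) % FANOUT
    if level == 0:
        child: object = value  # bottom layer: data node
    else:
        child = insert(node.children[slot], key, value, level - 1)
    return Node(node.children[:slot] + (child,) + node.children[slot + 1:])


def lookup(node: Optional[object], key: int) -> Optional[object]:
    level = DEPTH - 1
    while node is not None and level >= 0:
        node = node.children[(key // FANOUT ** level) % FANOUT]
        level -= 1
    return node


class Superblock:
    def __init__(self) -> None:
        self.root: Optional[Node] = None
        self.snapshots: Dict[str, Optional[Node]] = {}

    def commit(self, label: str, new_root: Node) -> None:
        self.root = new_root              # S25: point at the new root
        self.snapshots[label] = new_root  # keep the root as a snapshot
```

In this sketch only the nodes on the path from the changed data node to the root are newly created, and nodes that the new root no longer references remain reachable only through older snapshots.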
In the data synthesis method based on index redirection, after each data update, the unchanged data and the acquired incremental backup data are integrated to generate new full backup data. This makes full use of the previously unchanged data, reduces repeated backup of that data, avoids data errors introduced by copying, safeguards the data, and saves storage space.
It should be understood that, although the steps in the flowcharts of fig. 8 to 11 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 8 to 11 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times. They are also not necessarily performed in sequence and may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Example 3
In embodiment 3, as shown in fig. 12, there is provided a data synthesis apparatus 10 based on index redirection, comprising: an initial full backup data management module 1, an incremental backup data acquisition module 2, a new full backup data management module 3, an iterative integration data module 4, and an incremental backup data deletion module 5.
The initial full backup data management module 1 is configured to execute a full backup according to the currently stored data and obtain initial full backup data.
The incremental backup data acquisition module 2 is configured to execute an incremental backup according to the updated stored data and acquire incremental backup data.
The new full backup data management module 3 is configured to acquire bitmap information of the changed data, identify unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrate the unchanged data and the acquired incremental backup data to form new full backup data.
The iterative integration data module 4 is configured to, when the stored data is updated again, acquire the corresponding incremental backup data and bitmap information of the changed data, identify unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, and integrate the unchanged data with the acquired incremental backup data to form new full backup data again.
The incremental backup data deleting module 5 is configured to delete the incremental backup data after each new full backup data is formed.
In this embodiment, the new full backup data management module 3 is specifically configured to perform the following:
identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
removing the deleted data from the initial full backup data to form unchanged data;
identifying, according to the bitmap information of the changed data, the data in the initial full backup data that is unchanged relative to the incremental backup data;
integrating that data with the incremental backup data to form changed data;
and executing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
As shown in fig. 12, in the present embodiment, the data synthesis apparatus 10 based on index redirection further includes a directory index management module 6. The directory index management module 6 is configured to synchronously acquire an initial directory index when the initial full backup data is acquired. After new full backup data is formed, the module is further configured to update the initial directory index according to the new full backup data to form a new directory index. After new full backup data is formed again, the module is further configured to update the most recently generated directory index to form a new directory index.
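How modules 1 to 6 might cooperate on each update can be sketched as a small Python class whose collaborators are passed in as callables. This is one possible decomposition offered as an assumption, not the structure disclosed in fig. 12; the class name `SynthesisApparatus`, its field names, and the callable signatures are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

Blocks = Dict[int, bytes]


@dataclass
class SynthesisApparatus:
    # Module 2: returns (incremental backup data, deleted block numbers).
    acquire_incremental: Callable[[Blocks], Tuple[Blocks, Set[int]]]
    # Modules 3 and 4: merge the previous full backup with the incremental one.
    synthesize: Callable[[Blocks, Blocks, Set[int]], Blocks]
    # Module 6: update the directory index for the new full backup.
    update_index: Callable[[Blocks], None]
    # Module 1 would populate this with the initial full backup.
    full: Blocks = field(default_factory=dict)

    def on_update(self, current: Blocks) -> Blocks:
        incremental, deleted = self.acquire_incremental(current)
        self.full = self.synthesize(self.full, incremental, deleted)
        self.update_index(self.full)
        incremental.clear()  # Module 5: delete the incremental backup data
        return self.full
```

Passing the modules in as callables keeps the orchestration separate from how each module is implemented, whether in software, hardware, or a combination.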
In this embodiment, forming an initial directory index or a new directory index includes:
setting a plurality of consecutive data blocks as one interval block device;
performing disk space management on the interval block device through an interval tree;
and setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
In this embodiment, the step of updating the initial directory index according to the new full backup data to form a new directory index includes:
dividing the incremental backup data in the new full backup data according to the interval block devices;
inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes in the directory index, and adding the associated upper-layer nodes;
retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the remaining data in the new full backup data other than the incremental backup data, and deleting from the directory index the upper-layer nodes unrelated to the new full backup data;
adding or modifying the associated nodes layer by layer up to the topmost layer, thereby generating a new root node;
and pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and taking a snapshot of the newly added node information.
In the data synthesis apparatus based on index redirection, after each data update, the unchanged data and the acquired incremental backup data are integrated to generate new full backup data. This makes full use of the previously unchanged data, reduces repeated backup of that data, avoids data errors introduced by copying, safeguards the data, and saves storage space.
For specific limitations on the data synthesis apparatus based on index redirection, reference may be made to the above limitations on the data synthesis method based on index redirection, which are not repeated here. The modules in the above data synthesis apparatus based on index redirection may be implemented in whole or in part in software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
Example 4
In embodiment 4, there is provided a computer device, which may be a server, and whose internal structure may be as shown in fig. 13. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store the data of the index-redirection-based data synthesis. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a data synthesis method based on index redirection.
It will be appreciated by those skilled in the art that the structure shown in fig. 13 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
executing a full backup according to the currently stored data to obtain initial full backup data;
executing an incremental backup according to the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data;
and when the stored data is updated again, acquiring the corresponding incremental backup data and the bitmap information of the changed data, identifying unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data again.
In one embodiment, the processor when executing the computer program further performs the steps of:
the step of identifying unchanged data from the initial full backup data according to the bitmap information of the changed data comprises:
identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
and removing the deleted data from the initial full backup data to form unchanged data.
In one embodiment, the processor when executing the computer program further performs the steps of:
the step of integrating the unchanged data and the acquired incremental backup data to form new full backup data comprises:
identifying, according to the bitmap information of the changed data, the data in the initial full backup data that is unchanged relative to the incremental backup data;
integrating that data with the incremental backup data to form changed data;
and executing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
In one embodiment, the processor when executing the computer program further performs the steps of:
after each formation of new full backup data:
deleting the incremental backup data.
In one embodiment, the processor when executing the computer program further performs the steps of:
synchronously acquiring an initial directory index when the initial full backup data is acquired;
after new full backup data is formed, updating the initial directory index according to the new full backup data to form a new directory index;
after new full backup data is formed again, updating the most recently generated directory index to form a new directory index.
In one embodiment, the processor when executing the computer program further performs the steps of:
forming an initial or new directory index comprises:
setting a plurality of consecutive data blocks as one interval block device;
performing disk space management on the interval block device through an interval tree;
and setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
In one embodiment, the processor when executing the computer program further performs the steps of:
the step of updating the initial directory index according to the new full backup data to form a new directory index comprises:
dividing the incremental backup data in the new full backup data according to the interval block devices;
inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes in the directory index, and adding the associated upper-layer nodes;
retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the remaining data in the new full backup data other than the incremental backup data, and deleting from the directory index the upper-layer nodes unrelated to the new full backup data;
adding or modifying the associated nodes layer by layer up to the topmost layer, thereby generating a new root node;
and pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and taking a snapshot of the newly added node information.
Specific limitations regarding implementation steps of the processor when executing the computer program may be found in the above limitations of the method of data synthesis based on index redirection, which are not described in detail herein.
Example 6
In embodiment 6, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
executing a full backup according to the currently stored data to obtain initial full backup data;
executing an incremental backup according to the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data;
and when the stored data is updated again, acquiring the corresponding incremental backup data and the bitmap information of the changed data, identifying unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data again.
In one embodiment, the computer program when executed by the processor further performs the steps of:
the step of identifying unchanged data from the initial full backup data according to the bitmap information of the changed data comprises:
identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
and removing the deleted data from the initial full backup data to form unchanged data.
In one embodiment, the computer program when executed by the processor further performs the steps of:
the step of integrating the unchanged data and the acquired incremental backup data to form new full backup data comprises:
identifying, according to the bitmap information of the changed data, the data in the initial full backup data that is unchanged relative to the incremental backup data;
integrating that data with the incremental backup data to form changed data;
and executing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
In one embodiment, the computer program when executed by the processor further performs the steps of:
after each formation of new full backup data:
deleting the incremental backup data.
In one embodiment, the computer program when executed by the processor further performs the steps of:
synchronously acquiring an initial directory index when the initial full backup data is acquired;
after new full backup data is formed, updating the initial directory index according to the new full backup data to form a new directory index;
after new full backup data is formed again, updating the most recently generated directory index to form a new directory index.
In one embodiment, the computer program when executed by the processor further performs the steps of:
forming an initial or new directory index comprises:
setting a plurality of consecutive data blocks as one interval block device;
performing disk space management on the interval block device through an interval tree;
and setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
In one embodiment, the computer program when executed by the processor further performs the steps of:
the step of updating the initial directory index according to the new full backup data to form a new directory index comprises:
dividing the incremental backup data in the new full backup data according to the interval block devices;
inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes in the directory index, and adding the associated upper-layer nodes;
retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the remaining data in the new full backup data other than the incremental backup data, and deleting from the directory index the upper-layer nodes unrelated to the new full backup data;
adding or modifying the associated nodes layer by layer up to the topmost layer, thereby generating a new root node;
and pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and taking a snapshot of the newly added node information.
Specific limitations regarding implementation steps of the computer program when executed by the processor may be found in the above limitations of the method of data synthesis based on index redirection, which are not described in detail herein.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to be within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in some detail, but this is not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (10)

1. A data synthesis method based on index redirection, comprising:
executing a full backup according to the currently stored data to obtain initial full backup data;
executing an incremental backup according to the updated stored data to obtain incremental backup data;
acquiring bitmap information of the changed data, identifying unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data;
and when the stored data is updated again, acquiring the corresponding incremental backup data and the bitmap information of the changed data, identifying unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, and integrating the unchanged data and the acquired incremental backup data to form new full backup data again.
2. The index-redirection-based data synthesis method of claim 1, further comprising:
synchronously acquiring an initial directory index when the initial full backup data is acquired;
after new full backup data is formed, updating the initial directory index according to the new full backup data to form a new directory index;
after new full backup data is formed again, updating the most recently generated directory index to form a new directory index.
3. The index-redirection-based data synthesis method of claim 2, wherein forming an initial or new directory index comprises:
setting a plurality of consecutive data blocks as one interval block device;
performing disk space management on the interval block device through an interval tree;
and setting a super block corresponding to the interval tree, and performing data retrieval according to the directory index stored in the super block.
4. The index-redirection-based data synthesis method of claim 3, wherein the step of updating the initial directory index according to the new full backup data to form a new directory index comprises:
dividing the incremental backup data in the new full backup data according to the interval block devices;
inserting the divided incremental backup data into the corresponding bottom-layer block device data nodes in the directory index, and adding the associated upper-layer nodes;
retaining, in the directory index, the bottom-layer block device data nodes and associated upper-layer nodes of the remaining data in the new full backup data other than the incremental backup data, and deleting from the directory index the upper-layer nodes unrelated to the new full backup data;
adding or modifying the associated nodes layer by layer up to the topmost layer, thereby generating a new root node;
and pointing the super block to the new root node to form a new directory index corresponding to the new full backup data, and taking a snapshot of the newly added node information.
5. The index-redirection-based data synthesis method of claim 1, wherein the step of identifying unchanged data from the initial full backup data according to the bitmap information of the changed data comprises:
identifying deleted data in the initial full backup data according to the bitmap information of the changed data;
and removing the deleted data from the initial full backup data to form unchanged data.
6. The index-redirection-based data synthesis method of claim 5, wherein the step of integrating the unchanged data and the acquired incremental backup data to form new full backup data comprises:
identifying, according to the bitmap information of the changed data, the data in the initial full backup data that is unchanged relative to the incremental backup data;
integrating that data with the incremental backup data to form changed data;
and executing a full backup with the remaining unchanged data in the initial full backup data and the changed data to form new full backup data.
7. The index-redirection-based data synthesis method of claim 1, further comprising, after each formation of new full backup data:
deleting the incremental backup data.
8. An index-redirection-based data synthesis apparatus, the apparatus comprising:
an initial full backup data management module, configured to execute a full backup according to the currently stored data to obtain initial full backup data;
an incremental backup data acquisition module, configured to execute an incremental backup according to the updated stored data and acquire incremental backup data;
a new full backup data management module, configured to acquire bitmap information of the changed data, identify unchanged data from the initial full backup data according to the bitmap information of the changed data, and integrate the unchanged data and the acquired incremental backup data to form new full backup data;
and an iterative integration data module, configured to, when the stored data is updated again, acquire the corresponding incremental backup data and bitmap information of the changed data, identify unchanged data from the most recently generated full backup data according to the bitmap information of the changed data, and integrate the unchanged data and the acquired incremental backup data to form new full backup data again.
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202310078328.4A 2023-01-31 2023-01-31 Data synthesis method, device, equipment and storage medium based on index redirection Pending CN116204356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310078328.4A CN116204356A (en) 2023-01-31 2023-01-31 Data synthesis method, device, equipment and storage medium based on index redirection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310078328.4A CN116204356A (en) 2023-01-31 2023-01-31 Data synthesis method, device, equipment and storage medium based on index redirection

Publications (1)

Publication Number Publication Date
CN116204356A true CN116204356A (en) 2023-06-02

Family

ID=86516654

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310078328.4A Pending CN116204356A (en) 2023-01-31 2023-01-31 Data synthesis method, device, equipment and storage medium based on index redirection

Country Status (1)

Country Link
CN (1) CN116204356A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117435403A (en) * 2023-12-21 2024-01-23 成都云祺科技有限公司 Processing index merging method, system and invalid data processing method in persistent backup
CN117435403B (en) * 2023-12-21 2024-03-12 成都云祺科技有限公司 Processing index merging method, system and invalid data processing method in persistent backup

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination