CN111338845B - Fine-grained local data protection method - Google Patents

Fine-grained local data protection method

Info

Publication number
CN111338845B
Authority
CN
China
Prior art keywords
data
storage system
tree
lun
data storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010094584.9A
Other languages
Chinese (zh)
Other versions
CN111338845A (en)
Inventor
郭景锐
李安亚
姚娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orca Data Technology Xian Co Ltd
Original Assignee
Orca Data Technology Xian Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Orca Data Technology Xian Co Ltd
Priority to CN202010094584.9A
Publication of CN111338845A
Application granted
Publication of CN111338845B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore

Abstract

The invention discloses a fine-grained local data protection method comprising the following steps: step S1, writing user IO into a data storage system cluster; step S2, splitting the IO in the data storage system cluster into data blocks; step S3, storing the address information of each data block in the corresponding leaf node; step S4, a daemon process judging whether a user has triggered a rollback task; step S5, the data storage system traversing all leaf nodes of the LUN-TREE; step S8, when the time granularity or sequence number is successfully matched, sorting the address information in time or sequence-number order; and step S9, obtaining the required data state from the sorted address information and rolling the stored data back to that state, thereby completing the rollback. The method achieves data rollback at the minimum time granularity as well as data rollback according to the IO stream sequence.

Description

Fine-grained local data protection method
[ technical field ]
The invention belongs to the technical field of data processing, and particularly relates to a fine-grained local data protection method.
[ background of the invention ]
Enterprise-level storage systems typically require redundant protection of local data. Local data protection has several typical features, such as snapshots and clones.
A snapshot freezes the specified data on the storage system as it stands at a given moment (T0), much as a camera captures an image when the shutter is pressed. At any later time, the storage system can restore the data to its state at T0 through a snapshot rollback. The two main snapshot techniques are copy-on-write (COW) and redirect-on-write (ROW). Copy-on-write (COW) works as follows: when the snapshot is created, no data is copied, so snapshot creation completes almost instantly. When the host issues a new write request, the new data is not written immediately; the existing data in the source volume is copied first, and only then is the new data written. This copy is needed only the first time a given piece of source-volume data is updated after the snapshot is created.
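The copy-on-write behaviour described above can be illustrated with a minimal sketch. It is only a toy model under stated assumptions: the class `CowVolume`, its methods, and the in-memory list of blocks are hypothetical names chosen for illustration, not part of the patent or of any real storage product.

```python
class CowVolume:
    """Toy copy-on-write volume; all names here are illustrative assumptions."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # source volume contents
        self.snapshot = None         # block index -> data preserved at T0

    def create_snapshot(self):
        # No data is copied at creation time, so this is effectively instant.
        self.snapshot = {}

    def write(self, index, data):
        # Only the first overwrite after the snapshot copies the old block aside.
        if self.snapshot is not None and index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
        self.blocks[index] = data

    def rollback(self):
        if self.snapshot is None:
            raise RuntimeError("no snapshot to roll back to")
        # Put back every block that was preserved since the snapshot.
        for index, old in self.snapshot.items():
            self.blocks[index] = old
        self.snapshot = None


vol = CowVolume([b"a", b"b", b"c"])
vol.create_snapshot()
vol.write(1, b"x")   # first write to block 1: old data is copied aside
vol.write(1, b"y")   # second write to block 1: no further copy
vol.rollback()       # block 1 holds b"b" again
```

The extra copy on the first write to each block is the additional read-and-write cost that the background section attributes to COW.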
Redirect-on-write (ROW) works as follows: the data storage address is divided into a logical address and an actual address. Before a snapshot is created, the logical address coincides with the actual address. When the snapshot is created, no data is copied, so snapshot creation completes almost instantly. After the snapshot is created, a new write request from the host is written to a new storage area, and the logical address pointer of the modified data is updated to point to that new area. To roll back to the snapshot, it is therefore enough to point the logical address back to the original location, which yields the snapshot data without affecting the new data.
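A minimal sketch of the redirect-on-write idea follows. The class `RowVolume`, the `mapping` table, and the append-only `store` list are assumptions made for illustration; the sketch only shows the logical-to-actual address indirection that the paragraph describes.

```python
class RowVolume:
    """Toy redirect-on-write volume; all names here are illustrative assumptions."""

    def __init__(self, blocks):
        self.store = list(blocks)                 # actual storage areas
        self.mapping = list(range(len(blocks)))   # logical address -> actual address
        self.snapshot_mapping = None

    def create_snapshot(self):
        # Only the pointer table is frozen; no data is copied.
        self.snapshot_mapping = list(self.mapping)

    def write(self, logical, data):
        if self.snapshot_mapping is None:
            # No snapshot yet: logical and actual addresses stay aligned.
            self.store[self.mapping[logical]] = data
        else:
            # After a snapshot: write to a new area and redirect the pointer.
            self.store.append(data)
            self.mapping[logical] = len(self.store) - 1

    def read(self, logical):
        return self.store[self.mapping[logical]]

    def rollback(self):
        if self.snapshot_mapping is None:
            raise RuntimeError("no snapshot to roll back to")
        # Point the logical addresses back to their original locations.
        self.mapping = list(self.snapshot_mapping)


vol = RowVolume([b"a", b"b"])
vol.create_snapshot()
vol.write(0, b"x")   # redirected to a new area; b"a" is left untouched
vol.rollback()       # logical block 0 reads b"a" again; b"x" still exists in the store
```

In this sketch rollback touches only the pointer table, which is why the new data written after the snapshot remains intact.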
Copy-on-write (COW) has the advantages of short snapshot creation time and small storage overhead. However, because it incurs an extra read-and-write operation, write performance on the source volume suffers considerably. In addition, when snapshots are hierarchically related, the volume of copy-on-write data grows substantially, which can significantly degrade storage system performance. Algorithmic complexity also rises sharply in this case, and the risk of data loss rises with it.
Redirect-on-write (ROW) handles these disadvantages better, but snapshots created with redirect-on-write are tied to points in time. For example, if snapshots are created at times T0, T1 and T2, a rollback can only target T0, T1 or T2. If the time granularity is too small, the system stores a large number of useless snapshots; if it is too large, the requirements of fine-grained data management are hard to meet. In addition, redirect-on-write cannot create snapshots according to the IO stream sequence, and therefore cannot roll back according to the IO stream sequence either.
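The point-in-time limitation can be made concrete with a small sketch. The function name `nearest_snapshot` and the numeric times are hypothetical; the example only shows that a request for a moment between two snapshots has to fall back to an earlier snapshot.

```python
from bisect import bisect_right


def nearest_snapshot(snapshot_times, requested):
    """Return the latest snapshot time <= requested, or None if none exists.

    snapshot_times must be sorted in ascending order.
    """
    i = bisect_right(snapshot_times, requested)
    return snapshot_times[i - 1] if i else None


snapshots = [100, 200, 300]              # hypothetical T0, T1, T2
target = nearest_snapshot(snapshots, 250)
# target is 200: every write made between 200 and 250 cannot be recovered
```

Shrinking the interval between snapshots narrows this gap but multiplies the number of stored snapshots, which is the trade-off described above.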
[ summary of the invention ]
The invention aims to provide a fine-grained local data protection method, which realizes data rollback with minimum time granularity and data rollback according to an IO stream sequence.
The invention adopts the following technical scheme: a fine-grained local data protection method comprises the following steps:
step S1, writing user IO into the data storage system cluster;
the data storage system cluster comprises a plurality of LUN-TREEs; each LUN-TREE consists of LUNs organized in a multilayer tree structure and comprises a plurality of leaf nodes; each leaf node contains the address information of a stored 4k data block, and further contains a member that records the timestamp or sequence number of the IO write;
the data storage system cluster also comprises a daemon process, which judges whether a user has triggered a rollback task.
Step S2, carrying out data splitting on IO in the data storage system cluster, and splitting the IO into data blocks with the size of 4 k;
step S3, storing the address information of each data block in step S2 in the corresponding leaf node;
step S4, the daemon process judges whether a user triggers a rollback task, if not, the process is ended; if the user triggers the rollback task, sequentially executing the steps S5-S9;
step S5, the data storage system traverses all leaf nodes of the LUN-TREE;
step S6, matching the sequence numbers in the leaf nodes according to the time granularity or IO write sequence number selected by the user;
step S7, judging whether the sequence number in the leaf node is successfully matched with the time granularity or IO write sequence number required by the user;
step S8, when the time granularity or the serial number is successfully matched, sorting the address information according to the time or serial number sequence;
and step S9, obtaining the required data state by the sorted address information, and rolling back the stored data to the data state, thereby finishing the rolling back.
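The write path in steps S1 to S3 can be sketched as follows. This is a hedged illustration, not the patented implementation: the names `LeafNode`, `Lun`, `write_io`, the fields `lba` and `address`, the flat list standing in for the LUN-TREE leaves, and the choice of one sequence number per IO are all assumptions introduced here.

```python
from dataclasses import dataclass
from itertools import count

BLOCK_SIZE = 4096  # the 4k block size from the text


@dataclass
class LeafNode:
    lba: int          # logical block number inside the LUN (assumed field)
    address: int      # where the 4k block was stored (assumed field)
    seq: int          # IO write sequence number (the "member" in the text)
    timestamp: float  # IO write time


class Lun:
    """Flat stand-in for one LUN-TREE; a real tree is not modelled here."""

    def __init__(self):
        self.leaves = []        # leaf nodes in write order
        self.blocks = []        # backing store; list index = block address
        self._seq = count(1)

    def write_io(self, offset, data, timestamp):
        """S1-S3: accept one IO, split it into 4k blocks, record one leaf each."""
        if offset % BLOCK_SIZE:
            raise ValueError("offset must be 4k aligned in this sketch")
        seq = next(self._seq)   # assumption: one sequence number per IO
        for i in range(0, len(data), BLOCK_SIZE):
            chunk = data[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            self.blocks.append(chunk)
            self.leaves.append(LeafNode(lba=(offset + i) // BLOCK_SIZE,
                                        address=len(self.blocks) - 1,
                                        seq=seq,
                                        timestamp=timestamp))
        return seq


lun = Lun()
lun.write_io(0, b"x" * 10000, timestamp=1.0)   # one IO becomes three 4k blocks
```

Old blocks are never overwritten in this sketch; each write appends new leaves, which is what allows a later rollback to any earlier sequence number or timestamp.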
Further, the cluster of the data storage system is formed by aggregating storage spaces in a plurality of storage devices into a storage pool capable of providing a uniform access interface and a management interface for the application server.
The invention also discloses a data storage system cluster, which is a storage pool formed by aggregating the storage spaces of a plurality of storage devices and capable of providing a uniform access interface and management interface for an application server. The cluster comprises a plurality of LUN-TREEs; each LUN-TREE consists of LUNs organized in a multilayer tree structure and comprises a plurality of leaf nodes; each leaf node is used for storing a 4k data block and contains a member that records the timestamp or sequence number of the IO write;
the data storage system cluster also comprises a daemon process, which judges whether a user has triggered a rollback task.
The invention has the following beneficial effects: according to user requirements, the method can selectively identify and efficiently complete data rollback in real time as IO is written, retrieving data on demand and saving resources. In a specific implementation, the data blocks written by each IO in the current cluster are compared at fine granularity by time sequence, and the rollback requested by the user determines whether to roll back to a certain IO or to a specific 4K data block of that IO, so the method is simple to operate and easy to implement. When a user triggers a rollback request, the target leaf nodes can be located by traversing the LUN-TREE data management structure and using the time sequence, which is simple, flexible, accurate and efficient.
[ description of the drawings ]
Fig. 1 is a schematic flow chart of a fine-grained local data protection method according to the present invention.
Fig. 2 is a schematic diagram of the connection of the components.
FIG. 3 is a schematic diagram of the LUN-TREE structure according to the present invention.
[ detailed description ]
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. Each 4k data block obtains a timestamp or a sequence number when it is allocated, so when data is rolled back according to that timestamp or sequence number, the rollback granularity is very fine, whether expressed in time or in write order.
The embodiment of the invention discloses a fine-grained local data protection method, which comprises the following steps:
step S1, writing user IO into the data storage system cluster;
the data storage system cluster comprises a plurality of LUN-TREEs; each LUN-TREE consists of LUNs organized in a multilayer tree structure and comprises a plurality of leaf nodes; each leaf node contains the address information of a stored 4k data block, and further contains a member that records the timestamp or sequence number of the IO write.
The data storage system cluster also comprises a daemon process, which judges whether a user has triggered a rollback task.
Step S2, carrying out data splitting on IO in the data storage system cluster, and splitting the IO into data blocks with the size of 4 k;
step S3, storing the address information of each data block in step S2 in the corresponding leaf node;
step S4, the daemon process judges whether a user triggers a rollback task, if not, the process is ended; if the user triggers the rollback task, sequentially executing the steps S5-S9;
step S5, the data storage system traverses all leaf nodes of the LUN-TREE;
step S6, matching the sequence numbers in the leaf nodes according to the time granularity or IO write sequence number selected by the user;
step S7, judging whether the sequence number in the leaf node is successfully matched with the time granularity or IO write sequence number required by the user;
step S8, when the time granularity or the serial number is successfully matched, sorting the address information according to the time or serial number sequence;
and step S9, obtaining the required data state by the sorted address information, and rolling back the stored data to the data state, thereby finishing the rolling back.
Each 4k data block obtains a timestamp or a sequence number when it is allocated. When data is rolled back based on that timestamp or sequence number, the rollback granularity is very fine, whether expressed in time or in write order. Data rollback with minimal time granularity can therefore be achieved.
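The rollback path in steps S5 to S9 can be sketched along the following lines. Several points are assumptions rather than statements from the patent: the `Leaf` tuple and its fields, the function name `rollback_view`, and in particular the reading of "matching" as "written at or before the chosen sequence number or time".

```python
from typing import NamedTuple


class Leaf(NamedTuple):
    lba: int          # logical block number (assumed field)
    address: int      # stored location of the 4k block (assumed field)
    seq: int          # IO write sequence number
    timestamp: float  # IO write time


def rollback_view(leaves, *, max_seq=None, max_time=None):
    """Return {lba: address} for the data state at the chosen point."""
    if (max_seq is None) == (max_time is None):
        raise ValueError("give exactly one of max_seq or max_time")
    # S5-S7: traverse every leaf and keep those that match the user's choice.
    if max_seq is not None:
        matched = [leaf for leaf in leaves if leaf.seq <= max_seq]
        matched.sort(key=lambda leaf: leaf.seq)          # S8: sequence order
    else:
        matched = [leaf for leaf in leaves if leaf.timestamp <= max_time]
        matched.sort(key=lambda leaf: leaf.timestamp)    # S8: time order
    # S9: replay in order so the latest matching write of each block wins.
    state = {}
    for leaf in matched:
        state[leaf.lba] = leaf.address
    return state


leaves = [
    Leaf(lba=0, address=0, seq=1, timestamp=1.0),
    Leaf(lba=1, address=1, seq=1, timestamp=1.0),
    Leaf(lba=0, address=2, seq=2, timestamp=2.0),
    Leaf(lba=1, address=3, seq=3, timestamp=3.0),
]
state_after_io_2 = rollback_view(leaves, max_seq=2)   # block 0 from IO 2, block 1 from IO 1
```

Because every leaf carries its own sequence number and timestamp, the rollback target can be any IO rather than one of a few pre-created snapshot points.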
Further, the cluster of the data storage system is formed by aggregating storage spaces in a plurality of storage devices into a storage pool capable of providing a uniform access interface and a management interface for the application server.
The invention also discloses a data storage system cluster, which is a storage pool formed by aggregating the storage spaces of a plurality of storage devices and capable of providing a uniform access interface and management interface for an application server. The cluster comprises a plurality of LUN-TREEs; each LUN-TREE consists of LUNs organized in a multilayer tree structure and comprises a plurality of leaf nodes; each leaf node is used for storing a 4k data block and contains a member that records the timestamp or sequence number of the IO write;
the data storage system cluster also comprises a daemon process, which judges whether a user has triggered a rollback task.
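The daemon check in step S4 might look roughly like the sketch below. The use of a `queue.Queue` to carry rollback requests, and the names `daemon_step` and `do_rollback`, are assumptions for illustration; the patent does not say how the trigger reaches the daemon.

```python
import queue


def daemon_step(requests, do_rollback):
    """One pass of the S4 check. Returns True if a rollback was run."""
    try:
        target = requests.get_nowait()
    except queue.Empty:
        return False        # no rollback task was triggered: nothing to do
    do_rollback(target)     # stands in for steps S5-S9
    return True


requests = queue.Queue()
performed = []
daemon_step(requests, performed.append)   # no request pending
requests.put(7)                           # user asks to roll back to sequence number 7
daemon_step(requests, performed.append)   # rollback callback runs with target 7
```

A real daemon would call such a check in a loop or block on the queue; the single-pass form is used here only to keep the example deterministic.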
The present invention includes the following components, as shown in fig. 2:
Daemon process: when the daemon process receives a user request that triggers a rollback, the system starts the rollback, compares against the sequence numbers in the leaf nodes, and can roll back at the time granularity required by the user.
Leaf node: each leaf node of the storage data structure corresponds to a 4k data block. A sequence-number member in each leaf node corresponds to the IO write order and time. When data changes, the number of leaf nodes increases.
Time series: a complete, verifiable piece of data, usually a character sequence, that shows a piece of data existed before a particular time and uniquely identifies that moment.
Data storage system cluster: the storage spaces of the storage devices are aggregated into a storage pool that provides a uniform access interface and management interface to the application server. Through the access interface, applications can transparently access and use the disks on all storage devices, so that the performance of the storage devices and the utilization of the disks are fully exploited. Data is stored to and read from multiple storage devices according to certain rules to obtain higher concurrent access performance.
LUN: short for Logical Unit Number. A LUN is an independent storage unit on a storage device that can be recognized by an application server. The space of a LUN comes from the storage pool (Pool), and the space of the Pool comes from the hard disks that make up the disk array. From the perspective of the application server, a LUN can be viewed as a usable hard disk. Managing and organizing logical addresses in the form of a tree is called LUN-TREE; as shown in fig. 3, a LUN can be built with a basic three-layer tree structure, with at most seven layers.
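One way to picture a three-layer LUN-TREE is the sketch below. It assumes a root layer and a middle layer that index into a leaf layer, with a fixed fan-out per node; `FANOUT`, `INDEX_LEVELS`, `path_for` and `LunTree` are hypothetical names, and the patent does not specify the fan-out or how the layers are addressed.

```python
FANOUT = 512        # assumed number of children per index node
INDEX_LEVELS = 2    # root layer + middle layer; leaves form the third layer


def path_for(lba, levels=INDEX_LEVELS, fanout=FANOUT):
    """Split a logical block number into one child index per index layer."""
    if not 0 <= lba < fanout ** levels:
        raise ValueError("lba out of range for this tree")
    path = []
    for _ in range(levels):
        lba, idx = divmod(lba, fanout)
        path.append(idx)
    return path[::-1]   # root-layer index first


class LunTree:
    def __init__(self):
        self.root = {}

    def insert(self, lba, leaf):
        *inner, last = path_for(lba)
        node = self.root
        for idx in inner:
            node = node.setdefault(idx, {})
        # Every write adds a leaf, so the leaf count grows as data changes.
        node.setdefault(last, []).append(leaf)

    def leaves(self, lba):
        node = self.root
        for idx in path_for(lba):
            node = node.get(idx)
            if node is None:
                return []
        return node


tree = LunTree()
tree.insert(512, "leaf written by IO 1")
tree.insert(512, "leaf written by IO 2")   # same logical block, newer leaf
```

Raising `INDEX_LEVELS` deepens the tree and enlarges the addressable range, which is one plausible reason for allowing up to seven layers.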
Each leaf node of the tree structure in fig. 3 corresponds to a 4k data block. During data writing, each newly added leaf node of the tree structure in the figure is written together with its time-series value. Using the time-series values, the data blocks written by each IO in the current cluster can be compared, and the rollback requested by the user, for example at the granularity of the time series, determines whether to roll back to a certain IO or to a specific 4K data block of that IO.

Claims (2)

1. A fine-grained local data protection method is characterized by comprising the following steps:
step S1, writing user IO into the data storage system cluster;
the data storage system cluster comprises a plurality of LUN-TREE, each LUN-TREE consists of LUNs with a multilayer TREE structure, each LUN-TREE comprises a plurality of leaf nodes, and each leaf node comprises address information used for storing a data block with the size of 4 k; each leaf node further comprises a member, and each member is used for corresponding to a timestamp or a serial number written by IO;
the data storage system cluster also comprises a daemon process used for judging whether a user triggers a rollback task;
step S2, carrying out data splitting on IO in the data storage system cluster, and splitting the IO into data blocks with the size of 4 k;
step S3, storing the address information of each data block in step S2 in the corresponding leaf node;
step S4, the daemon process judges whether a user triggers a rollback task, if not, the daemon process is ended; if the user triggers the rollback task, sequentially executing the steps S5-S9;
step S5, the data storage system traverses all leaf nodes of the LUN-TREE;
step S6, matching the sequence numbers in the leaf nodes according to the time granularity or IO write sequence number selected by the user;
step S7, judging whether the sequence number in the leaf node is successfully matched with the time granularity or IO write sequence number required by the user;
step S8, when the time granularity or the serial number is successfully matched, sorting the address information according to the time or serial number sequence;
step S9, obtaining the required data state by the sorted address information, and rolling back the stored data to the data state so as to finish the rolling back;
the data storage system cluster is formed by aggregating storage spaces in a plurality of storage devices into a storage pool capable of providing a uniform access interface and a management interface for an application server.
2. A data storage system cluster for the fine-grained local data protection method according to claim 1, wherein the cluster is a storage pool formed by aggregating storage spaces in a plurality of storage devices, the storage pool being capable of providing a unified access interface and management interface for application servers, and the cluster comprises a plurality of LUN-TREE,
each LUN-TREE consists of LUNs with a multilayer TREE structure, each LUN-TREE comprises a plurality of leaf nodes, and each leaf node is used for storing a data block with the size of 4 k; each leaf node comprises a member, and each member is used for corresponding to a timestamp or a serial number written by IO;
the data storage system cluster also comprises a daemon process used for judging whether a user triggers a rollback task.
CN202010094584.9A 2020-02-16 2020-02-16 Fine-grained local data protection method Active CN111338845B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010094584.9A CN111338845B (en) 2020-02-16 2020-02-16 Fine-grained local data protection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010094584.9A CN111338845B (en) 2020-02-16 2020-02-16 Fine-grained local data protection method

Publications (2)

Publication Number Publication Date
CN111338845A (en) 2020-06-26
CN111338845B (en) 2021-05-07

Family

ID=71183423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010094584.9A Active CN111338845B (en) 2020-02-16 2020-02-16 Fine-grained local data protection method

Country Status (1)

Country Link
CN (1) CN111338845B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101187948A (en) * 2007-12-20 2008-05-28 中国科学院计算技术研究所 A continuous data protection system and its realization method
CN101329642A (en) * 2008-06-11 2008-12-24 华中科技大学 Method for protecting and recovering continuous data based on time stick diary memory
CN101430657A (en) * 2008-11-17 2009-05-13 华中科技大学 Continuous data protection method
CN101777016A (en) * 2010-02-08 2010-07-14 北京同有飞骥科技有限公司 Snapshot storage and data recovery method of continuous data protection system
CN106610876A (en) * 2015-10-23 2017-05-03 中兴通讯股份有限公司 Method and device for recovering data snapshot
US10346260B1 (en) * 2015-09-30 2019-07-09 EMC IP Holding Company LLC Replication based security
CN110134551A (en) * 2019-05-21 2019-08-16 上海英方软件股份有限公司 A kind of continuous data protection method and device

Also Published As

Publication number Publication date
CN111338845A (en) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A fine-grained local data protection method

Effective date of registration: 20221207

Granted publication date: 20210507

Pledgee: Xianyang financing guarantee Limited by Share Ltd.

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2022610000796

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20231206

Granted publication date: 20210507

Pledgee: Xianyang financing guarantee Limited by Share Ltd.

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2022610000796

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A fine-grained local data protection method

Effective date of registration: 20231211

Granted publication date: 20210507

Pledgee: Xianyang financing guarantee Limited by Share Ltd.

Pledgor: Xi'an Okayun Data Technology Co.,Ltd.

Registration number: Y2023610000758
