CN104714755B - Snapshot management method and device - Google Patents

Info

Publication number
CN104714755B
Authority
CN
China
Prior art keywords
snapshot
directory
file
data
file system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310690529.6A
Other languages
Chinese (zh)
Other versions
CN104714755A (en)
Inventor
叶茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XFusion Digital Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201310690529.6A
Publication of CN104714755A
Application granted
Publication of CN104714755B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a snapshot management method and device. The snapshot management method comprises the following steps: a server receives a snapshot identifier request from a client, wherein the snapshot identifier identifies a snapshot generated by a file system of the client; the server generates the latest snapshot number of the whole file system, returns the latest snapshot number to the client as the snapshot identifier, and updates the file descriptor structure of the file system; and the server creates a tracking file corresponding to the snapshot generated by the file system, the tracking file recording the changes made to the file system after the snapshot is generated. In this way, the method and device can take a snapshot of any directory in a distributed file system that has no file layout and no global file descriptor structure table.

Description

Snapshot management method and device
Technical Field
The present invention relates to the field of data protection technologies, and in particular, to a snapshot management method and apparatus.
Background
In NAS (Network Attached Storage) applications, snapshots are increasingly important as a means of local backup. An administrator can protect the data of a file system with snapshots, taken either manually or by a scheduled task. If the current system fails, it can be quickly rolled back to a previous point in time, which reduces the probability of data loss. In addition, many remote replication applications are built on snapshots, so snapshot technology is indispensable to a NAS system.
Current snapshot implementations fall into two main types. The first is a system-level snapshot supporting the ROW (Redirect On Write) mode. System-level snapshots are not flexible enough: current NAS products are generally shared out to users as directories, so a system snapshot is too coarse-grained and a rollback affects all users. The second is a snapshot at any directory level supporting the ROW/COW (Copy On Write) mode. It is flexible to use, but it is mainly based on a distributed file system that only scans files with specified paths, and the snapshot implementation depends on the file layout and a global file descriptor structure table (Inode Table).
Therefore, there is currently no method for taking a snapshot of any directory in a distributed file system that has no file layout and no global Inode Table.
Disclosure of Invention
The main technical problem solved by the invention is to provide a snapshot management method and device that can take a snapshot of any directory in a distributed file system that has no file layout and no global file descriptor structure table.
In a first aspect, the present invention provides a snapshot management method, including: a server receives a snapshot identifier request from a client, wherein the snapshot identifier identifies a snapshot generated by a file system of the client, and the server manages the file system and the snapshots of the file system; the server generates the latest snapshot number of the whole file system, returns the latest snapshot number to the client as the snapshot identifier, and updates a file descriptor structure of the file system, wherein the file descriptor structure of the file system at least comprises the identifier of the file system, the generation time of the file system, the latest snapshot identifier of the file system, the flags of the file system, and a snapshot linked list of the file system, and the file system comprises at least one of a file and a directory; and the server creates a tracking file corresponding to the snapshot generated by the file system, wherein the tracking file records the change information of the file system after the snapshot is generated.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the method further comprises: receiving a directory snapshot access request from a client, wherein the directory snapshot access request comprises the snapshot identifier and the directory name requested to be accessed; finding the directory data object of the directory snapshot and the directory data object of the directory according to the snapshot access request, wherein each directory data object comprises file descriptors and the record items corresponding to the file descriptors; searching the directory data object of the directory snapshot, reading the record items corresponding to file descriptors whose generation time is less than the requested snapshot identifier and whose snapshot identifier is greater than the requested snapshot identifier, and returning those record items to the client; and searching the directory data object of the directory, reading the record items corresponding to file descriptors whose generation time is less than or equal to the requested snapshot identifier, and returning those record items to the client.
With reference to the first aspect, in a second possible implementation manner of the first aspect, the method further comprises: receiving a file snapshot access request from a client, wherein the file snapshot access request comprises the snapshot identifier and the file name requested to be accessed; determining the data blocks to be read according to the snapshot requested by the snapshot access request; for each data block to be read, searching for the data block starting from the requested snapshot and proceeding to each following snapshot in the order in which the snapshots were generated, until the data block is read; and returning all the data blocks that were read to the client.
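The directory snapshot lookup described above can be sketched as two filters, one over the snapshot's directory data object and one over the live directory's data object. This is a minimal illustration, not the patented implementation: the `Entry` type, its field names, and the function name are hypothetical, and the strict and non-strict comparisons simply follow the wording of the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """Hypothetical record item of a directory data object."""
    name: str
    birth_moment: int  # GLS value when the entry was created or last changed
    snap_id: int = 0   # only meaningful for entries kept in the snapshot object


def read_directory_snapshot(
    snapshot_entries: List[Entry],
    current_entries: List[Entry],
    requested: int,
) -> List[Entry]:
    # Entries preserved in the snapshot object: created before the requested
    # snapshot and tagged with a later snapshot identifier (as the text states).
    from_snapshot = [
        e for e in snapshot_entries
        if e.birth_moment < requested and e.snap_id > requested
    ]
    # Entries still in the live directory that already existed at that time.
    from_current = [e for e in current_entries if e.birth_moment <= requested]
    return from_snapshot + from_current


snapshot_entries = [Entry("old.txt", 3, 7), Entry("gone.txt", 6, 9)]
current_entries = [Entry("a.txt", 2), Entry("b.txt", 8)]
visible = read_directory_snapshot(snapshot_entries, current_entries, requested=5)
names = [e.name for e in visible]
```

Here `gone.txt` and `b.txt` are filtered out because both were created after the requested snapshot 5.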
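A minimal sketch of the per-block search described above, assuming each snapshot stores only the blocks it preserved and that a block found in no snapshot is read from the live file. The dictionary layout, the fallback to the live file, and the function name are assumptions made for illustration.

```python
from typing import Dict, Iterable


def read_snapshot_blocks(
    requested: int,
    snapshots: Dict[int, Dict[int, bytes]],  # snap_id -> {block_no: data}
    live: Dict[int, bytes],                  # blocks of the current file
    block_numbers: Iterable[int],
) -> Dict[int, bytes]:
    # Walk from the requested snapshot towards newer ones, in generation order.
    chain = sorted(s for s in snapshots if s >= requested)
    result: Dict[int, bytes] = {}
    for block_no in block_numbers:
        for snap_id in chain:
            data = snapshots[snap_id].get(block_no)
            if data is not None:
                result[block_no] = data
                break
        else:
            # Assumption: a block never preserved by any snapshot is unchanged,
            # so the live file still holds the right version.
            result[block_no] = live[block_no]
    return result


snapshots = {13: {0: b"A13"}, 14: {1: b"B14"}}
live = {0: b"A", 1: b"B", 2: b"C"}
view_13 = read_snapshot_blocks(13, snapshots, live, [0, 1, 2])
view_14 = read_snapshot_blocks(14, snapshots, live, [0, 1])
```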
With reference to the first aspect, in a third possible implementation manner of the first aspect: the method further comprises the following steps: receiving a snapshot deleting request of a client; searching a tracking file of the snapshot, and checking whether the snapshot requested to be deleted has a previous snapshot; when the snapshot requested to be deleted has a previous snapshot, if the snapshot requested to be deleted has data shared with the previous snapshot, copying the shared data to the previous snapshot, recording change information into a tracking file corresponding to the previous snapshot, and then deleting the snapshot requested to be deleted and the tracking file corresponding to the snapshot; if the snapshot requested to be deleted does not have data shared with the previous snapshot, directly deleting the snapshot requested to be deleted and the tracking file corresponding to the snapshot; and if the snapshot requested to be deleted has no previous snapshot, directly deleting the snapshot requested to be deleted and the corresponding tracking file.
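The deletion rule above can be illustrated with a small in-memory model. It assumes that "shared data" means blocks stored in the snapshot being deleted for which the previous snapshot holds no copy of its own; that interpretation, the dictionary layout, and the `"inherit"` trace record are assumptions, not details from the source.

```python
from typing import Dict, List, Tuple

Blocks = Dict[int, Dict[int, bytes]]  # snap_id -> {block_no: data}
Traces = Dict[int, List[Tuple]]       # snap_id -> tracking-file records


def delete_snapshot(snap_id: int, blocks: Blocks, traces: Traces) -> None:
    earlier = [s for s in blocks if s < snap_id]
    if earlier:
        prev = max(earlier)  # the snapshot generated just before this one
        for block_no, data in blocks[snap_id].items():
            # Assumption: the previous snapshot depends on this block only if
            # it has no copy of its own.
            if block_no not in blocks[prev]:
                blocks[prev][block_no] = data
                traces[prev].append(("inherit", block_no, snap_id))
    # With or without a previous snapshot, the snapshot and its tracking
    # file are removed at the end.
    del blocks[snap_id]
    del traces[snap_id]


blocks = {13: {0: b"a"}, 14: {0: b"x", 1: b"y"}}
traces = {13: [], 14: [("write", 0)]}
delete_snapshot(14, blocks, traces)
```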
With reference to the first aspect, in a fourth possible implementation manner of the first aspect: the file system comprises a first directory and a second directory, the first directory and the second directory generate snapshots, and the method further comprises the following steps: receiving a moving instruction of a client, wherein the moving instruction indicates that data in a first directory is moved to a second directory, and the data comprises at least one of sub-directories and sub-files in the first directory; deleting entries of the data of the first directory; searching all snapshot information upwards, and recording all the found snapshot information in a snapshot linked list of a file descriptor of the data; generating a snapshot of a record item of the data in a directory data object of the first directory, and recording change information of the first directory into a corresponding tracking file, wherein the directory data object of the first directory comprises a file descriptor of the first directory and a record item corresponding to the file descriptor; copying and generating a record item of the data under the second directory, and collecting all snapshot information upwards to update a snapshot chain table of a file descriptor of the data and the generation time of the data; and recording the change information of the second directory into a tracking file corresponding to the second directory snapshot.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect: the file system includes a hard-linked file having a plurality of parent directories, each of the plurality of parent directories having generated a snapshot, and the method further includes: for the hard link file, recording the relationship between the hard link file extension and the parent directory extension thereof through a birth table; when a request for accessing the hard link file through a specified path is received, all father directories of the hard link file are found according to the birth table; and generating all paths for all the parent directories to access the hard link file and snapshots corresponding to all the paths.
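One way to picture the birth table is as a map from a hard-linked file to every (parent directory, name) pair under which it is reachable. The class below is a hypothetical sketch; the source does not specify the table's layout, and the method names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class BirthTable:
    """Hypothetical birth table: file id -> set of (parent directory, name)."""

    def __init__(self) -> None:
        self._links: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add_link(self, file_id: str, parent_dir: str, name: str) -> None:
        self._links[file_id].add((parent_dir, name))

    def parents(self, file_id: str) -> List[str]:
        # All parent directories of the hard-linked file.
        return sorted({parent for parent, _ in self._links.get(file_id, set())})

    def all_paths(self, file_id: str) -> List[str]:
        # Every path through which the file can be reached.
        return sorted(
            f"{parent.rstrip('/')}/{name}"
            for parent, name in self._links.get(file_id, set())
        )


table = BirthTable()
table.add_link("fid-7", "/projects/a", "report.txt")
table.add_link("fid-7", "/archive/", "report-old.txt")
paths = table.all_paths("fid-7")
```

Given the full list of paths, a server could then look up the snapshots of each parent directory along every path.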
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the method further comprises: receiving a primary directory rollback instruction, locking the primary directory to be rolled back, and determining the snapshot to roll back to; checking the snapshot linked list of the primary directory to be rolled back, and finding all snapshots generated after the snapshot to roll back to; and rolling back the files of the primary directory one by one in descending order of snapshot identifier until the snapshot to roll back to is reached, and then unlocking the primary directory.
With reference to the sixth possible implementation manner of the first aspect, in a seventh possible implementation manner of the first aspect, the step of rolling back the files of the primary directory one by one in descending order of snapshot identifier until the snapshot to roll back to is reached includes: finding the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back, scanning the tracking file of that snapshot, and determining whether that snapshot is a file snapshot or a directory snapshot; when the snapshot with the maximum snapshot identifier is a file snapshot, deleting the stripe object of that snapshot, and renaming the corresponding stripe object in the file of the primary directory as the stripe object of that snapshot; deleting the tracking file of that snapshot; and when the snapshot with the maximum snapshot identifier is the snapshot to roll back to, ending the rollback operation, and otherwise returning to the step of finding the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back and repeating the subsequent steps.
With reference to the seventh possible implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect: when the snapshot with the maximum snapshot identifier is a directory snapshot, scanning the directory data object of the primary directory snapshot, wherein the directory data object comprises file descriptors and the record items corresponding to the file descriptors; finding, in the directory data object, the snapshot whose identifier equals the maximum snapshot identifier, and then determining whether the size in the file descriptor of that snapshot differs from the size in the file descriptor of the current version of the primary directory; when it differs, truncating the current version of the primary directory to the size of that snapshot, and otherwise copying the record item of that snapshot into the directory data object of the current version of the primary directory; scanning the directory data object of the current version of the primary directory, wherein the directory data object comprises file descriptors and the record items corresponding to the file descriptors; deleting the record items whose generation time is greater than the maximum snapshot identifier; and when the snapshot with the maximum snapshot identifier is the snapshot to roll back to, ending the rollback operation, and otherwise returning to the step of finding the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back and repeating the subsequent steps.
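The outer rollback loop of the sixth to eighth implementation manners can be sketched as follows. The per-snapshot work (stripe objects, record items, tracking files) is abstracted into a caller-supplied `apply_snapshot` callback, which is a hypothetical name. Treating the target snapshot as the last one processed is one reading of the text, in which the loop ends only after the snapshot to roll back to has itself been handled.

```python
import threading
from typing import Callable, List


def rollback_order(snap_list: List[int], target: int) -> List[int]:
    """Snapshots to process, from the largest identifier down to the target."""
    if target not in snap_list:
        raise ValueError(f"snapshot {target} is not in the snapshot list")
    return sorted((s for s in snap_list if s >= target), reverse=True)


def rollback_directory(
    lock: threading.Lock,
    snap_list: List[int],
    target: int,
    apply_snapshot: Callable[[int], None],
) -> None:
    # The primary directory stays locked for the whole rollback.
    with lock:
        for snap_id in rollback_order(snap_list, target):
            apply_snapshot(snap_id)


applied: List[int] = []
rollback_directory(threading.Lock(), [11, 12, 13, 15], 12, applied.append)
```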
With reference to the first aspect, in a ninth possible implementation manner of the first aspect: the method further comprises the following steps: receiving a modification operation instruction for modifying metadata of a file system which has generated a snapshot, wherein the metadata of the file system comprises at least one of directory metadata and file metadata of the file system; and copying the metadata of the file system to be modified when writing, modifying the file descriptor of the file system, and recording the modification information into the trace file of the snapshot of the file system.
With reference to the first aspect, in a tenth possible implementation manner of the first aspect: the method further comprises the following steps: receiving a modification operation instruction for modifying data of a file system which has generated a snapshot, wherein the data of the file system comprises at least one of directory data and file data of the file system; and performing a redirection process during writing of the modified data of the file system, then performing data modification of the file system, and recording the modification information into a trace file of the snapshot of the file system.
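The difference between the two write paths, copy-on-write for metadata and redirect-on-write for data, can be shown with a toy model. All names here (`Snapshot`, `modify_metadata`, `modify_data`, the `obj-N` object names) are hypothetical, and the tracking file is reduced to a Python list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Snapshot:
    snap_id: int
    metadata: Optional[dict] = None                           # filled by copy-on-write
    block_map: Dict[int, str] = field(default_factory=dict)   # block_no -> object name
    trace: List[Tuple] = field(default_factory=list)          # stand-in for the tracking file


def modify_metadata(live_meta: dict, snap: Snapshot, key: str, value) -> None:
    # Copy-on-write: preserve the old metadata once, then change it in place.
    if snap.metadata is None:
        snap.metadata = dict(live_meta)
    live_meta[key] = value
    snap.trace.append(("metadata", key))


def modify_data(live_map: Dict[int, str], store: Dict[str, bytes],
                snap: Snapshot, block_no: int, data: bytes) -> None:
    # Redirect-on-write: the old object is left untouched and handed to the
    # snapshot; the new data is written to a fresh object.
    old_obj = live_map.get(block_no)
    if old_obj is not None and block_no not in snap.block_map:
        snap.block_map[block_no] = old_obj
    new_obj = f"obj-{len(store)}"
    store[new_obj] = data
    live_map[block_no] = new_obj
    snap.trace.append(("data", block_no))


store = {"obj-0": b"old"}
live_map = {0: "obj-0"}
live_meta = {"mode": 0o644}
snap = Snapshot(13)
modify_data(live_map, store, snap, 0, b"new")
modify_metadata(live_meta, snap, "mode", 0o600)
```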
With reference to the first aspect, in an eleventh possible implementation manner of the first aspect: the method further comprises the following steps: and obtaining the increment through the inter-snapshot comparison, wherein the increment comprises the increment of data and the increment of metadata.
With reference to the eleventh possible implementation manner of the first aspect, in a twelfth possible implementation manner of the first aspect: the data increment comprises an increment generated by modifying writes and an increment generated by appending writes, wherein the increment generated by modifying writes is obtained by scanning the tracking file of the snapshot, and the increment generated by appending writes is obtained by comparing the file descriptor of the file with the file descriptor of the corresponding snapshot.
With reference to the eleventh possible implementation manner of the first aspect, in a thirteenth possible implementation manner of the first aspect: the metadata increment is obtained by comparing the generation time in the file descriptor of the snapshot with that in the file descriptor of the corresponding non-snapshot version, or by checking the flags in the file descriptor of the snapshot.
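A minimal sketch of how the two kinds of increment might be computed, assuming the tracking file is a list of `(kind, target)` records and that file sizes are available from the two file descriptors. The record format and both function names are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def data_increment(
    trace: List[Tuple[str, object]],
    snapshot_size: int,
    current_size: int,
) -> Tuple[List[int], Optional[Tuple[int, int]]]:
    # Modified blocks come from scanning the tracking file.
    modified = sorted({target for kind, target in trace if kind == "data"})
    # Appended bytes come from comparing the sizes in the two file descriptors.
    appended = (snapshot_size, current_size) if current_size > snapshot_size else None
    return modified, appended


def metadata_changed(snapshot_birth_moment: int, current_birth_moment: int) -> bool:
    # A newer generation time on the live inode means its metadata changed
    # after the snapshot was taken.
    return current_birth_moment > snapshot_birth_moment


trace = [("data", 3), ("metadata", "mode"), ("data", 1), ("data", 3)]
modified, appended = data_increment(trace, snapshot_size=4096, current_size=10240)
```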
In a second aspect, a snapshot processing apparatus is provided, the apparatus comprising a receiving module, a returning module, and a creating module, wherein: the receiving module is configured to receive a snapshot identifier request from the client, the snapshot identifier identifying a snapshot generated by a file system of the client; the returning module is configured to, in response to the snapshot identifier request received by the receiving module, generate the latest snapshot number of the whole file system, return the latest snapshot number to the client as the snapshot identifier, and update a file descriptor structure of the file system, wherein the file descriptor structure at least comprises the identifier of the file system, the generation time of the file system, the latest snapshot identifier of the file system, the flags of the file system, and a snapshot linked list of the file system, and the file system comprises at least one of a file and a directory; and the creating module is configured to create a tracking file corresponding to the snapshot generated by the file system, the tracking file recording the change information of the file system after the snapshot is generated.
With reference to the second aspect, in a first possible implementation manner of the second aspect: the device further comprises a snapshot access processing module, wherein the snapshot access processing module is used for receiving a directory snapshot access request from a client, the directory snapshot access request comprises a snapshot identifier and a directory name which are requested to be accessed, a directory data object of a directory snapshot and a directory data object of a directory are found according to the snapshot access request, and the directory data object comprises a file descriptor and a record item corresponding to the file descriptor; searching a directory data object of the directory snapshot, reading a record item corresponding to a file descriptor, wherein the generation time of the record item is less than the snapshot identifier requested to be accessed, and the snapshot identifier is greater than the snapshot identifier requested to be accessed, returning the record item to the client, searching the directory data object of the directory, reading a record item corresponding to a file descriptor, wherein the generation time of the record item is less than or equal to the snapshot identifier requested to be accessed, and returning the record item to the client; or the snapshot access processing module is further configured to receive a file snapshot access request from a client, where the file snapshot access request includes a snapshot identifier and a file name that request access, determine, according to a snapshot that the snapshot access request requests to access, a data block that needs to be read, and, for each data block that needs to be read, find, from the current snapshot to the next according to a snapshot generation sequence, until the data block that needs to be read is read; and returning all the read data blocks needing to be read to the client.
With reference to the second aspect, in a second possible implementation manner of the second aspect: the device further comprises a snapshot deleting module, wherein the snapshot deleting module is used for receiving a snapshot deleting request of the client, searching a tracking file of the snapshot, checking whether the snapshot requested to be deleted has a previous snapshot or not, when the snapshot requested to be deleted has a previous snapshot, if the snapshot requested to be deleted has data shared with the previous snapshot, copying the shared data to the previous snapshot, and recording change information into a tracking file corresponding to the previous snapshot, and then deleting the snapshot requested to be deleted and the corresponding tracking file, if the snapshot requested to be deleted does not have data shared with the previous snapshot, directly deleting the snapshot requested to be deleted and the corresponding tracking file, and if the snapshot requested to be deleted does not have the previous snapshot, directly deleting the snapshot requested to be deleted and the corresponding tracking file.
With reference to the second aspect, in a third possible implementation manner of the second aspect: the file system comprises a first directory and a second directory, the first directory and the second directory generate snapshots, the apparatus further comprises a moving module, the moving module is configured to receive a moving instruction of a client, the moving instruction instructs to move data under the first directory to the second directory, the data includes at least one of sub-directories and sub-files under the first directory, delete a record item of the data of the first directory, look up all snapshot information, record all found snapshot information in a snapshot linked list of a file descriptor of the data, generate a snapshot of the record item of the data in a directory data object of the first directory, and record change information of the first directory in a corresponding trace file of the first directory, the directory data object of the first directory includes the file descriptor of the first directory and a record item corresponding to the file descriptor, and copying and generating a record item of the data under the second directory, collecting all snapshot information upwards to update a snapshot chain table of a file descriptor of the data and the generation time of the data, and recording the change information of the second directory into a tracking file corresponding to the snapshot of the second directory.
With reference to the second aspect, in a fourth possible implementation manner of the second aspect: the file system comprises a hard link file, the hard link file is provided with a plurality of father directories, the father directories generate snapshots, the device further comprises a hard link management module, the hard link management module is used for recording the relation between the hard link file extension name and the father directory extension name thereof through a birth table, searching all father directories of the hard link file according to the birth table when receiving a request for accessing the hard link file through a specified path, and generating all paths for accessing the hard link file by all the father directories and snapshots corresponding to all the paths.
With reference to the second aspect, in a fifth possible implementation manner of the second aspect: the device further comprises a modification module, wherein the modification module is used for receiving a modification operation instruction for modifying the metadata of the file system of which the snapshot has been generated, the metadata of the file system comprises at least one of directory metadata and file metadata of the file system, copying the metadata of the file system to be modified when writing, modifying the file descriptor of the file system, and recording the modification information into the tracking file of the snapshot of the file system; or the modification module is used for receiving a modification operation instruction for modifying the data of the file system which has generated the snapshot, the data of the file system comprises at least one of directory data and file data of the file system, the process of redirection is carried out when the data of the modified file system is written, then the data of the file system is modified, and the modification information is recorded into a tracking file of the snapshot of the file system.
With reference to the second aspect, in a sixth possible implementation manner of the second aspect: the device further comprises an increment acquisition module, wherein the increment acquisition module is used for acquiring an increment through inter-snapshot comparison, and the increment comprises an increment of data and an increment of metadata.
With reference to the sixth possible implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect: the data increment comprises an increment generated by modifying writes and an increment generated by appending writes, wherein the increment acquisition module acquires the increment generated by modifying writes by scanning the tracking file of the snapshot, and acquires the increment generated by appending writes by comparing the file descriptor of the file with the file descriptor of the corresponding snapshot.
With reference to the seventh possible implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect: the increment acquisition module obtains the metadata increment by comparing the generation time in the file descriptor of the snapshot with that in the file descriptor of the corresponding non-snapshot version, or by checking the flags in the file descriptor of the snapshot.
The beneficial effects of the invention are as follows: unlike the prior art, the snapshot management method and device provided by the invention return a monotonically increasing global latest snapshot number to the client as the snapshot identifier and add, on the server, a tracking file that records the change information of the snapshotted file system, and can thereby take a snapshot of any directory in a distributed file system that has no file layout and no global file descriptor structure table.
Drawings
FIG. 1 is a schematic diagram of a file system to which the snapshot management method of the present invention is applied;
FIG. 2 is another schematic diagram of the file system to which the snapshot management method of the present invention is applied;
FIG. 3 is a flow chart of a first embodiment of a snapshot management method of the present invention;
FIG. 4 is a diagram illustrating storage of file system inode in an embodiment of a snapshot management method according to the invention;
FIG. 5 is a process flow diagram of directory snapshot access in a second embodiment of a snapshot management method in accordance with the present invention;
FIG. 6 is a schematic view of an inode storage of a directory in a second embodiment of a snapshot management method according to the present invention;
FIG. 7 is a flow chart of a process for accessing a file snapshot in a second embodiment of a snapshot management method in accordance with the present invention;
FIG. 8 is a diagram illustrating snapshot of a file data block in a second embodiment of a snapshot management method according to the present invention;
FIG. 9 is a flow chart of snapshot deletion in a third embodiment of the snapshot management method of the present invention;
FIG. 10 is a flowchart of a sub-directory or sub-file movement process in a fourth embodiment of the snapshot management method of the present invention;
FIG. 11 is a diagram illustrating a file system according to a fifth embodiment of the snapshot management method of the present invention;
FIG. 12 is a flowchart of a process for accessing a file on a corresponding hard link according to a fifth embodiment of the snapshot management method of the present invention;
FIG. 13 is a flowchart of a snapshot rollback process in the sixth embodiment of the snapshot management method of the present invention;
FIG. 14 is a flowchart of a specific implementation, in the snapshot rollback process of the sixth embodiment of the snapshot management method of the present invention, of rolling back the files of the primary directory one by one in descending order of snapshot identifier until the snapshot to roll back to is reached;
FIG. 15 is a flow chart of data modification in a seventh embodiment of the snapshot management method of the present invention;
FIG. 16 is a schematic structural diagram of an embodiment of a snapshot processing apparatus according to the present invention;
FIG. 17 is a schematic structural diagram of another embodiment of the snapshot processing apparatus of the present invention;
FIG. 18 is a schematic structural diagram of a snapshot processing apparatus according to still another embodiment of the present invention.
Detailed Description
FIG. 1 and FIG. 2 are schematic structural diagrams of the file system to which the snapshot management method of the present invention is applied. The file system is a distributed file system. As shown in FIG. 1, the distributed file system includes a client (CA), a plurality of metadata servers (MDS), and a data server (DS), wherein the MDS is responsible for managing the metadata of the file system and the DS is responsible for managing the data.
A directory may include multiple subdirectories or subfiles, and each directory and its subdirectories or subfiles may be managed by different MDSs, which manage different directories or files, respectively, as shown in fig. 2. In the following embodiments, MDS and DS are collectively referred to as a server.
Referring to FIG. 3, which is a flowchart of a first embodiment of the snapshot management method of the present invention, the snapshot management method of this embodiment includes:
S101: the server receives a snapshot identifier request from the client, wherein the snapshot identifier identifies a snapshot generated by a file system of the client;
After the client generates a snapshot, it requests from the server, specifically the metadata server, a snapshot identifier for identifying the snapshot generated by the client's file system. The metadata server receives the snapshot identifier request from the client.
S102: generating the latest snapshot number of the whole file system, returning the latest snapshot number as a snapshot identifier to the client, and updating the file descriptor structure of the file system;
the metadata server generates a latest Snapshot number of the entire file system, for example, adds 1 to the current Snapshot number of the entire file system as the latest Snapshot number (Global Last Snapshot ID, GLS), and returns the latest Snapshot number as a Snapshot identifier to the client. The GLS in the file system is monotonically increasing, that is, after a snapshot is generated by any directory or file of the file system, the GLS adds 1 to the original base as a snapshot identifier of the currently generated snapshot (of course, the GLS may be monotonically increasing in other manners such as adding 2 and adding 3, which are preset by the client). For example, if the current GLS is snap12, if the current client generates a snapshot for a certain file or directory, and applies for a snapshot identifier from the metadata server, the metadata server adds 1 to the snap12, that is, returns the snapshot identifier to the client as the current snapshot through snap 13.
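The monotonically increasing GLS can be sketched as a small counter guarded by a lock. The class name, the `step` parameter, and the use of plain integers (13 rather than `snap13`) are assumptions made for illustration; the source only requires that the number never decreases.

```python
import threading


class SnapshotIdAllocator:
    """Hypothetical allocator for the Global Last Snapshot ID (GLS)."""

    def __init__(self, start: int = 0, step: int = 1) -> None:
        if step < 1:
            raise ValueError("step must be positive so the GLS only increases")
        self._gls = start
        self._step = step
        self._lock = threading.Lock()

    @property
    def gls(self) -> int:
        return self._gls

    def allocate(self) -> int:
        # Serialize requests so that two clients never get the same identifier.
        with self._lock:
            self._gls += self._step
            return self._gls


allocator = SnapshotIdAllocator(start=12)
first = allocator.allocate()
second = allocator.allocate()
```

Starting from a GLS of 12, the first request receives 13, matching the snap12 to snap13 example in the text.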
After returning the snapshot identifier to the client, the metadata server needs to update the file descriptor (inode) structure of the file system. The inode structure of the file system includes at least an identifier (FID) of the file system, a generation time (Birth_Moment) of the file system, a latest snapshot identifier (Current_SnapID) of the file system, flags (Flags) of the file system, and a snapshot linked list (Snap_List) of the file system; in addition, it may further include a renaming field (Rename) and a temporary field (Change_Moment). Here, the file system includes at least one of a directory and a file.
The functions of the fields are described as follows:
Birth_Moment: the global GLS value at the time the directory/file was created; when the entry is renamed or the inode is changed, it is updated to the GLS value at the time of the change;
Current_SnapID: records the snapshot identifier that currently manages the directory/file; if a new snapshot is taken of a directory/file and the file has not changed, Current_SnapID records the new snapshot identifier of the directory/file;
Change_Moment: a temporary field that records the current GLS value when the snapshot attribute is refreshed; after the snapshot version is generated, its value is assigned to Birth_Moment (updated when the snapshot version is refreshed);
Flags: indicates which operation (modification, deletion, or renaming) caused the snapshot version to be generated;
Rename: records the new name in a snapshot version that was generated because of a rename;
Snap_List: records the snapshot identifiers that have historically been taken on the inode; the information is collected by backtracking upward at modification time. This field can be stored in the extended attribute area of the inode, and an additional object can be used for storage when the extended attribute area is not large enough.
The metadata server updates the inode structure of the file system, specifically the inode structure of the snapshot root, and records the snapshot root information. The snapshot root is the directory or file to which the generated snapshot belongs. For example, Current_SnapID in the inode of the snapshot root is updated to the new snapshot identifier, the new identifier is recorded in Snap_List, and the current GLS of the file system is set equal to the latest snapshot identifier.
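The inode fields and the snapshot-root update can be summarized in a short sketch. The Python names (`SnapInode`, `record_snapshot`, and the lower-case attribute names) are hypothetical stand-ins for the fields listed above, and the field types are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SnapInode:
    """Illustrative inode layout; types and defaults are assumptions."""

    fid: int                                  # identifier of the directory/file
    birth_moment: int                         # GLS value at creation or last change
    current_snap_id: Optional[int] = None     # snapshot currently managing the inode
    flags: str = ""                           # why a snapshot version was generated
    snap_list: List[int] = field(default_factory=list)  # snapshots taken on the inode
    rename: Optional[str] = None              # new name, if generated by a rename
    change_moment: Optional[int] = None       # temporary GLS value during refresh

    def record_snapshot(self, snap_id: int) -> None:
        """Update the snapshot root after a new identifier has been issued."""
        self.current_snap_id = snap_id
        self.snap_list.append(snap_id)


# A directory created when the GLS was 5 becomes the root of snapshot 13.
root = SnapInode(fid=2, birth_moment=5)
root.record_snapshot(13)
print(root.current_snap_id, root.snap_list)  # 13 [13]
```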
Unlike a traditional file system, the distributed file system of the invention does not store an inode separately; instead, the inode is stored in the directory data object (entry list object) of its parent directory. For example, fig. 4 is a schematic view of inode storage in an embodiment of the snapshot management method of the present invention: fig. 4A shows a directory structure, and fig. 4B shows how inodes are stored under that structure. The inode of each file system is stored in the directory data object of its parent directory, and the directory data object of each file system stores the inodes of its subdirectories or subfiles together with the entries corresponding to those inodes.
S103: creating a tracking file corresponding to the snapshot generated by the file system, and recording the change information of the file system after the snapshot is generated by the tracking file;
After the client generates a snapshot for a file system, the metadata server generates a tracking file (Track file) corresponding to the snapshot; the tracking file records the change information of the file system after the snapshot is generated. For example, if file1 is modified or deleted, or file2 is created, after the snapshot snap22 is generated, these changes are recorded in the tracking file corresponding to snap22.
The Track file has the following main functions: 1) when rolling back, finding out which files need to be processed, namely finding out files which are modified, deleted and added; 2) when the snapshot is deleted, finding out which files need to be deleted, namely finding out modified and deleted files; 3) during incremental backup, the difference data between the two snapshots is found, namely, the files which are modified, deleted and added are found.
In the embodiment of the invention, for performance reasons, the Track file of a snapshot can be split into multiple sub-files, each bound to an MDS; each MDS updates its local Track sub-file after completing an operation. The object numbering rule for Track sub-files is snapshot identifier + metadata server identifier + flags, namely Snap ID + MDS ID + Flags. In addition, to simplify the design of the Track file, the Track file may record only which directories have changed, in which case the specific changes are obtained by scanning those directories.
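A minimal sketch of per-MDS Track sub-files follows. The class `TrackStore`, the tuple used as an object key, and the default flag value `"dir"` are assumptions; the text specifies only that the object number is composed of Snap ID + MDS ID + Flags and that each MDS updates its own sub-file.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical object key following the rule Snap ID + MDS ID + Flags.
TrackKey = Tuple[int, int, str]


class TrackStore:
    """Keeps one Track sub-file per (snapshot, MDS, flags) combination."""

    def __init__(self) -> None:
        self._subfiles: Dict[TrackKey, List[str]] = defaultdict(list)

    def record(self, snap_id: int, mds_id: int, changed_dir: str,
               flags: str = "dir") -> None:
        # Each MDS appends only to its own sub-file, so no cross-MDS
        # coordination is needed on the write path.
        self._subfiles[(snap_id, mds_id, flags)].append(changed_dir)

    def changed_dirs(self, snap_id: int) -> Set[str]:
        """Merge all sub-files of one snapshot into a set of changed directories."""
        merged: Set[str] = set()
        for (sid, _mds, _flags), dirs in self._subfiles.items():
            if sid == snap_id:
                merged.update(dirs)
        return merged


store = TrackStore()
store.record(22, 1, "/data/dir1")   # change handled by MDS 1
store.record(22, 2, "/junk")        # change handled by MDS 2
store.record(23, 1, "/tmp")         # belongs to a different snapshot
print(sorted(store.changed_dirs(22)))  # ['/data/dir1', '/junk']
```

Since only directory names are recorded here, a consumer such as rollback or incremental backup would still scan each listed directory to find the concrete changes.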
The above is an embodiment of the snapshot management method of the present invention, and it can be understood from the description of the above embodiment that, the snapshot management method of the present invention returns the latest snapshot number that is monotonically increased as the snapshot identifier to the client, and adds a new trace file that records the change information of the snapshot file system in the metadata server, thereby implementing any directory snapshot for the distributed file system without file layout and global file descriptor structure table, and facilitating the management of data.
In the second embodiment, for the snapshot that has been generated, the snapshot management method of the present invention may further receive a snapshot access request from the client, read snapshot data requested by the client, and return the snapshot data to the client. That is, after the steps of the first embodiment are executed, the snapshot management method of the present invention may further include snapshot access processing. In the present embodiment, a specific processing flow of snapshot access is further provided. The snapshot access request may be a directory snapshot access request or a file snapshot access request.
Referring to fig. 5, fig. 5 is a schematic view of a processing flow of directory snapshot access in a second embodiment of the snapshot management method according to the present invention, where the directory snapshot access processing in this embodiment includes the following steps:
S201: finding the directory data object of the directory snapshot and the directory data object of the directory according to the snapshot access request, where a directory data object includes inodes and the entries corresponding to the inodes;
The directory snapshot access request includes the snapshot identifier and the directory name requested for access.
S202: searching the directory data object of the directory snapshot, reading the entries corresponding to inodes whose generation time is less than the snapshot identifier requested for access and whose snapshot identifier is greater than the snapshot identifier requested for access, and returning the entries to the client;
S203: searching the directory data object of the directory, reading the entries corresponding to inodes whose generation time is less than or equal to the snapshot identifier requested for access, and returning the entries to the client;
Taking the directory structure shown in fig. 4A as an example, assume that snapshot snap81 is generated for directory 2, file2 under subdirectory 1 is then modified, snapshot snap92 is then generated for directory 2, and afterwards the 2nd stripe of file3 under subdirectory 1 is modified and file4 is created under subdirectory 1.
Referring to fig. 6, assuming that the data of snap92 in the /data/dir1 directory is to be read, the specific processing flow is as follows:
1) find the directory data object of the /data/dir1 snapshot and search it entry by entry;
2) when the File2@snap81 entry is found, it is a snapshot version whose Birth_Moment is smaller than snap92 and whose snapshot version number is also smaller than snap92 (generated before snap92); it belongs to a version earlier than the snap92 snapshot and is skipped;
3) when the File3@snap92 entry is found, its snapshot number is equal to snap92 and its Birth_Moment is smaller than snap92 (File3 was modified after snap92), so it is the snap92 snapshot version and is read;
4) search the directory data object of /data/dir1 entry by entry;
5) when the File2 entry is found, its Birth_Moment is smaller than snap92, which indicates that the entry has not been modified since snap92 was taken, so it is read;
6) when the File3 entry is found, its Birth_Moment is greater than snap92 (because it was modified after snap92), so it is skipped;
7) when the File4 entry is found, its Birth_Moment is greater than snap92 (because it was created after snap92), so it is skipped.
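The walkthrough above can be expressed as two filters over the entry lists. This is a sketch under stated assumptions: `Entry`, `read_directory_at`, and the attribute `birth_moment` (standing for the generation time field of the inode) are hypothetical names, and the numeric generation times are illustrative values chosen only to agree with the comparisons in the walkthrough. A snapshot entry whose snapshot number equals the requested one is treated as readable, as in step 3.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    name: str
    birth_moment: int              # generation time, expressed as a GLS value
    snap_id: Optional[int] = None  # set only for snapshot-version entries


def read_directory_at(snapshot_entries: List[Entry],
                      live_entries: List[Entry],
                      snap: int) -> List[str]:
    """Return the entry names visible in snapshot `snap` of a directory."""
    visible: List[str] = []
    # Pass 1: snapshot versions that existed before `snap` and were
    # preserved by `snap` or by a later snapshot.
    for e in snapshot_entries:
        if e.snap_id is not None and e.birth_moment < snap and e.snap_id >= snap:
            visible.append(f"{e.name}@snap{e.snap_id}")
    # Pass 2: live entries that have not changed since `snap` was taken.
    for e in live_entries:
        if e.birth_moment <= snap:
            visible.append(e.name)
    return visible


# Illustrative state after snap81, a change to File2, snap92, a change
# to File3, and the creation of File4 (generation times are assumed).
snapshot_entries = [
    Entry("File2", birth_moment=80, snap_id=81),
    Entry("File3", birth_moment=80, snap_id=92),
]
live_entries = [
    Entry("File2", birth_moment=85),
    Entry("File3", birth_moment=93),
    Entry("File4", birth_moment=93),
]
result = read_directory_at(snapshot_entries, live_entries, snap=92)
print(result)  # ['File3@snap92', 'File2']
```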
Referring to fig. 7, fig. 7 is a flowchart of file snapshot access processing in the second embodiment of the snapshot management method according to the present invention. Because data blocks use redirect-on-write (ROW), the data objects of a file snapshot need to be searched from the requested snapshot version toward the latest version, in the order in which the snapshots were generated. The file snapshot access processing in this embodiment includes the following steps:
S301: determining the data blocks to be read according to the size of the snapshot requested by the snapshot access request;
When the snapshot access request is a file snapshot access request, it includes the snapshot identifier and the file name requested for access. The server, specifically a data server, receives the client's file snapshot access request and calculates the data blocks to be read according to the size of the snapshot requested by the request.
S302: for each data block to be read, searching for the data block starting from the requested snapshot and proceeding to later snapshots in the order in which the snapshots were generated, until the data block is read;
S303: returning all the data blocks that have been read to the client.
Referring to fig. 8, fig. 8 is a schematic diagram of snapshot processing of file data blocks in this embodiment. Assuming that the data of snap0 is to be read, the size of snap0 must first be known; according to the size, four data blocks K1 to K4 need to be read, and the server reads them as follows:
1) reading K1.snap0, not present, reading K1.snap1, not present, reading K1.snap2, not present, reading K1;
2) read K2.snap0, present;
3) reading K3.snap0, not existing, reading K3.snap1, existing;
4) read K4.snap0, not present, read K4.snap1, not present, read K4.snap2, present.
Finally, the read K1, K2.snap0, K3.snap1, and K4.snap2 are returned to the client.
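The lookup in S302 can be sketched as a forward search through the snapshot chain. The function names and the representation of the object store as a set of object names are assumptions; the `K1.snap0` naming follows the example of fig. 8, and the live object is assumed to exist when no snapshot copy is found.

```python
from typing import List, Set


def locate_block(objects: Set[str], block: str, snap: int,
                 snapshot_order: List[int]) -> str:
    """Find the object that holds `block` as it was when `snap` was taken."""
    start = snapshot_order.index(snap)
    # Search from the requested snapshot toward newer snapshots.
    for later in snapshot_order[start:]:
        name = f"{block}.snap{later}"
        if name in objects:
            return name
    # No snapshot copy exists: the block has not changed since `snap`,
    # so the live object is the right one.
    return block


def read_snapshot(objects: Set[str], blocks: List[str], snap: int,
                  snapshot_order: List[int]) -> List[str]:
    return [locate_block(objects, b, snap, snapshot_order) for b in blocks]


# Object layout matching the example of fig. 8.
objects = {"K1", "K2", "K2.snap0", "K3", "K3.snap1", "K4", "K4.snap2"}
found = read_snapshot(objects, ["K1", "K2", "K3", "K4"], snap=0,
                      snapshot_order=[0, 1, 2])
print(found)  # ['K1', 'K2.snap0', 'K3.snap1', 'K4.snap2']
```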
In the third embodiment of the present invention, for a snapshot that has already been generated, the snapshot management method of the present invention may further receive a snapshot deletion request from the client and delete the snapshot specified by the request. That is, after the steps of the first embodiment of the present invention are executed, the snapshot management method of the present invention may further respond to the snapshot deletion request of the client by deleting the snapshot. In the present embodiment, a specific processing flow of snapshot deletion is further provided.
Referring to fig. 9, fig. 9 is a flowchart of snapshot deletion in a third embodiment of the snapshot management method according to the present invention, where the snapshot deletion in this embodiment includes the following steps:
S401: searching for the tracking file of the snapshot;
S402: checking whether the snapshot requested to be deleted has a previous snapshot;
S403: determining whether the snapshot requested to be deleted has data shared with the previous snapshot;
S404: copying the shared data to the previous snapshot, and recording the change information in the tracking file corresponding to the previous snapshot;
S405: deleting the snapshot requested to be deleted and the corresponding tracking file.
In a specific processing process, the server receives a snapshot deletion request from the client, where the request includes the snapshot identifier of the snapshot to be deleted. The server searches the Track files of all snapshots to find the snapshot to be deleted and checks whether it has a previous snapshot version; the previous snapshot version and its snapshot number can be obtained from the snapshot linked list recorded in the inode of the snapshot to be deleted. If a previous snapshot version exists, the server further determines whether the snapshot to be deleted is a file snapshot or a directory snapshot: a file snapshot is handled by the file snapshot deletion flow, and a directory snapshot is handled by the directory snapshot deletion flow. If no previous snapshot version exists, the snapshot is deleted directly.
The processing flow of deleting a file snapshot mainly comprises the following steps: determine whether the stripe data of the snapshot to be deleted exists in the previous snapshot version; if not, copy the stripe data of the snapshot to be deleted to the previous snapshot version and record the change information in the Track file corresponding to the previous snapshot. If the previous snapshot version already has the stripe data, directly delete the snapshot to be deleted and its corresponding Track file.
The processing flow of deleting a directory snapshot mainly comprises the following steps: search all entries of the directory snapshot to be deleted; when a snapshot entry copy of the previous snapshot version exists, delete the entry of the directory snapshot to be deleted; when no such copy exists, generate the snapshot entry of the previous version by copy-on-write (COW). After the entries of the directory have been processed, the directory itself is processed: determine whether the previous snapshot version of the directory itself exists; if so, delete the snapshot; if not, generate the entry of the directory by COW and add the directory information of the directory to the Track file of the previous snapshot version.
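The file snapshot deletion flow can be sketched as follows. The function name, the dictionary-based object store, and the representation of Track files as lists of stripe names are assumptions; stripe objects are named `<stripe>.snap<id>` as in the read example of fig. 8.

```python
from typing import Dict, List, Optional


def delete_file_snapshot(objects: Dict[str, bytes],
                         track: Dict[int, List[str]],
                         stripes: List[str],
                         snap: int,
                         prev: Optional[int]) -> None:
    """Delete file snapshot `snap`, handing shared stripes to snapshot `prev`."""
    for stripe in stripes:
        doomed = f"{stripe}.snap{snap}"
        if doomed not in objects:
            continue  # this snapshot holds no copy of the stripe
        if prev is not None:
            inherited = f"{stripe}.snap{prev}"
            if inherited not in objects:
                # The previous snapshot was relying on this copy: move the
                # data over and note the change in its Track file.
                objects[inherited] = objects[doomed]
                track.setdefault(prev, []).append(stripe)
        del objects[doomed]
    track.pop(snap, None)  # finally drop the Track file of the deleted snapshot


objects = {"K2.snap0": b"a", "K2.snap1": b"b", "K3.snap1": b"c", "K3": b"d"}
track = {0: ["K2"], 1: ["K2", "K3"]}
delete_file_snapshot(objects, track, ["K2", "K3"], snap=1, prev=0)
print(sorted(objects))  # ['K2.snap0', 'K3', 'K3.snap0']
print(track)            # {0: ['K2', 'K3']}
```

In this example snap0 already owns a copy of K2, so `K2.snap1` is simply removed, while `K3.snap1` was the only preserved copy visible to snap0 and is therefore handed down before removal.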
The snapshot management method of the present invention may further include obtaining an increment by comparing the generated snapshots. Wherein the delta may include a delta of data and a delta of metadata.
Snapshots are compared by superimposing the increments between adjacent snapshots. For example, to compare the increment between <snap1, snap7>, the three increment comparisons <snap1, snap3>, <snap3, snap5>, and <snap5, snap7> are superimposed. During increment comparison, three kinds of increment need to be identified: modification, deletion, and creation. The increment is divided into data increment and metadata increment: the data increment is mainly generated by modifying writes and appending writes to file data, and the metadata increment is mainly generated by modifying inodes or entries. The increment acquisition process is accordingly divided into the following two parts:
1. Data increment
1) The increment generated by modifying writes: which stripes were modified can be obtained by scanning the Track file of the snapshot;
2) The increment generated by appending writes can be obtained by comparing the size attributes of the inode and the snapshot-version inode.
2. Metadata increment
1) Creation: by comparing the generation time of the non-snapshot-version inode, it can be determined whether the inode was created on the basis of the current snapshot; for example, for the increment between the snapshots <N, N1>, check whether the generation time falls between the generation times of the snapshots <N, N1>;
2) Deletion: by checking the Flags in the snapshot-version inode, it can be determined that the snapshot version was generated by a deletion;
3) Modification (inode modification or entry renaming): by checking the Flags in the snapshot-version inode, it can be determined that the snapshot version was generated by a modification.
In a fourth embodiment, the file system of the present invention includes a first directory and a second directory, where the first directory and the second directory both generate snapshots and perform all the steps of the first embodiment of the present invention for both directories, and after performing the steps, the snapshot management method of the present invention further includes a step of receiving a move instruction from a client and moving data in the first directory to the second directory. The data in the first directory includes at least one of a sub-directory and a sub-file in the first directory, and in this embodiment, a specific processing flow for moving the data in the first directory is further provided.
Referring to fig. 10, fig. 10 is a flowchart illustrating a data movement process under a first directory in a fourth embodiment of the snapshot management method according to the present invention, where the data movement process under the first directory in the present embodiment includes:
S501: deleting the entry of the data from the first directory;
S502: searching upward for all snapshot information, and recording all the snapshot information found in the snapshot linked list of the inode of the data;
S503: generating a snapshot of the entry of the data in the directory data object of the first directory, and recording the change information of the first directory in the corresponding tracking file, where the directory data object of the first directory includes the inode of the first directory and the entry corresponding to the inode;
S504: copying the entry of the data under the second directory, and collecting all snapshot information upward to update the snapshot linked list of the inode of the data and the generation time of the data;
S505: recording the change information of the second directory in the tracking file corresponding to the snapshot of the second directory.
The deletion under the first directory and the copy to the second directory need not be executed in a strict order; to improve processing efficiency, the two actions can be executed in parallel.
The data moving process under the first directory is described in detail below with a specific example based on the directory structure shown in fig. 4A. Directory 2 (data) serves as the first directory of this embodiment and has generated snapshot snap81; directory 1 (junk) serves as the second directory and has generated snapshot snap92. Assuming that subdirectory 1 (dir1) of data is to be moved under the junk directory, the specific processing steps are as follows:
1) the first directory /data deletes the dir1 entry;
2) the snapshot information of all parent directories is searched upward and recorded in the snapshot linked list of /data/dir1; the only snapshot found upward along /data is snap81, so the snapshot information of snap81 is recorded in the snapshot linked list of /data/dir1;
3) a snapshot version of the dir1 entry, the dir1@snap81 entry, is generated in the directory data object of the directory /data as shown in fig. 6, and the directory information of /data is recorded in the Track file of snap81;
4) the entry of the directory dir1 is copied under the directory /junk while snapshot information is collected upward, and Snap_List (81, 92) and the generation time in the inode of dir1 are updated; because a new entry is created under the directory /junk, the change information of the junk directory is recorded in the Track file of snap92. In fact, the dir1@snap81 entry under the original directory and dir1 under the new directory share the same directory data object.
Subsequent modifications under the directory /junk/dir1 are still protected by the snapshot version snap81 of the original path when snapshots are looked up (since dir1 is a special directory, namely the root of the move). For example, when file2 is deleted, tracing back upward finds that dir1 is a move root, and the generation time of dir1 is first compared with the maximum snapshot number on the current path. If the generation time of dir1 is greater than the maximum snapshot number on the current path, no newer snapshot has been taken since dir1 was moved and the current modification is protected by snap81, so a snapshot version file2@snap81 entry is generated.
In a fifth embodiment, the file system of the present invention includes a hard-linked file having a plurality of parent directories, each of the plurality of parent directories having generated a snapshot. The snapshot management method of the invention also comprises the step of recording the relationship between the hard link file and the parent directory of the hard link file through the birth table for the hard link file.
A hard-linked file has multiple parents. In the file system structure shown in fig. 11, for example, /junk/hl2 is a hard link to /data/dir1/file2, and /tmp/hl1 is also a hard link to /data/dir1/file2; snapshot snap92 has been taken on /junk, snapshot snap79 on /tmp, and snapshot snap81 on /data.
For hard-linked files, the server records the relationship between the file identifier (fid) and the fids of the parent directories in a birth table. That is, all parent directory fids of file2 can be obtained by looking up the fid of file2, and the original parent directory (numbered 0) can be found according to the number of the parent directory.
Since a hard link does not have its own inode, it shares one inode with its linked file, and when accessing a file on the hard link, it needs to consider whether there is snapshot management on its parent directory path. Therefore, this embodiment further provides a processing method for accessing a file on a hard link, as shown in fig. 12, where fig. 12 is a flowchart of processing corresponding to accessing a file on a hard link in a fifth embodiment of the snapshot management method of the present invention, and the processing method for accessing a file on a hard link in this embodiment includes:
S601: receiving a request for accessing the hard-linked file through a specified path, and searching for all parent directories of the hard-linked file according to the birth table;
S602: generating all paths through which the parent directories access the hard-linked file, and the snapshots corresponding to all the paths.
For example, as shown in fig. 11, if file2 is accessed through /tmp/hl1, all parent directories of file2, such as dir1 and junk, need to be found by looking up the table; each parent directory path is then traced back to check whether snapshot management exists on it, that is, the directory paths /tmp, /data/dir1, and /junk are traced back, and the corresponding snapshot versions of all paths are generated, because the data of file2 is managed by the snapshots snap79, snap81, and snap92 simultaneously.
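The birth table lookup and the upward trace can be sketched as follows. For readability the sketch keys parent directories by path rather than by fid, which is an assumption, as are the function name and the dictionary layout; index 0 of each list stands for the original parent directory.

```python
import posixpath
from typing import Dict, List, Set


def snapshots_protecting(fid: int,
                         birth_table: Dict[int, List[str]],
                         snapshots_on_dir: Dict[str, List[int]]) -> Set[int]:
    """Collect every snapshot taken on any path leading to a hard-linked file."""
    found: Set[int] = set()
    for parent in birth_table[fid]:
        path = parent
        # Walk from the parent directory up to the root, gathering the
        # snapshots taken on each directory along the way.
        while True:
            found.update(snapshots_on_dir.get(path, []))
            if path == "/":
                break
            path = posixpath.dirname(path)
    return found


# The layout of fig. 11: file2 lives in /data/dir1 (original parent,
# number 0) and is hard-linked from /junk and /tmp.
birth_table = {2: ["/data/dir1", "/junk", "/tmp"]}
snapshots_on_dir = {"/data": [81], "/junk": [92], "/tmp": [79]}
protecting = snapshots_protecting(2, birth_table, snapshots_on_dir)
print(sorted(protecting))  # [79, 81, 92]
```

Whichever path is used to reach file2, a modification therefore has to preserve a version for each of the three snapshots.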
Considering that first-level directories are shared by many users, some first-level directories in the system are configured to support the rollback operation. When a first-level directory is configured, it is necessary to check whether the first-level directory has already generated snapshots; if snapshots have been generated, the configuration fails.
In a sixth embodiment of the present invention, the snapshot management method of the present invention further includes a snapshot rollback process, that is, after all the steps of the first embodiment of the present invention are performed, the snapshot management method of the present invention further includes a snapshot rollback in response to a snapshot rollback request of the client. In this embodiment, a specific processing flow of snapshot rollback is further provided.
When the rollback operation of the first-level directory is performed, only the rollback operation of the first-level directory needs to be processed because the snapshot nesting of the subdirectories does not exist. Referring to fig. 13, fig. 13 is a flowchart illustrating a snapshot rollback process in a sixth embodiment of the snapshot management method according to the present invention, where the snapshot rollback process in the present embodiment includes the following steps:
S701: receiving a first-level directory rollback instruction, locking the first-level directory that the instruction requires to be rolled back, and determining the snapshot to which the instruction requires rolling back;
The first-level directory rollback instruction includes the snapshot identifier of the snapshot to roll back to. The first-level directory to be rolled back is locked according to the rollback instruction, and no business operation is allowed on it.
S702: checking a snapshot linked list of a first-level directory to which the instruction needs to be rolled back, and finding out all snapshots after the snapshot to which the instruction needs to be rolled back;
The snapshot linked list of the first-level directory to be rolled back is examined to find all snapshots taken after the snapshot to which the instruction requires rolling back. For example, suppose the first-level directory has the snapshots N, N1, N2, ..., M (M being the latest snapshot) and the instruction requires rolling back to N1; the snapshots after N1 are then N2, ..., M.
S703: and rolling back the files of the primary directory one by one from large to small according to the snapshot identification of the snapshot until the files are rolled back to the snapshot to which the instruction needs to be rolled back.
That is, the files of the first-level directory are rolled back one snapshot at a time in descending order of snapshot identifier: first to M, and then one by one until N1 is reached.
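The order in which S702 and S703 visit the snapshots can be sketched as a small helper. The function name `rollback_sequence` is hypothetical, and the sketch covers only the ordering, not the per-snapshot rollback work.

```python
from typing import List


def rollback_sequence(snap_list: List[int], target: int) -> List[int]:
    """Return the snapshots to roll back through, newest first, ending at `target`."""
    if target not in snap_list:
        raise ValueError(f"snapshot {target} is not in the snapshot linked list")
    # S702: all snapshots taken after the target.
    later = [s for s in snap_list if s > target]
    # S703: process them from the largest identifier down, then the target itself.
    return sorted(later, reverse=True) + [target]


# Snapshot linked list N, N1, N2, M = 10, 11, 12, 15 with rollback target N1 = 11.
sequence = rollback_sequence([10, 11, 12, 15], target=11)
print(sequence)  # [15, 12, 11]
```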
Referring to fig. 14, fig. 14 is a flowchart of a specific implementation of the embodiment of rolling back files of the primary directory one by one from large to small according to snapshot identifiers of snapshots until the files are rolled back to a snapshot to which an instruction needs to be rolled back, and the flowchart includes the following steps:
S7011: finding the snapshot with the maximum snapshot identifier as the snapshot currently to be rolled back, and scanning the tracking file of the snapshot with the maximum snapshot identifier;
After all snapshots taken after the snapshot to which the instruction requires rolling back have been found, the snapshot with the maximum snapshot identifier is selected from them. Taking the snapshots N2, ..., M found above as an example, the snapshot with the maximum snapshot identifier is M, and M is taken as the snapshot currently to be rolled back (because rollback proceeds one snapshot at a time from the newest to the oldest). The tracking file of the snapshot currently to be rolled back is then scanned.
S7012: judging whether the snapshot identified by the maximum snapshot is a file snapshot or a directory snapshot;
When the snapshot with the maximum snapshot identifier is determined to be a file snapshot, the file snapshot rollback operation is executed, namely steps S7013 to S7014; when it is determined to be a directory snapshot, the directory snapshot rollback operation is executed, namely steps S7015 to S7020. After the file snapshot rollback operation or the directory snapshot rollback operation is performed, execution proceeds to S7021.
S7013: deleting the strip object of the snapshot identified by the maximum snapshot, and renaming the corresponding strip object in the file of the primary directory as the strip object of the snapshot identified by the maximum snapshot;
When the snapshot M is a file snapshot, the stripe object of M is deleted, and the corresponding stripe object in the file of the first-level directory is renamed to be the stripe object of M.
S7014: deleting the trace file of the snapshot identified by the maximum snapshot;
The tracking file corresponding to the snapshot M is deleted.
S7015: scanning a directory data object of the first-level directory snapshot;
When the snapshot M is a directory snapshot, the directory data object of the first-level directory snapshot is scanned.
S7016: finding the snapshot with the same snapshot identifier as the maximum snapshot identifier;
A snapshot whose snapshot identifier is the same as that of M is found in the directory data object, which indicates that M is an exclusive snapshot, and S7017 is executed.
S7017: whether the inode of the snapshot identified by the maximum snapshot changes relative to the inode in the current version of the primary directory;
S7019 is executed when the inode of snapshot M has changed relative to the inode in the current version of the first-level directory; otherwise S7018 is executed.
S7018: copying the snapshot entry of the snapshot with the maximum snapshot identifier to the directory data object of the current version of the first-level directory;
S7019: restoring the current version of the first-level directory to the size of the snapshot with the maximum snapshot identifier;
S7020: scanning the directory data object of the current version of the first-level directory, and deleting the entries whose generation time is greater than the maximum snapshot identifier;
S7021: determining whether the snapshot with the maximum snapshot identifier is the snapshot to which the instruction requires rolling back;
When the snapshot with the maximum snapshot identifier is the snapshot to which the instruction requires rolling back, step S7022 is executed; otherwise execution returns to step S7012 and continues from there. That is, it is determined whether M is the snapshot to which the instruction requires rolling back. If it is, this rollback is complete and the rollback operation ends. If it is not, rollback continues one snapshot at a time until the snapshot to which the instruction requires rolling back is reached.
S7022: the rollback operation is ended.
In the seventh embodiment of the snapshot management method of the present invention, the snapshot management method of the present invention may further modify data of the file system that has generated the snapshot. That is, after all the steps of the first embodiment of the snapshot management method of the present invention are executed, a modification operation instruction for modifying the data of the file system that has generated the snapshot may be further received, and the data is modified according to the modification operation instruction.
The modification operation instruction comprises a metadata modification operation instruction and a file data modification operation instruction. The present embodiment further provides a specific implementation flow for data modification.
Referring to fig. 15, fig. 15 is a flowchart of data modification in the present embodiment, where the data modification includes the following steps:
S801: receiving a modification operation instruction;
S802: judging whether the modification operation instruction is a metadata modification operation instruction or a file data modification operation instruction;
Step S803 is executed when the modification operation instruction is judged to be a metadata modification operation instruction, and step S804 is executed when it is judged to be a file data modification operation instruction.
S803: performing copy-on-write (COW) on the metadata of the file system to be modified, modifying the inode of the file system, and recording the modification information in the tracking file of the snapshot of the file system;
S804: performing redirect-on-write (ROW) on the data of the file system to be modified, then modifying the data of the file system, and recording the modification information in the tracking file of the target file snapshot.
If a modification operation instruction is received for data of a file system that has not generated a snapshot, the data or the inode is modified directly. Because no snapshot has been generated for that data, the modification cannot affect data shared with a snapshot.
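The dispatch in steps S801 to S804 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: `FileObject`, `handle_modify`, the `op` dictionary layout, and the in-memory `track` list standing in for the tracking file are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileObject:
    """Hypothetical in-memory stand-in for a file and its snapshot state."""
    inode: dict
    blocks: dict                                      # block index -> bytes (current version)
    has_snapshot: bool = False
    snap_inode: Optional[dict] = None                 # metadata preserved for the snapshot
    snap_blocks: dict = field(default_factory=dict)   # blocks still owned by the snapshot
    track: list = field(default_factory=list)         # stand-in for the tracking file


def handle_modify(f: FileObject, op: dict) -> None:
    """Dispatch one modification instruction (S801, S802)."""
    if not f.has_snapshot:
        # No snapshot shares this data, so it is modified in place.
        if op["kind"] == "metadata":
            f.inode.update(op["attrs"])
        else:
            f.blocks[op["index"]] = op["data"]
        return

    if op["kind"] == "metadata":
        # S803: copy-on-write the metadata, then modify the live inode.
        if f.snap_inode is None:
            f.snap_inode = dict(f.inode)
        f.inode.update(op["attrs"])
        f.track.append(("metadata", tuple(sorted(op["attrs"]))))
    else:
        # S804: redirect-on-write. The old block stays with the snapshot
        # and the new data becomes the current version of that block.
        idx = op["index"]
        if idx in f.blocks and idx not in f.snap_blocks:
            f.snap_blocks[idx] = f.blocks[idx]
        f.blocks[idx] = op["data"]
        f.track.append(("data", idx))
```

In this sketch every change made after a snapshot leaves a record in `track`, which is what later allows rollback, deletion, and incremental comparison to find the affected objects without scanning the whole file system.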
The above embodiments may be combined. That is, on the basis of the first embodiment, any two or more of the following functions may be implemented together: snapshot data access processing, snapshot deletion processing, snapshot increment comparison, move operation processing, hard link access processing, rollback processing, and data modification processing.
Referring to fig. 16, fig. 16 is a schematic structural diagram of an embodiment of a snapshot processing apparatus according to the present invention, where the snapshot processing apparatus 100 of the embodiment includes a receiving module 11, a returning module 12, and a creating module 13, where:
the receiving module 11 is configured to receive a snapshot identifier application from a client, where the snapshot identifier is used to identify a snapshot generated by a file system of the client;
after the client generates the snapshot, the client applies for a snapshot identifier for identifying the snapshot generated by the client to the server. The receiving module 11 receives a snapshot identification application from the client.
The returning module 12 is configured to generate a latest snapshot number of the entire file system in response to the snapshot identifier application received by the receiving module 11, return the latest snapshot number as a snapshot identifier to the client, and update an inode structure of the file system, where the inode structure of the file system at least includes an identifier of the file system, a generation time of the file system, the latest snapshot identifier of the file system, a format of the file system, and a snapshot linked list of the file system;
The returning module 12 generates the latest snapshot number of the entire file system, for example by adding 1 to the current latest snapshot number (Global Last Snapshot ID, GLS), and returns the result to the client as the snapshot identifier. The GLS of the file system increases monotonically: whenever any directory or file of the file system generates a snapshot, the GLS is incremented by 1 and the new value becomes the identifier of that snapshot (the GLS may also increase monotonically in other ways, such as in steps of 2 or 3, as preset by the client). For example, if the current GLS is snap12 and the client generates a snapshot for a certain file or directory and applies to the metadata server for a snapshot identifier, the metadata server adds 1 to snap12 and returns snap13 to the client as the identifier of the current snapshot.
After returning the snapshot identifier to the client, the metadata server needs to update the inode (file descriptor) structure of the file system. The inode structure of the file system includes at least an identifier (FID) of the file system, a generation time (byte_movement) of the file system, a latest snapshot identifier (Current_snapshot id) of the file system, flags (Flags) of the file system, and a snapshot linked list (Snap_List) of the file system. In addition, the inode structure may further include a rename field (Rename) and a temporary field (Change_movement). The file system includes at least one of a directory and a file.
Updating the inode structure of the file system specifically means updating the inode structure of the snapshot root to record the snapshot root information. The snapshot root is the directory or file to which the generated snapshot belongs. For example, Current_snapshot id in the inode of the snapshot root is updated to the new snapshot identifier, the identifier is recorded in the snapshot list, and the current GLS of the file system equals the latest snapshot identifier.
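The identifier allocation and snapshot root update described above can be sketched as follows. This is a hedged Python illustration only: the class names `MetadataServer` and `SnapshotRootInode`, the attribute names, and the `step` parameter are assumptions made for the example and do not come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotRootInode:
    """Hypothetical subset of the inode fields listed above."""
    fid: int
    current_snapshot_id: int = 0
    snap_list: list = field(default_factory=list)


class MetadataServer:
    """Keeps one file-system-wide counter, the Global Last Snapshot ID (GLS)."""

    def __init__(self, step: int = 1):
        if step < 1:
            raise ValueError("step must be positive so that the GLS only grows")
        self.gls = 0
        self.step = step

    def allocate_snapshot_id(self, root: SnapshotRootInode) -> int:
        """Advance the GLS, record the new ID in the snapshot root, return it."""
        self.gls += self.step
        root.current_snapshot_id = self.gls
        root.snap_list.append(self.gls)
        return self.gls
```

Because the counter is shared by the whole file system, snapshots taken on different directories or files still receive identifiers in a single global order, which is what lets identifiers be compared with generation times elsewhere in the description.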
The creating module 13 is configured to create a trace file corresponding to the snapshot generated by the file system, and record change information of the file system after the snapshot is generated in the trace file.
After the client generates a snapshot for a file or a directory, the creating module 13 generates a tracking file (Track file) corresponding to the snapshot. The tracking file records the change information of the file system after the snapshot is generated. For example, if file1 is modified or deleted, or file2 is created, after snapshot snap22 is generated, these changes are recorded in the tracking file corresponding to snap22.
The Track file has the following main functions: 1) when rolling back, finding out which files need to be processed, namely finding out files which are modified, deleted and added; 2) when the snapshot is deleted, finding out which files need to be deleted, namely finding out modified and deleted files; 3) during incremental backup, the difference data between the two snapshots is found, namely, the files which are modified, deleted and added are found.
In the embodiment of the invention, for performance reasons, the Track file of a snapshot can be split into a plurality of sub-files, each bound to a metadata server (MDS). Each MDS updates its local Track file after completing an operation. The object numbers of the Track sub-files are allocated as snapshot identifier + metadata server identifier + flags, namely Snap ID + MDS ID + Flags. In addition, to simplify the design of the Track file, the Track file may record only the change information of the directory file, in which case the specific changes have to be obtained by scanning the directory.
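The object numbering rule for Track sub-files can be illustrated with a short Python sketch. The source only states the order of the three components; the hexadecimal encoding and the field widths (8, 4, and 2 digits) used here are assumptions for illustration, as are the function names.

```python
from typing import List


def track_object_id(snap_id: int, mds_id: int, flags: int) -> str:
    """Build an object number as Snap ID + MDS ID + Flags.

    The fixed hexadecimal widths are an assumption of this sketch.
    """
    if not (0 <= snap_id < 16**8 and 0 <= mds_id < 16**4 and 0 <= flags < 16**2):
        raise ValueError("field out of range for the assumed widths")
    return f"{snap_id:08x}{mds_id:04x}{flags:02x}"


def track_objects_for_snapshot(snap_id: int, mds_ids: List[int], flags: int = 0) -> List[str]:
    """One Track sub-file per metadata server for a given snapshot."""
    return [track_object_id(snap_id, m, flags) for m in mds_ids]
```

Placing the snapshot identifier first means that all sub-files of one snapshot share a common prefix, so they can be located together when the snapshot is rolled back or deleted.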
Referring to fig. 17, fig. 17 is a schematic structural diagram of another embodiment of a snapshot processing apparatus in the present invention, a snapshot processing apparatus 200 in the present embodiment includes a receiving module 21, a returning module 22, and a creating module 23, and further includes a snapshot access processing module 24, where:
the receiving module 21 is configured to receive a snapshot identifier application from a client, where the snapshot identifier is used to identify a snapshot generated by a file system of the client;
the returning module 22 is configured to respond to the snapshot identifier application received by the receiving module 21, generate a latest snapshot number, and return the latest snapshot number as a snapshot identifier to the client;
the creating module 23 is configured to create a tracking file corresponding to the generated snapshot, where the tracking file records change information of a file system after the snapshot is generated;
for the specific implementation of the functions of the receiving module 21, the returning module 22 and the creating module 23, please refer to the detailed description of the embodiment shown in fig. 16, which is not described herein again.
The snapshot access processing module 24 is configured to receive a snapshot access request from a client, read snapshot data requested by the client, and return the snapshot data to the client, where the snapshot access request is a directory snapshot access request or a file snapshot access request.
When the snapshot access request is a directory snapshot access request, the snapshot access request includes the snapshot identifier and the directory name requested to be accessed. The snapshot access processing module 24 is configured to find, according to the snapshot access request, the directory data object of the directory snapshot and the directory data object of the directory, where a directory data object includes an inode and the record items corresponding to the inode. The module searches the directory data object of the directory snapshot item by item, reads the record items corresponding to inodes whose generation time is less than the requested snapshot identifier and whose snapshot identifier is greater than the requested snapshot identifier, and returns those record items to the client. It also searches the directory data object of the directory item by item and reads the record items corresponding to inodes whose generation time is less than or equal to the requested snapshot identifier.
When the snapshot access request is a file snapshot access request, the snapshot access request includes the snapshot identifier and the file name requested to be accessed. The snapshot access processing module 24 is configured to determine the data blocks that need to be read according to the size of the requested snapshot. For each such data block, the module searches the snapshots starting from the current snapshot and proceeding in the order in which the snapshots were generated, until the data block is read. It then returns all of the data blocks read to the client.
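The file snapshot read can be sketched in Python as follows. This sketch rests on two assumptions that the source does not state explicitly: that the search runs from the requested snapshot toward later snapshots, and that a block never overwritten after the requested snapshot is still held by the current version of the file. The function name and the dictionary layouts are hypothetical.

```python
def read_file_snapshot(snap_id, snap_size, block_size,
                       snapshot_chain, snap_blocks, current_blocks):
    """Return the file content as it was at snapshot `snap_id`.

    snapshot_chain: snapshot IDs of the file in generation (ascending) order.
    snap_blocks:    {snapshot ID: {block index: bytes}}, blocks owned by each snapshot.
    current_blocks: {block index: bytes}, blocks of the current file version.
    """
    # The snapshot size determines which blocks need to be read.
    n_blocks = (snap_size + block_size - 1) // block_size
    start = snapshot_chain.index(snap_id)
    out = []
    for idx in range(n_blocks):
        data = None
        # Assumed direction: requested snapshot first, then later snapshots.
        for sid in snapshot_chain[start:]:
            if idx in snap_blocks.get(sid, {}):
                data = snap_blocks[sid][idx]
                break
        if data is None:
            # Assumed fallback: the current version still holds the block.
            data = current_blocks[idx]
        out.append(data)
    return b"".join(out)[:snap_size]
```

For example, with a block size of 4, a block preserved by snapshot 2 is returned when snapshot 1 is read if snapshot 1 does not own that block itself, because the block was unchanged between the two snapshots.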
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus further includes a snapshot deleting module 25, wherein:
the snapshot deleting module 25 is configured to receive a request for deleting a snapshot from the client, and delete the snapshot according to the request.
The snapshot deleting module 25 is configured to search the trace files of all snapshots and check whether the snapshot requested to be deleted has a previous snapshot. When it has a previous snapshot, the module copies the data shared with the previous snapshot to the previous snapshot, records the change information in the trace file corresponding to the previous snapshot, and then deletes the snapshot requested to be deleted and its corresponding trace file. If the snapshot requested to be deleted has no data shared with the previous snapshot, the module directly deletes the snapshot requested to be deleted and its corresponding trace file.
The snapshot deleting module 25 is further configured to directly delete the snapshot requested to be deleted and the corresponding trace file when the snapshot requested to be deleted does not have the previous snapshot.
In a specific processing procedure, the snapshot deleting module 25 receives a snapshot deleting request from the client, where the snapshot deleting request includes a snapshot identifier of a snapshot requested to be deleted. And then searching the Track files of all snapshots, finding out the snapshot requested to be deleted, checking whether the snapshot requested to be deleted has a previous snapshot version, and acquiring the previous snapshot version and a corresponding snapshot number by checking a snapshot linked list recorded in an inode of the snapshot requested to be deleted. If the snapshot requested to be deleted has the previous snapshot version, whether the snapshot requested to be deleted is a file snapshot or a directory snapshot is further judged, if the snapshot requested to be deleted is the file snapshot, the processing is carried out according to the processing flow of deleting the file snapshot, and if the snapshot requested to be deleted is the directory snapshot, the processing is carried out according to the processing flow of the directory snapshot. And if the snapshot requested to be deleted has no previous snapshot version, directly deleting the snapshot requested to be deleted.
The processing flow of deleting a file snapshot mainly comprises the following steps: judging whether the stripe data of the snapshot requested to be deleted exists in the previous snapshot version; if not, copying the stripe data of the snapshot requested to be deleted to the previous snapshot version and recording the change information in the Track file corresponding to the previous snapshot. If the previous snapshot version already has the stripe data of the snapshot requested to be deleted, the snapshot requested to be deleted and its corresponding Track file are deleted directly.
The processing flow of deleting the directory snapshot mainly comprises the following steps: searching all record items of the directory snapshot requested to be deleted, deleting the record items of the directory snapshot requested to be deleted when the snapshot record item copy of the previous snapshot version exists, and generating the snapshot record item of the previous version by COW copy when the snapshot record item copy of the previous snapshot version does not exist. After the entry item of the directory is processed, the directory itself needs to be processed, whether the last snapshot version of the directory itself exists or not is further judged, if yes, the snapshot is deleted, if not, the COW generates the record item of the directory, and the directory information of the directory is added to the Track file of the previous snapshot version.
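The file snapshot branch of the deletion flow above can be sketched as follows. This is a minimal Python illustration, assuming in-memory dictionaries in place of stripe objects and Track files; `delete_file_snapshot` and its parameters are hypothetical names, and the directory snapshot branch is not covered.

```python
def delete_file_snapshot(snap_id, snapshot_chain, snap_stripes, track_files):
    """Delete one file snapshot, handing stripes the previous snapshot lacks to it.

    snapshot_chain: ascending list of snapshot IDs of the file (modified in place).
    snap_stripes:   {snapshot ID: {stripe index: bytes}}.
    track_files:    {snapshot ID: list of change records}.
    """
    pos = snapshot_chain.index(snap_id)
    if pos > 0:
        prev_id = snapshot_chain[pos - 1]
        prev = snap_stripes.setdefault(prev_id, {})
        for idx, data in snap_stripes.get(snap_id, {}).items():
            if idx not in prev:
                # The previous version has no copy of this stripe, so it
                # inherits the stripe and its Track file records the change.
                prev[idx] = data
                track_files.setdefault(prev_id, []).append(("stripe", idx))
    # With or without a previous snapshot, the snapshot and its Track file go.
    snap_stripes.pop(snap_id, None)
    track_files.pop(snap_id, None)
    snapshot_chain.remove(snap_id)
```

Stripes that the previous snapshot already owns are simply dropped with the deleted snapshot, since the previous snapshot does not depend on them.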
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus may further include an increment obtaining module 26, wherein:
the increment obtaining module 26 is configured to obtain an increment through the inter-snapshot comparison, where the increment includes an increment of data and an increment of metadata.
The data increment includes an increment generated by modified write and an increment generated by additional write, wherein the increment obtaining module 26 obtains the increment generated by modified write by scanning a trace file of the snapshot and obtains the increment generated by additional write by comparing an inode of the file with the inode of the corresponding snapshot.
The increment obtaining module 26 obtains the metadata increment by comparing the snapshot with the generation time in the inode of the corresponding non-snapshot version or by looking up the mark in the inode of the snapshot.
The comparison between snapshots superposes the increments between adjacent snapshots. For example, to obtain the increment between <snap1, snap7>, the three increment comparisons <snap1, snap3>, <snap3, snap5>, and <snap5, snap7> are carried out and their results are superposed. During increment comparison, three kinds of increment need to be identified: modification, deletion, and new addition. Increments are divided into data increments and metadata increments. Data increments are mainly generated by modification writes and additional (append) writes of file data, and metadata increments are mainly generated by modifying inodes or entry items. The increment acquisition process is accordingly divided into the following two parts:
1. Data increment part
1) The increment generated by modification writes is obtained by scanning the Track file of the snapshot to find which stripes were modified;
2) The increment generated by additional writes is obtained by comparing the size attribute of the inode with that of the snapshot-version inode.
2. Metadata increment part
1) Creation: the generation time in the non-snapshot-version inode is compared to determine whether the object was created on the basis of the current snapshot. For example, to find the increment between the <N, N1> snapshots, check whether the generation time of the object lies between the generation times of the <N, N1> snapshots;
2) Deletion: the Flags in the snapshot-version inode show that the snapshot version was generated because of a deletion;
3) Modification (inode modification or entry renaming): the Flags in the snapshot-version inode show that the snapshot version was generated because of a modification.
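The superposition of adjacent increments can be sketched in Python as follows. The source does not specify how conflicting entries are combined, so the combination rules below (for example, an object added and later deleted inside the range drops out of the result) are assumptions of this sketch, as are the function name and the string labels.

```python
def merge_deltas(adjacent_deltas):
    """Superpose the deltas of adjacent snapshot pairs into one overall delta.

    Each delta maps a path to "added", "modified", or "deleted".
    Deltas must be given in snapshot order, oldest pair first.
    """
    merged = {}
    for delta in adjacent_deltas:
        for path, change in delta.items():
            prior = merged.get(path)
            if prior is None:
                merged[path] = change
            elif prior == "added" and change == "deleted":
                del merged[path]            # created and removed inside the range
            elif prior == "added":
                merged[path] = "added"      # still new relative to the start
            elif prior == "deleted" and change == "added":
                merged[path] = "modified"   # same name, replaced content
            else:
                merged[path] = change       # later change wins
    return merged
```

Comparing only adjacent snapshots keeps each comparison bounded by one Track file, and the merge step then reduces the chain of small results to the increment between the two end snapshots.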
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus further includes a moving module 27, wherein:
the file system of this embodiment includes a first directory and a second directory, where the first directory and the second directory generate a snapshot, and the moving module 27 is configured to receive a moving instruction of the client, and move data in the first directory to the second directory, where the data in the first directory includes at least one of a subfile and a subdirectory in the first directory.
Specifically, the moving module 27 is configured to perform the following. In the first directory, it deletes the record item of the data, searches upward for all snapshot information, records all snapshot information found in the snapshot linked list of the inode of the data, generates a snapshot of the record item of the data in the directory data object of the first directory, and records the change information of the first directory in the corresponding trace file. The directory data object of the first directory includes the inode of the first directory and the record items corresponding to that inode. In the second directory, it generates a copy of the record item of the data, collects all snapshot information upward to update the snapshot linked list of the inode of the data and the generation time of the data, and records the change information of the second directory in the trace file corresponding to the second directory snapshot.
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus further includes a hard link management module 28, wherein:
The hard link management module 28 is configured to record, through a birth table, the relationship between the identifier (fid) of a hard link file and the identifiers of its parent directories.
The file system of the invention comprises a hard link file, wherein the hard link file is provided with a plurality of father directories, and the plurality of father directories generate snapshots. The hard link management module 28 of this embodiment is configured to record, for a hard link file, a relationship between the hard link file and its parent directory through a birth table.
For a hard-linked file, the relationship between the file identifier (fid) and the fids of its parent directories is recorded in a birth table. That is, all parent directory fids of file2 can be obtained by looking up the fid of file2, and the original parent directory (numbered 0) can be found from the parent directory numbers.
Since a hard link does not have its own inode, it shares one inode with its linked file, and when accessing a file on the hard link, it needs to consider whether there is snapshot management on its parent directory path. Therefore, the hard link management module 28 of this embodiment is further configured to receive a request for accessing the hard link file through a specified path, find all parent directories of the hard link file according to the birth table, and generate all paths and snapshots corresponding to all paths through which all the parent directories access the hard link file.
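A birth table of the kind described above can be sketched as follows. This is a minimal Python illustration; the class `BirthTable`, its methods, and the helper `snapshots_per_parent` are hypothetical names. The helper considers only the snapshots of each immediate parent directory, which simplifies the full path handling described in the text.

```python
class BirthTable:
    """Maps a file's fid to the fids of its parent directories, in link order."""

    def __init__(self):
        self._parents = {}

    def add_link(self, file_fid, parent_fid):
        """Record a parent directory and return its number (0 is the original parent)."""
        parents = self._parents.setdefault(file_fid, [])
        if parent_fid not in parents:
            parents.append(parent_fid)
        return parents.index(parent_fid)

    def parents(self, file_fid):
        """All parent directory fids of a file, original parent first."""
        return list(self._parents.get(file_fid, []))

    def original_parent(self, file_fid):
        """The parent directory numbered 0."""
        return self._parents[file_fid][0]


def snapshots_per_parent(table, file_fid, dir_snapshots):
    """For each parent directory of a hard-linked file, list that directory's snapshots.

    dir_snapshots: {directory fid: list of snapshot IDs}.
    """
    return {p: list(dir_snapshots.get(p, [])) for p in table.parents(file_fid)}
```

Because the hard link shares one inode with the linked file, the table is the only place where the set of parent directories is kept, so it has to be consulted whenever a snapshot on any of those directories may apply.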
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus further includes a rollback module 29, wherein:
the rollback module 29 is configured to receive a first-level directory rollback instruction and perform a first-level directory rollback operation.
The rollback module 29 is configured to receive a first-level directory rollback instruction, lock a first-level directory to which the instruction is to be rolled back, determine snapshots to which the instruction is to be rolled back, check a snapshot linked list of the first-level directory to which the instruction is to be rolled back, find all snapshots after the snapshots to which the instruction is to be rolled back, roll back files of the first-level directory one by one from large to small according to snapshot identifiers of the snapshots until the first-level directory is rolled back to which the instruction is to be rolled back, and then unlock the first-level directory to be rolled back.
The rollback module 29 is specifically configured to find the snapshot with the maximum snapshot identifier as the snapshot currently to be rolled back, scan the tracking file of that snapshot, and determine whether it is a file snapshot or a directory snapshot. When the snapshot with the maximum snapshot identifier is a file snapshot, the rollback module 29 deletes the stripe object of the snapshot with the maximum snapshot identifier, renames the corresponding stripe object of the file in the primary directory to the stripe object of the snapshot with the maximum snapshot identifier, and deletes the tracking file of that snapshot. When the snapshot with the maximum snapshot identifier is the snapshot to which the instruction requires rolling back, the rollback operation ends and the lock on the primary directory is released; otherwise, the module returns to the step of finding the snapshot with the maximum snapshot identifier as the snapshot currently to be rolled back and repeats the subsequent steps.
The rollback module 29 is further configured, when the snapshot with the maximum snapshot identifier is a directory snapshot, to scan the directory data object of the primary directory snapshot, where a directory data object includes an inode and the record items corresponding to the inode, and to find in that directory data object the snapshot whose identifier equals the maximum snapshot identifier. The module then determines whether the inode size of that snapshot has changed relative to the inode in the current version of the primary directory. When it has changed, the module reduces the current version of the primary directory to the size of the snapshot with the maximum snapshot identifier; otherwise, it copies the snapshot record item with the maximum snapshot identifier to the directory data object of the current version of the primary directory. The module then scans the directory data object of the current version of the primary directory and deletes the record items whose generation time is greater than the maximum snapshot identifier. When the snapshot with the maximum snapshot identifier is the snapshot to which the instruction requires rolling back, the rollback operation ends and the primary directory is unlocked; otherwise, the module returns to the step of finding the snapshot with the maximum snapshot identifier as the snapshot currently to be rolled back and repeats the subsequent steps.
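The outer loop of the rollback (lock, roll back one snapshot at a time from the largest identifier down to the target, unlock) can be sketched as follows. This Python sketch is illustrative only: the dictionary layout of `directory`, the function names, and the caller-supplied `rollback_one` callback, which stands in for the file and directory branches described above, are all assumptions.

```python
def rollback_order(snap_list, target_id):
    """Return the snapshot IDs to roll back through, newest first, ending at the target."""
    if target_id not in snap_list:
        raise ValueError("target snapshot is not in the snapshot linked list")
    return sorted((s for s in snap_list if s >= target_id), reverse=True)


def rollback(directory, target_id, rollback_one):
    """Lock the directory, roll back one snapshot at a time, then unlock.

    directory:    dict with "snap_list" (snapshot IDs) and "locked" (bool).
    rollback_one: function(directory, snap_id) that undoes one snapshot's changes.
    """
    directory["locked"] = True
    try:
        for snap_id in rollback_order(directory["snap_list"], target_id):
            rollback_one(directory, snap_id)
    finally:
        # The lock is released even if one of the steps fails.
        directory["locked"] = False
```

Rolling back in descending identifier order means each step only has to undo the changes recorded in a single tracking file, and the loop stops once the target snapshot itself has been processed.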
Referring to fig. 17, in another embodiment of the snapshot processing apparatus 200 of the present invention, the snapshot processing apparatus further includes a modification module 30, wherein:
the modification module 30 is configured to receive a modification operation instruction for modifying data of the file system that has generated the snapshot, and modify the data according to the modification operation instruction, where the modification operation instruction includes a metadata modification operation instruction and a file data modification operation instruction.
When the modification operation instruction is a metadata modification operation instruction, the modification module 30 is configured to copy metadata of the file system for data modification when writing, modify an inode of the file system, and record modification information in a trace file of the file system.
When the modification operation instruction is a file data modification operation instruction, the modification module 30 is configured to perform a redirection process during writing of data of the modified file system, then perform data modification of the file system, and record modification information in a trace file of the target file snapshot.
If a modification operation instruction for modifying the data of the file system which has not generated the snapshot is received, the modification module 30 may directly modify the data or modify the inode. Because the data of the file system has not generated the snapshot, the modification of the data does not have the condition that the data shared by the snapshot is affected.
Referring to fig. 18, fig. 18 is a schematic structural diagram of a snapshot processing apparatus 300 according to another embodiment of the present invention, which includes a processor 31, a memory 32, an input device 33, an output device 34, and a bus system 35, wherein:
the processor 31 controls the operation of the snapshot Processing apparatus 300, and the processor 31 may also be referred to as a CPU (Central Processing Unit). The processor 31 may be an integrated circuit chip having signal processing capabilities. The processor 31 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 32 may include a read-only memory and a random access memory, and provides instructions and data to the processor 31. A portion of the memory 32 may also include non-volatile random access memory (NVRAM).
The various components of the snapshot processing apparatus 300 are coupled together by a bus system 35, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be one or more physical lines, and when a plurality of physical lines are provided, may be divided into an address bus, a data bus, a control bus, and the like. The bus system 35 may include a power bus, a control bus, a status signal bus, and the like, in addition to the data bus. For clarity of illustration, however, the various buses are labeled as bus system 35 in the figures.
The input device 33 may be embodied as a mouse, keyboard, microphone, etc., while the output device 34 may be embodied as a display, audio device, or video device. Of course, the input device 33 and the output device 34 may also be implemented as a single input and output device, such as a touchscreen.
The memory 32 stores the following elements, executable modules or data structures, or a subset thereof, or an expanded set thereof:
Operation instructions: including various operation instructions for performing various operations.
Operating system: including various system programs for implementing various basic services and for handling hardware-based tasks.
In the embodiment of the present invention, the processor 31 performs the following operations by calling the operation instruction (which may be stored in the operating system) stored in the memory 32:
the processor 31 responds to a snapshot identification application sent by the client, generates a latest snapshot number, returns the latest snapshot number to the client as a snapshot identification, and updates an inode structure of the file system, where the inode structure of the file system at least includes an identification of the file system, a generation time of the file system, a latest snapshot identification of the file system, a format of the file system, and a snapshot linked list of the file system, where the file system includes at least one of a directory and a file.
The processor 31 further creates a trace file corresponding to the generated snapshot, and the trace file records the change information of the file system after the snapshot is generated.
The processor 31 may also be configured to receive a snapshot access request from the client, read the snapshot data requested by the client, and return the snapshot data to the client, where the snapshot access request is a directory snapshot access request or a file snapshot access request.
When the snapshot access request is a directory snapshot access request, the snapshot access request includes the snapshot identifier and the directory name requested to be accessed. The processor 31 finds, according to the snapshot access request, the directory data object of the directory snapshot and the directory data object of the directory, where a directory data object includes an inode and the record items corresponding to the inode. The processor 31 searches the directory data object of the directory snapshot item by item, reads the record items corresponding to inodes whose generation time is less than the requested snapshot identifier and whose snapshot identifier is greater than the requested snapshot identifier, and returns those record items to the client. It also searches the directory data object of the directory item by item and reads the record items corresponding to inodes whose generation time is less than or equal to the requested snapshot identifier.
When the snapshot access request is a file snapshot access request, the snapshot access request includes a snapshot identifier and a file name that are requested to be accessed, the processor 31 is configured to determine a data block that needs to be read according to the size of a snapshot that is requested to be accessed by the snapshot access request, and for each data block that needs to be read, find each data block that needs to be read from the current snapshot backward in the order of generating the snapshot until the data block that needs to be read is read, and return all the read data blocks that need to be read to the client.
The processor 31 may be further configured to receive a request for deleting a snapshot from the client, and delete the snapshot according to the request.
Specifically, the processor 31 is configured to search the trace files of all snapshots and check whether the snapshot requested to be deleted has a previous snapshot. When it has a previous snapshot, the processor 31 copies the data shared with the previous snapshot to the previous snapshot, records the change information in the trace file corresponding to the previous snapshot, and then deletes the snapshot requested to be deleted and its corresponding trace file. If the snapshot requested to be deleted has no data shared with the previous snapshot, the processor 31 directly deletes the snapshot requested to be deleted and its corresponding trace file.
The processor 31 may also be configured to directly delete the snapshot requested to be deleted and the corresponding trace file when the snapshot requested to be deleted does not have the previous snapshot.
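The deletion flow can be illustrated with the sketch below. It rests on several assumptions that are not stated in the text: snapshots are modelled as dictionaries of blocks, "shared data" is taken to mean blocks held by the deleted snapshot that the previous snapshot does not hold itself, and the trace file removed at the end is the one belonging to the deleted snapshot. All names are hypothetical.

```python
def delete_snapshot(snapshots, trace_files, target_id):
    """snapshots: id -> {block index: data}; trace_files: id -> list of change records."""
    earlier = [sid for sid in snapshots if sid < target_id]
    if earlier:
        prev_id = max(earlier)  # the snapshot generated just before the target
        for idx, data in snapshots[target_id].items():
            if idx not in snapshots[prev_id]:
                # The previous snapshot relied on this block: hand it over
                # and log the change in the previous snapshot's trace file.
                snapshots[prev_id][idx] = data
                trace_files[prev_id].append(("inherit", idx, target_id))
    # With no previous snapshot, or after the hand-over, delete directly.
    del snapshots[target_id]
    del trace_files[target_id]
```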
The processor 31 may be further configured to obtain increments (deltas) by comparing snapshots, where the increments include a data increment and a metadata increment.
The data increment includes an increment generated by modifying writes and an increment generated by appending writes. The processor 31 obtains the increment generated by modifying writes by scanning the trace file of the snapshot, and obtains the increment generated by appending writes by comparing the inode of the file with the inode of the corresponding snapshot.
The processor 31 obtains the metadata increment by comparing the generation time in the inode of the snapshot with that in the inode of its corresponding non-snapshot version, or by looking up the mark in the inode of the snapshot.
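The three sources of increment information might be combined as in the following sketch. The trace record layout `(kind, offset, length)`, the inode keys `size`, `birth` and `flag`, and the function name are all assumptions introduced for illustration.

```python
def compute_deltas(trace_records, snap_inode, live_inode):
    # Modifying writes: collected by scanning the snapshot's trace file.
    modified = sorted({(off, length)
                       for kind, off, length in trace_records
                       if kind == "modify"})
    # Appending writes: visible as growth of the file size between the two inodes.
    appended = None
    if live_inode["size"] > snap_inode["size"]:
        appended = (snap_inode["size"], live_inode["size"] - snap_inode["size"])
    # Metadata: differing generation times, or a mark set in the snapshot inode.
    metadata_changed = (live_inode["birth"] != snap_inode["birth"]
                        or bool(snap_inode.get("flag", False)))
    return {"modified": modified, "appended": appended,
            "metadata_changed": metadata_changed}
```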
The file system includes a first directory and a second directory, the first directory and the second directory generate snapshots, and the processor 31 may be further configured to receive a move instruction from the client, and move data in the first directory to the second directory, where the data in the first directory includes at least one of a sub-directory and a sub-file in the first directory.
Specifically, the processor 31 is configured to: delete the record item of the data from the first directory; search upward for all snapshot information, and record all the snapshot information found in the snapshot linked list of the inode of the data; generate a snapshot of the record item of the data in the directory data object of the first directory, and record the change information of the first directory in the corresponding trace file of the first directory, where the directory data object of the first directory includes the inode of the first directory and the record items corresponding to that inode; copy the record item of the data into the second directory, and collect all snapshot information upward to update the snapshot linked list of the inode of the data and the generation time of the data; and record the change information of the second directory in the trace file corresponding to the snapshot of the second directory.
Further, the file system includes a hard-link file, the hard-link file has a plurality of parent directories, the plurality of parent directories all generate snapshots, and the processor 31 is further configured to record the relationship between the hard-link file extension and its parent directory extension through a birth table.
The processor 31 is further configured to receive a request for accessing the hard-link file through a specified path, find all parent directories of the hard-link file according to the birth table, and generate all the paths through which the hard-link file can be accessed via those parent directories, together with the snapshots corresponding to those paths.
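One possible shape for such a birth table is sketched below. The text does not specify the table's layout, so the class, the use of a numeric file identifier as the key, and the stored `(parent directory, name)` pairs are assumptions.

```python
from collections import defaultdict


class BirthTable:
    """Maps a hard-link file to every (parent directory, link name) pair."""

    def __init__(self):
        self._links = defaultdict(set)

    def add_link(self, file_id, parent_dir, name):
        # Record one more parent directory under which the file is reachable.
        self._links[file_id].add((parent_dir, name))

    def all_paths(self, file_id):
        # Enumerate every path to the file, one per recorded parent directory.
        return sorted(f"{parent.rstrip('/')}/{name}"
                      for parent, name in self._links[file_id])
```

With the full set of paths available, the snapshots taken on each parent directory can be consulted for the same file regardless of which path the client used.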
The processor 31 is configured to receive a primary directory rollback instruction and perform a primary directory rollback operation.
Specifically, the processor 31 receives a primary directory rollback instruction, locks the primary directory to be rolled back by the instruction, and determines the snapshot to which the instruction is to roll back. It then looks up the snapshot linked list of that primary directory, finds all snapshots generated after the snapshot to which the instruction is to roll back, and rolls back the files of the primary directory one snapshot at a time, in descending order of snapshot identifier, until the snapshot to which the instruction is to roll back has been rolled back. Finally, it releases the lock on the primary directory.
To roll back one snapshot, the processor 31 finds the snapshot with the maximum snapshot identifier and takes it as the snapshot currently to be rolled back, scans the trace file of that snapshot, and determines whether it is a file snapshot or a directory snapshot. When it is a file snapshot, the processor 31 deletes the stripe object of that snapshot, renames the corresponding stripe object in the file of the primary directory to the stripe object of that snapshot, and deletes the trace file of that snapshot. If that snapshot is the snapshot to which the instruction is to roll back, the processor 31 ends the rollback operation and unlocks the primary directory; otherwise, it returns to the step of finding the snapshot with the maximum snapshot identifier and repeats the subsequent steps.
When the snapshot with the maximum snapshot identifier is a directory snapshot, the processor 31 is further configured to scan the directory data object of the primary directory snapshot, where the directory data object includes an inode and the record items corresponding to the inode, and to find in that directory data object the snapshot whose identifier equals the maximum snapshot identifier. The processor 31 then determines whether the inode size of that snapshot has changed relative to the inode of the current version of the primary directory. If it has changed, the current version of the primary directory is truncated to the size of that snapshot; otherwise, the record items of that snapshot are copied into the directory data object of the current version of the primary directory. The processor 31 then scans the directory data object of the current version of the primary directory, which likewise includes an inode and the record items corresponding to the inode, and deletes the record items whose generation time is greater than the maximum snapshot identifier. If the snapshot with the maximum snapshot identifier is the snapshot to which the instruction is to roll back, the processor 31 ends the rollback operation and unlocks the primary directory; otherwise, it returns to the step of finding the snapshot with the maximum snapshot identifier and repeats the subsequent steps.
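The control flow of the rollback (lock, walk the snapshots from the largest identifier down to the target, unlock) can be sketched as follows. The per-snapshot work for file and directory snapshots is abstracted into a caller-supplied `rollback_one` callback; that callback, the function name, and the list-based snapshot chain are assumptions for illustration only.

```python
import threading


def rollback_primary_directory(snapshot_ids, target_id, rollback_one, lock):
    if target_id not in snapshot_ids:
        raise ValueError("unknown target snapshot")
    done = []
    with lock:  # the primary directory stays locked for the whole rollback
        # The target snapshot and everything generated after it, ascending.
        pending = sorted(s for s in snapshot_ids if s >= target_id)
        while pending:
            current = pending.pop()  # largest remaining snapshot identifier
            rollback_one(current)    # undo a file snapshot or a directory snapshot
            done.append(current)
            if current == target_id:
                break                # the requested snapshot has been reached
    return done
```

Using the lock as a context manager releases it even if one of the per-snapshot steps raises, which is one way to avoid leaving the directory locked after a failed rollback.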
The processor 31 may be further configured to receive a modification operation instruction for modifying data of the file system that has generated the snapshot, and modify the data of the file system according to the modification operation instruction, where the modification operation instruction includes a metadata modification operation instruction and a file data modification operation instruction.
When the modification operation instruction is a metadata modification operation instruction, the processor 31 is specifically configured to perform copy-on-write on the metadata of the file system data to be modified, modify the inode of the file system data, and record the modification information in the trace file of the file system data.
When the modification operation instruction is a file data modification operation instruction, the processor 31 is specifically configured to perform redirect-on-write on the file system data to be modified, then modify the file data, and record the modification information in the trace file of the file system snapshot.
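The difference between the two write paths can be illustrated with a small in-memory model. This is a sketch under assumptions: the class, its fields, and the single-snapshot simplification are hypothetical, and real stripe objects are replaced by entries in a dictionary.

```python
class SnapshottedFile:
    def __init__(self, mode, data_blocks):
        self.store = dict(enumerate(data_blocks))   # location -> data
        self.live_map = {i: i for i in self.store}  # logical block -> location
        self.inode = {"mode": mode}
        self.snap_map = None
        self.snap_inode = None
        self.trace = []                             # stands in for the trace file

    def take_snapshot(self):
        self.snap_map = dict(self.live_map)  # only the block map is duplicated
        self.snap_inode = None               # metadata is not copied yet

    def set_metadata(self, key, value):
        # Copy-on-write: preserve the old inode on the first change after a snapshot.
        if self.snap_map is not None and self.snap_inode is None:
            self.snap_inode = dict(self.inode)
        self.inode[key] = value
        self.trace.append(("metadata", key))

    def write_block(self, logical, data):
        # Redirect-on-write: new data goes to a fresh location, old data stays put.
        new_loc = len(self.store)
        self.store[new_loc] = data
        self.live_map[logical] = new_loc
        self.trace.append(("data", logical))

    def read(self, logical, snapshot=False):
        block_map = self.snap_map if snapshot else self.live_map
        return self.store[block_map[logical]]
```

In this model a data write never copies the old block, whereas a metadata change copies the small inode once; both paths append a record to the trace so that later increment scans and rollbacks can find what changed.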
The method disclosed in the above embodiments of the present application may be applied to the processor 31, or implemented by the processor 31. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 31 or by instructions in the form of software. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 32, and the processor 31 reads the information in the memory 32 and completes the steps of the method in combination with its hardware.
From the description of the above embodiments, it can be understood that, in the snapshot management method and apparatus provided by the present invention, the monotonically increasing GLS is used as the snapshot identifier and returned to the client, and a tracking file that records the change information of the snapshotted file system is added on the server. In this way, a snapshot of any directory can be implemented for a distributed file system that has no file layout and no global file descriptor structure table, and data management is facilitated.
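The core of the scheme (a monotonically increasing snapshot number issued by the server, an updated file descriptor structure, and a newly created tracking file per snapshot) can be sketched as below. The class name, the dictionary-based descriptor, and its keys are hypothetical; the text does not prescribe this layout.

```python
import threading


class MetadataServer:
    def __init__(self):
        self._lock = threading.Lock()
        self._latest = 0        # latest snapshot number of the whole file system
        self.trace_files = {}   # snapshot id -> recorded change information

    def request_snapshot_id(self, descriptor):
        # Issue the next number under a lock so identifiers only ever increase.
        with self._lock:
            self._latest += 1
            snap_id = self._latest
            # Update the file descriptor structure of the snapshotted object.
            descriptor["latest_snapshot"] = snap_id
            descriptor.setdefault("snapshot_list", []).append(snap_id)
            # Create the tracking file that will record subsequent changes.
            self.trace_files[snap_id] = []
        return snap_id
```

Because the identifiers are ordered, comparisons such as "generation time less than or equal to the snapshot identifier" reduce to integer comparisons.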
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division of the modules or units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (9)

1. A snapshot management method, comprising:
a metadata server receives a snapshot identification application from a client, wherein the snapshot identification is used for identifying a snapshot generated by a file system of the client, and the metadata server is used for managing the file system and the snapshot of the file system;
the metadata server generates a latest snapshot number of the whole file system, returns the latest snapshot number as the snapshot identifier to the client, and updates a file descriptor structure of the file system, wherein the file descriptor structure of the file system at least comprises the identifier of the file system, the generation time of the file system, the latest snapshot identifier of the file system, a mark, a temporary field, a rename field and a snapshot linked list of the file system, and the file system comprises at least one of a file and a directory;
the metadata server creates a tracking file corresponding to the snapshot generated by the file system, the tracking file records change information of the file system after the snapshot is generated, and the tracking file is divided into a plurality of subfiles and bound with the metadata server;
wherein the latest snapshot number in the file system is monotonically increasing;
the file system comprises a first directory and a second directory, the first directory and the second directory generate snapshots, and the method further comprises the following steps:
receiving a moving instruction of a client, wherein the moving instruction indicates that data in a first directory is moved to a second directory, and the data comprises at least one of sub-directories and sub-files in the first directory;
deleting entries of the data of the first directory;
searching all snapshot information upwards, and recording all the found snapshot information in a snapshot linked list of a file descriptor of the data;
generating a snapshot of a record item of the data in a directory data object of the first directory, and recording change information of the first directory into a corresponding tracking file, where the directory data object of the first directory includes a file descriptor of the first directory and a record item corresponding to the file descriptor of the first directory;
copying and generating a record item of the data under the second directory, and collecting all snapshot information upwards to update the snapshot linked list of the file descriptor of the data and the generation time of the data;
and recording the change information of the second directory into a tracking file corresponding to the second directory snapshot.
2. The method of claim 1, further comprising:
receiving a directory snapshot access request from a client, wherein the directory snapshot access request comprises a snapshot identifier and a directory name which are requested to be accessed;
finding a directory data object of a directory snapshot and a directory data object of a directory according to the snapshot access request, wherein the directory data object of the directory snapshot comprises a file descriptor of the directory snapshot and a record item corresponding to the file descriptor of the directory snapshot, and the directory data object of the directory comprises a file descriptor of the directory and a record item corresponding to the file descriptor of the directory;
searching the directory data object of the directory snapshot, reading a record item corresponding to the file descriptor, wherein the generation time of the record item is less than the snapshot identifier requested to be accessed and the snapshot identifier of the record item is greater than the snapshot identifier requested to be accessed, and returning the record item to the client;
and searching the directory data object of the directory, reading a record item corresponding to the file descriptor, wherein the generation time of the record item is less than or equal to the snapshot identifier requested to be accessed, and returning the record item to the client.
3. The method of claim 1, further comprising:
receiving a file snapshot access request from a client, wherein the file snapshot access request comprises a snapshot identifier and a file name which are requested to be accessed;
determining a data block to be read according to the snapshot requested to be accessed by the snapshot access request;
for each data block needing to be read, searching for the data block from the current snapshot onward according to the sequence of generating the snapshots until the data block needing to be read is read;
and returning all the read data blocks needing to be read to the client.
4. The method of claim 1, further comprising:
receiving a first-level directory rollback instruction, locking a first-level directory to be rolled back by the first-level directory rollback instruction, and determining a snapshot to which the first-level directory rollback instruction is to be rolled back;
checking a snapshot linked list of the primary directory to be rolled back by the primary directory rollback instruction, and finding out all snapshots after the snapshot to which the primary directory rollback instruction is to be rolled back;
and rolling back the files of the primary directory one by one from large to small according to the snapshot identification of the snapshot until the files are rolled back to the snapshot to which the primary directory rollback instruction needs to be rolled back, and then unlocking the primary directory to be rolled back.
5. The method according to claim 4, wherein the step of rolling back the files of the primary directory one by one from large to small according to the snapshot identifier of the snapshot until the files are rolled back to the snapshot to which the primary directory rollback instruction is to be rolled back comprises:
finding out the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back, scanning the tracking file of the snapshot with the maximum snapshot identifier, and judging whether the snapshot with the maximum snapshot identifier is a file snapshot or a directory snapshot;
when the snapshot with the maximum snapshot identifier is a file snapshot, deleting the stripe object of the snapshot with the maximum snapshot identifier, and renaming the corresponding stripe object in the file of the primary directory as the stripe object of the snapshot with the maximum snapshot identifier;
deleting the tracking file of the snapshot with the maximum snapshot identifier;
and when the snapshot with the maximum snapshot identifier is the snapshot to which the primary directory rollback instruction needs to be rolled back, ending the rollback operation, otherwise, returning to the step of finding out the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back and the subsequent steps.
6. The method of claim 5,
when the snapshot identified by the maximum snapshot is a directory snapshot, scanning a directory data object of the primary directory snapshot, wherein the directory data object of the primary directory snapshot comprises a file descriptor and a record entry corresponding to the file descriptor of the primary directory snapshot;
finding the snapshot with the same snapshot identifier as the maximum snapshot identifier in the directory data object of the primary directory snapshot, and then judging whether the size of the snapshot file descriptor of the maximum snapshot identifier changes relative to the file descriptor in the current version of the primary directory;
when the change occurs, the current version of the primary directory is cut down to the size of the snapshot of the maximum snapshot identifier, otherwise, the snapshot record item of the maximum snapshot identifier is copied and generated to the directory data object of the current version of the primary directory;
scanning a directory data object of the current version of the primary directory, wherein the directory data object of the current version of the primary directory comprises a file descriptor and a record item corresponding to the file descriptor of the directory data object of the current version of the primary directory;
deleting the record item with the generation time larger than the maximum snapshot identification;
and when the snapshot with the maximum snapshot identifier is the snapshot to which the primary directory rollback instruction needs to be rolled back, ending the rollback operation, otherwise, returning to the step of finding out the snapshot with the maximum snapshot identifier as the current snapshot to be rolled back and the subsequent steps.
7. A snapshot processing apparatus, wherein the snapshot processing apparatus is a metadata server, the apparatus comprises a receiving module, a returning module, and a creating module, wherein:
the receiving module is used for receiving a snapshot identification application from the client, and the snapshot identification is used for identifying a snapshot generated by a file system of the client;
the return module is used for responding to the snapshot identification application received by the receiving module, generating the latest snapshot number of the whole file system, returning the latest snapshot number as the snapshot identification to the client, and updating a file descriptor structure of the file system, wherein the file descriptor structure at least comprises the identification of the file system, the generation time of the file system, the latest snapshot identification of the file system, a mark, a temporary field and a rename field of the file system and a snapshot linked list of the file system, and the file system comprises at least one of a file and a directory;
the creating module is used for creating a tracking file corresponding to the snapshot generated by the file system, the tracking file records the change information of the file system after the snapshot is generated, and the tracking file is divided into a plurality of subfiles and is bound with a metadata server;
wherein the latest snapshot number in the file system is monotonically increasing;
the file system comprises a first directory and a second directory, the first directory and the second directory generate snapshots, and the apparatus further comprises a moving module, wherein the moving module is configured to: receive a moving instruction of a client, the moving instruction instructing to move data under the first directory to the second directory, and the data including at least one of sub-directories and sub-files under the first directory; delete a record item of the data of the first directory; search all snapshot information upwards, and record all the found snapshot information in a snapshot linked list of a file descriptor of the data; generate a snapshot of the record item of the data in a directory data object of the first directory, and record change information of the first directory in a corresponding tracking file of the first directory, the directory data object of the first directory including the file descriptor of the first directory and a record item corresponding to the file descriptor of the first directory; copy and generate a record item of the data under the second directory, and collect all snapshot information upwards to update the snapshot linked list of the file descriptor of the data and the generation time of the data; and record the change information of the second directory into a tracking file corresponding to the snapshot of the second directory.
8. The apparatus according to claim 7, wherein the apparatus further comprises a snapshot access processing module, the snapshot access processing module is configured to receive a directory snapshot access request from the client, where the directory snapshot access request includes a snapshot identifier and a directory name which are requested to be accessed, and to find a directory data object of a directory snapshot and a directory data object of a directory according to the directory snapshot access request, where the directory data object of the directory snapshot includes a file descriptor of the directory snapshot and record items corresponding to the file descriptor of the directory snapshot, and the directory data object of the directory includes a file descriptor of the directory and record items corresponding to the file descriptor of the directory; searching the directory data object of the directory snapshot, reading a record item corresponding to the file descriptor, wherein the generation time of the record item is less than the snapshot identifier requested to be accessed and the snapshot identifier of the record item is greater than the snapshot identifier requested to be accessed, returning the record item to the client, searching the directory data object of the directory, reading a record item corresponding to the file descriptor, wherein the generation time of the record item is less than or equal to the snapshot identifier requested to be accessed, and returning the record item to the client; or
The snapshot access processing module is further configured to receive a file snapshot access request from a client, where the file snapshot access request includes a snapshot identifier and a file name that are requested to be accessed, determine, according to a snapshot that is requested to be accessed by the file snapshot access request, a data block that needs to be read, and, for each data block that needs to be read, find, from the current snapshot to the next according to a snapshot generation sequence, until the data block that needs to be read is read; and returning all the read data blocks needing to be read to the client.
9. The apparatus according to claim 7, wherein the apparatus further comprises a rollback module, and the rollback module is configured to receive a primary directory rollback instruction, lock a primary directory to be rolled back by the primary directory rollback instruction, determine snapshots to which the primary directory rollback instruction is to be rolled back, view a snapshot linked list of the primary directory to be rolled back by the primary directory rollback instruction, find all snapshots after the snapshots to which the primary directory rollback instruction is to be rolled back, roll back files of the primary directory one by one from large to small according to snapshot identifiers of the snapshots until the snapshots to which the primary directory rollback instruction is to be rolled back are rolled back, and then release the lock of the primary directory to be rolled back.
CN201310690529.6A 2013-12-13 2013-12-13 Snapshot management method and device Active CN104714755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310690529.6A CN104714755B (en) 2013-12-13 2013-12-13 Snapshot management method and device

Publications (2)

Publication Number Publication Date
CN104714755A CN104714755A (en) 2015-06-17
CN104714755B true CN104714755B (en) 2020-01-03

Family

ID=53414142

Country Status (1)

Country Link
CN (1) CN104714755B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045542B (en) * 2015-09-11 2018-08-03 浪潮(北京)电子信息产业有限公司 A kind of method and device for realizing snapshot management
CN105335253B (en) * 2015-10-28 2019-01-15 北京百度网讯科技有限公司 The method and apparatus for creating virtual machine system disk snapshot
CN105302922B (en) * 2015-11-24 2018-07-06 无锡江南计算技术研究所 A kind of distributed file system snapshot implementing method
CN106250265A (en) * 2016-07-18 2016-12-21 乐视控股(北京)有限公司 Data back up method and system for object storage
CN107783776B (en) * 2016-08-26 2021-10-15 斑马智行网络(香港)有限公司 Processing method and device of firmware upgrade package and electronic equipment
CN108572986B (en) * 2017-03-13 2022-05-17 华为技术有限公司 Data updating method and node equipment
CN108255638B (en) * 2017-06-29 2021-05-28 新华三技术有限公司 Snapshot rollback method and device
CN107526840A (en) * 2017-09-14 2017-12-29 郑州云海信息技术有限公司 File system snapshot querying method, device and computer-readable recording medium
CN107609176B (en) * 2017-09-29 2019-09-13 郑州云海信息技术有限公司 A kind of method and system that snapshot distributed storage is locally stored
CN109783274B (en) * 2017-11-15 2023-03-14 阿里巴巴集团控股有限公司 Disk snapshot management method and device and readable storage medium
CN107958034A (en) * 2017-11-20 2018-04-24 郑州云海信息技术有限公司 Distribution method, device and the medium of the inode number of distributed file system
CN108228226B (en) * 2017-12-29 2021-07-13 北京元心科技有限公司 Hard link differential method and device and corresponding terminal
CN108829813A (en) * 2018-06-06 2018-11-16 郑州云海信息技术有限公司 A kind of File Snapshot method and system based on distributed memory system
CN108958888A (en) * 2018-07-04 2018-12-07 联想(北京)有限公司 The data processing method and processing system of electronic equipment
CN110866063B (en) * 2018-08-27 2023-10-31 阿里云计算有限公司 Data tracking processing method and device
CN109344118B (en) * 2018-09-26 2021-10-15 郑州云海信息技术有限公司 Snapshot rollback recovery method, system, device and computer readable storage medium
CN109189614A (en) * 2018-10-19 2019-01-11 郑州云海信息技术有限公司 A kind of snapshot rollback method and device
CN109408294A (en) * 2018-11-13 2019-03-01 郑州云海信息技术有限公司 A kind of snapshot rollback method, device, equipment and storage medium
CN109739819A (en) * 2019-01-15 2019-05-10 北京智融时代信息技术有限公司 Snapshot lossless compression method, device, equipment and the readable storage medium storing program for executing that can be recalled
CN111506583A (en) * 2019-01-31 2020-08-07 北京嘀嘀无限科技发展有限公司 Update method, update apparatus, server, computer device, and storage medium
CN110032474B (en) * 2019-04-12 2020-03-06 苏州浪潮智能科技有限公司 Method, system and related components for determining snapshot occupied capacity
CN111143126A (en) * 2019-12-20 2020-05-12 浪潮电子信息产业股份有限公司 Data copying method, system and related components of distributed file system
CN111190878B (en) * 2019-12-29 2022-04-22 北京浪潮数据技术有限公司 Method, device, equipment and storage medium for sharing access NAS snapshot
CN111782587A (en) * 2020-06-30 2020-10-16 北京三快在线科技有限公司 Snapshot information recording method, device, equipment and storage medium
CN112650723A (en) * 2020-12-28 2021-04-13 北京浪潮数据技术有限公司 File sharing method, device, equipment and computer readable storage medium
CN112596956B (en) * 2020-12-28 2024-02-13 北京浪潮数据技术有限公司 File system management method, device and related components
CN113032346B (en) * 2021-04-12 2023-05-02 曙光信息产业股份有限公司 File system freezing method, management method, device, equipment and storage medium
US11822804B2 (en) 2021-10-04 2023-11-21 Vmware, Inc. Managing extent sharing between snapshots using mapping addresses
CN113687834B (en) * 2021-10-27 2022-02-18 深圳华锐金融技术股份有限公司 Distributed system node deployment method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1766843A (en) * 2004-10-07 2006-05-03 微软公司 Method and system for limiting resource usage of a version store
CN101178677A (en) * 2007-11-09 2008-05-14 中国科学院计算技术研究所 Computer system for protecting software and method for protecting software

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011037624A1 (en) * 2009-09-22 2011-03-31 Emc Corporation Snapshotting a performance storage system in a system for performance improvement of a capacity optimized storage system
CN102971698B (en) * 2012-06-29 2014-07-09 华为技术有限公司 Snapshot data-processing method and system, storage system and snapshot agency

Also Published As

Publication number Publication date
CN104714755A (en) 2015-06-17

Similar Documents

Publication Publication Date Title
CN104714755B (en) Snapshot management method and device
AU2021202623B2 (en) System for synchronization of changes in edited websites and interactive applications
US11461365B2 (en) Atomic moves with lamport clocks in a content management system
EP2422282B1 (en) Asynchronous distributed object uploading for replicated content addressable storage clusters
CN102567140B (en) File system backup using change journal
US8423733B1 (en) Single-copy implicit sharing among clones
CN106484820B (en) Renaming method, access method and device
CN104641365A (en) System and method for managing deduplication using checkpoints in a file storage system
CN104778192A (en) Representing directory structure in content-addressable storage systems
CN107330024B (en) Storage method and device of tag system data
CN113342741B (en) Snapshot implementation method and device, electronic equipment and computer readable storage medium
CN111176901B (en) HDFS deleted file recovery method, terminal device and storage medium
US11954066B2 (en) Coalescing storage log entries
US20230147552A1 (en) Methods and systems for ordering operations on a file system having a hierarchical namespace
CN112181899A (en) Metadata processing method and device and computer readable storage medium
CN117290298A (en) Data processing method and related device
CN104166723A (en) Data shredding method and device for resilient file system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220118

Address after: 450046 Floor 9, building 1, Zhengshang Boya Plaza, Longzihu wisdom Island, Zhengdong New Area, Zhengzhou City, Henan Province

Patentee after: XFusion Digital Technologies Co., Ltd.

Address before: 518129 Huawei headquarters office building, Bantian, Longgang District, Shenzhen, Guangdong

Patentee before: Huawei Technologies Co., Ltd.