CN115599743A - Method and device for migrating files and storage medium - Google Patents

Info

Publication number
CN115599743A
Authority
CN
China
Prior art keywords
file
metadata
directory
source
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111044488.4A
Other languages
Chinese (zh)
Inventor
王张平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN115599743A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/119 Details of migration of file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services

Abstract

The application discloses a method, an apparatus, and a storage medium for migrating files, and belongs to the field of communications. The method comprises: acquiring a handle of each file directory included in a source file system, wherein each file directory includes at least one file, and the at least one file is located in the source file system; acquiring metadata of each file in the source file system based on the handle of each file directory; and migrating a first file to a destination file system based on the metadata of each file, wherein the first file is a file that has changed or been newly added on the source file system after a first time, and the first time is the time at which files were last migrated to the destination file system. The application can improve the efficiency of file migration.

Description

Method, device and storage medium for migrating files
The present application claims priority to Chinese patent application No. 202110773762.5, entitled "a method for fast incremental migration of mass files" and filed on 8/7/2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of communications, and in particular, to a method, an apparatus, and a storage medium for migrating a file.
Background
In the field of file storage, files are often required to be migrated, i.e., files in a source filesystem are required to be migrated to a destination filesystem.
Currently, when files are migrated, the file path of each file in the source file system is acquired first. Based on the file path of any given file, the metadata of that file can be acquired from the source file system only by exchanging Portable Operating System Interface (POSIX) interface call messages with the source file system multiple times. The metadata of each file in the source file system is acquired in this manner, and the metadata of each file in the destination file system is acquired from the destination file system in the same manner.
Based on the metadata of each file in the source file system and the metadata of each file in the destination file system, the files that have changed or been newly added in the source file system relative to the destination file system are identified and migrated from the source file system to the destination file system.
Whether metadata is acquired from the source file system or from the destination file system, multiple POSIX interface call messages must be exchanged for every file. The number of exchanged messages is therefore excessive, which makes metadata acquisition inefficient and in turn reduces the efficiency of file migration.
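As a rough illustration of the path-based approach described above, the following Python sketch walks a directory tree by path and issues one `os.stat` call per file, so the number of calls grows with the number of files. The function names `collect_metadata_by_path` and `build_demo_tree`, the demonstration tree, and the selection of metadata fields are assumptions made for illustration only and are not taken from the application.

```python
import os
import tempfile


def collect_metadata_by_path(root):
    """Walk `root` by path and stat every file individually.

    Returns (metadata, stat_calls): a dict mapping each file path to a
    small metadata dict, and the number of per-file stat calls issued.
    """
    metadata = {}
    stat_calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)  # at least one call for every single file
            stat_calls += 1
            metadata[path] = {
                "inode": st.st_ino,
                "mtime": st.st_mtime,
                "ctime": st.st_ctime,
                "mode": st.st_mode,
            }
    return metadata, stat_calls


def build_demo_tree(root):
    """Create a tiny directory tree (hypothetical layout) for the demo."""
    for sub in (("Year1", "Month1"), ("Year1", "Month2")):
        os.makedirs(os.path.join(root, *sub))
    for rel in (("Year1", "Month1", "a.txt"),
                ("Year1", "Month1", "b.txt"),
                ("Year1", "Month2", "c.txt")):
        with open(os.path.join(root, *rel), "w") as fh:
            fh.write("demo")


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    demo_metadata, demo_stat_calls = collect_metadata_by_path(demo_root)

print(f"{len(demo_metadata)} files required {demo_stat_calls} stat calls")
```

In this sketch the call count equals the file count; over a network file system each such call would typically correspond to one or more message exchanges.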
Disclosure of Invention
The application provides a method and a device for file migration and a storage medium, so as to improve the efficiency of file migration. The technical scheme is as follows:
In a first aspect, the present application provides a method for migrating files. In the method, a handle of each file directory included in a source file system is obtained, where each file directory includes at least one file and the at least one file is located in the source file system. Metadata of each file in the source file system is obtained based on the handle of each file directory. A first file is then migrated to a destination file system based on the metadata of each file, where the first file is a file that has changed or been newly added on the source file system after a first time, and the first time is the time at which files were last migrated to the destination file system.
Because the metadata of each primary file in a file directory is acquired from the source file system directly and in a batch based on the handle of that file directory, POSIX interface call messages do not need to be exchanged multiple times for each file. This improves the efficiency of acquiring file metadata and, in turn, the efficiency of migrating files.
In one possible implementation, a query request message is sent to the source file system, where the query request message includes a directory identification of the root directory of the source file system. A query response message sent by the source file system is received, where the query response message includes a handle of the root directory. In this way, the source file system provides the handle of the root directory in response to the directory identification of the root directory, and the handle of each file directory in the source file system can then be acquired based on the handle of the root directory.
In another possible implementation, a first read request message is sent to the source file system, where the first read request message includes a handle of a first file directory, and the first file directory is a file directory in the source file system whose handle has already been acquired. A first read response message sent by the source file system is received, where the first read response message includes a handle of a primary subfile directory in the first file directory, and no other file directory lies between the first file directory and the primary subfile directory. In this way, the source file system provides the handles of the primary subfile directories under a file directory of any level in response to the handle of that file directory, and repeating this process yields the handle of each file directory in the source file system.
In another possible implementation, a second read request message is sent to the source file system, where the second read request message includes a handle of a second file directory, and the second file directory is a file directory in the source file system. A second read response message sent by the source file system is received, where the second read response message includes metadata of a primary file in the second file directory, and no other file directory lies between the second file directory and the primary file. In this way, the metadata of the primary files in the second file directory is acquired directly through the handle of the second file directory, which improves the efficiency of acquiring file metadata.
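The three message exchanges described above (query request, first read request, second read request) can be sketched with an in-memory stand-in for the source file system. Everything in this Python sketch is hypothetical: the class `MockSourceFileSystem`, its method names, the handle strings, and the dictionary shape of the responses are assumptions chosen for illustration, not an interface defined by the application.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    dir_id: int                                   # directory identification
    handle: str                                   # handle of the file directory
    subdirs: list = field(default_factory=list)   # primary subfile directories
    files: list = field(default_factory=list)     # (file_id, attrs) of primary files


class MockSourceFileSystem:
    """In-memory stand-in for the source file system (hypothetical)."""

    def __init__(self, root):
        self.root = root
        self._by_handle = {}
        self._index(root)

    def _index(self, directory):
        self._by_handle[directory.handle] = directory
        for sub in directory.subdirs:
            self._index(sub)

    def query(self, dir_id):
        """Query request: root directory identification -> root handle."""
        if dir_id != self.root.dir_id:
            raise KeyError(dir_id)
        return {"handle": self.root.handle}

    def read_subdir_handles(self, handle):
        """First read request: handle -> handles of primary subfile directories."""
        return {"handles": [d.handle for d in self._by_handle[handle].subdirs]}

    def read_file_metadata(self, handle):
        """Second read request: handle -> metadata of all primary files, in one batch."""
        directory = self._by_handle[handle]
        return {"metadata": [{"file_id": fid, "attrs": attrs}
                             for fid, attrs in directory.files]}


month1 = Directory(3, "h-month1", files=[(111, {"mtime": 10}),
                                         (112, {"mtime": 11}),
                                         (113, {"mtime": 12})])
year1 = Directory(2, "h-year1", subdirs=[month1])
root = Directory(1, "h-root", subdirs=[year1])
source_fs = MockSourceFileSystem(root)

root_handle = source_fs.query(1)["handle"]
year_handles = source_fs.read_subdir_handles(root_handle)["handles"]
month_handles = source_fs.read_subdir_handles(year_handles[0])["handles"]
month1_metadata = source_fs.read_file_metadata(month_handles[0])["metadata"]
```

In this sketch, a single second read request returns the metadata of all three primary files of the directory, in contrast to one or more requests per file.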
In another possible implementation, the metadata of the first file includes a file identification and a file attribute of the first file.
In another possible implementation, the file identification of the first file is obtained based on a first metadata sequence and a second metadata sequence, where the first metadata sequence includes the metadata of each file, and the second metadata sequence includes the metadata of each file in the file system at the first time. The first file is migrated from the source file system to the destination file system based on the file identification of the first file. Because the metadata in the second metadata sequence was all acquired at the first time, the metadata of files does not need to be acquired from the destination file system when files are migrated, which further improves the efficiency of migrating files.
In another possible implementation, the metadata of each file in the first metadata sequence is arranged in the order of the file identification of each file.
In another possible implementation, first metadata indicated by first indication information and second metadata indicated by second indication information are obtained, where the first metadata is metadata in the first metadata sequence and the second metadata is metadata in the second metadata sequence. When the file identification included in the first metadata is the same as the file identification included in the second metadata, and the file attribute included in the first metadata differs from the file attribute included in the second metadata, it is determined that the file identification included in the first metadata is the file identification of the first file, and that the corresponding file is a file that has changed on the source file system after the first time. In this way, a changed first file can be found quickly by comparing the metadata indicated by the two pieces of indication information, which further improves the efficiency of migrating the first file.
In another possible implementation, when the file identification included in the first metadata is the same as the file identification included in the second metadata, and the file attribute included in the first metadata differs from the file attribute included in the second metadata, the first indication information is set to indicate the next metadata after the first metadata in the first metadata sequence, and the second indication information is set to indicate the next metadata after the second metadata in the second metadata sequence. In this way, other changed or newly added files are found and migrated by moving the first indication information and the second indication information.
In another possible implementation, the metadata of each file in the first metadata sequence is arranged in ascending order of file identification. When the file identification included in the first metadata is smaller than the file identification included in the second metadata, it is determined that the file identification included in the first metadata is the file identification of the first file, and that the corresponding file is a file newly added to the source file system after the first time. In this way, a newly added first file can be found quickly by comparing the metadata indicated by the two pieces of indication information, which improves the efficiency of migrating the first file.
In another possible implementation, when the file identification included in the first metadata is smaller than the file identification included in the second metadata, the first indication information is set to indicate the next metadata after the first metadata in the first metadata sequence. In this way, other changed or newly added files are found and migrated by moving the first indication information.
In another possible implementation, the second metadata sequence is updated to the first metadata sequence. When files are next migrated, the updated second metadata sequence is used directly and the metadata of files does not need to be acquired from the destination file system, which improves the efficiency of migrating files.
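The comparison of the two metadata sequences can be sketched in Python as a two-index merge, with the indices `i` and `j` playing the role of the first and second indication information. This is a minimal sketch under stated assumptions: metadata is modeled as `(file_id, attrs)` tuples, both sequences are sorted in ascending order of file identification, and the function name `find_files_to_migrate` is hypothetical. Three cases are not spelled out in the implementations above and are handled here by assumption, as marked in the comments: identical metadata on both sides, an identification present only in the second sequence, and entries remaining after the second sequence is exhausted.

```python
def find_files_to_migrate(first_seq, second_seq):
    """Compare the current sequence with the one recorded at the first time.

    Both arguments are lists of (file_id, attrs) sorted ascending by file_id.
    Returns (changed_ids, added_ids).
    """
    changed, added = [], []
    i = j = 0  # first and second indication information
    while i < len(first_seq):
        cur_id, cur_attrs = first_seq[i]
        if j >= len(second_seq):
            # Assumption: entries left once the second sequence is
            # exhausted are treated as newly added files.
            added.append(cur_id)
            i += 1
            continue
        old_id, old_attrs = second_seq[j]
        if cur_id == old_id:
            if cur_attrs != old_attrs:
                changed.append(cur_id)  # same identification, different attributes
            # Advancing both indices for unchanged files is an assumption;
            # the text describes the advance for the changed case.
            i += 1
            j += 1
        elif cur_id < old_id:
            added.append(cur_id)        # identification absent at the first time
            i += 1
        else:
            # Assumption: old_id no longer exists in the source (for example,
            # the file was deleted), so only the second index advances.
            j += 1
    return changed, added


second_sequence = [(1, {"mtime": 10}), (2, {"mtime": 10}), (4, {"mtime": 10})]
first_sequence = [(1, {"mtime": 10}), (2, {"mtime": 20}), (3, {"mtime": 15}),
                  (4, {"mtime": 10}), (5, {"mtime": 30})]

changed_ids, added_ids = find_files_to_migrate(first_sequence, second_sequence)

# After migration, the second sequence is replaced by the first one so that
# the next run compares against it without querying the destination.
second_sequence = list(first_sequence)
next_changed, next_added = find_files_to_migrate(first_sequence, second_sequence)
```

Because each index only moves forward, the comparison visits every entry of each sequence at most once.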
In a second aspect, the present application provides an apparatus for migrating files, configured to perform the method in the first aspect or in any possible implementation of the first aspect. Specifically, the apparatus comprises units for performing the method in the first aspect or in any possible implementation of the first aspect.
In a third aspect, the present application provides an apparatus for migrating a file, the apparatus comprising a processor and a memory. The processor and the memory can be connected through an internal connection. The memory is configured to store a program, and the processor is configured to execute the program in the memory, so that the apparatus performs the method in the first aspect or any possible implementation manner of the first aspect.
In a fourth aspect, the present application provides a computer program product, which comprises a computer program stored in a computer-readable storage medium, and which, when loaded by a processor, implements the method of the first aspect or any possible implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium for storing a computer program, which is loaded by a processor to perform the method of the first aspect or any possible implementation manner of the first aspect.
Drawings
FIG. 1 is a schematic diagram of a network architecture provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a multi-level file directory provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of another multi-level file directory provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of another multi-level file directory provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of another multi-level file directory provided by an embodiment of the present application;
FIG. 6 is a flowchart of a method for migrating files according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for migrating files according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another apparatus for migrating files according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, an embodiment of the present application provides a network architecture 100, where the network architecture 100 includes:
a host 101, a source filesystem 102, and a destination filesystem 103, the host 101 can communicate with the source filesystem 102 and the destination filesystem 103, respectively.
Both the source filesystem 102 and the destination filesystem 103 are filesystems for storing files. Optionally, the file system is a Network File System (NFS) file system or a Common Internet File System (CIFS) file system, and the like.
The source file system 102 can be used to store files generated by an application.
The application may access the source file system 102 at runtime, e.g., the application may generate a file and save the generated file to the source file system 102, modify a file stored in the source file system 102, and/or delete a file in the source file system 102.
The network architecture 100 is applied to application scenarios such as migration, backup and/or flow of unstructured data. In this application scenario, files in the source filesystem 102 need to be migrated to the destination filesystem 103.
When the host 101 migrates files from the source file system 102 to the destination file system 103 for the first time, the host 101 migrates every file in the source file system 102 to the destination file system 103. Because the application continues to access the source file system 102, after the first migration the application may modify files stored in the source file system 102 and/or save newly generated files to the source file system 102.
Periodically, or when triggered by operation and maintenance personnel, the host 101 determines the files that have changed and/or been newly added in the source file system 102 after a first time, where the first time is the time at which files were last migrated to the destination file system 103, and migrates those changed and/or newly added files from the source file system 102 to the destination file system 103.
A file that has changed in the source file system 102 after the first time is a file stored in the source file system 102 that the application modified after the first time. A file newly added to the source file system 102 after the first time is a new file that the application generated after the first time.
In some embodiments, source file system 102 may be a device or a cluster of devices, and destination file system 103 may be a device or a cluster of devices. Alternatively, source filesystem 102 and destination filesystem 103 may be two systems on the same device or on the same device cluster.
In some embodiments, host 101 may be a separate device from source file system 102 and destination file system 103, or host 101 may be the same device as source file system 102, or host 101 may be the same device as destination file system 103, or host 101 may be a device in a device cluster when source file system 102 is the device cluster, or host 101 may be a device in the device cluster when destination file system 103 is the device cluster.
In some embodiments, the device used to run the application may be a separate device from host 101, source filesystem 102, and destination filesystem 103, or the device used to run the application may be the same device as host 101, or the device used to run the application may be the same device as source filesystem 102, or, when source filesystem 102 is a device cluster, the device used to run the application may be a device in the device cluster, or the device used to run the application may be the same device as destination filesystem 103, or, when destination filesystem 103 is a device cluster, the device used to run the application may be a device in the device cluster.
Referring to FIG. 2, the source file system 102 maintains files in a multi-level file directory. The multilevel file directory is of a tree structure, leaf nodes of the multilevel file directory of the tree structure are files, non-leaf nodes are file directories, root nodes are root directories, and the root directories are also file directories.
For ease of description, any file directory is referred to as a first file directory. The first file directory may be the root directory of the source file system 102 or another file directory in the source file system 102 other than the root directory. The first file directory includes at least one primary subfile directory and/or at least one primary file. No other file directory lies between each primary subfile directory and the first file directory, and no other file directory lies between each primary file and the first file directory.
For example, referring to the multi-level file directory shown in FIG. 2, the root directory includes a plurality of primary subfile directories, namely file directory 1, file directory 2, …. File directory 1 includes a plurality of primary files, namely file 11, file 12, …. File directory 2 includes two primary files, namely file 21 and file 22, and one primary subfile directory, namely file directory 21. File directory 21 includes three primary files, namely file 211, file 212, and file 213.
Referring next to FIG. 3, another example of a multi-level file directory in the source file system 102 is illustrated. The root directory of the source file system 102 is "Root", and the root directory "Root" includes a plurality of primary subfile directories used for storing files of different years, namely the file directories "Year1", "Year2", …. The file directory "Year1" corresponds to one year and includes a plurality of primary subfile directories used for storing files of the different months of that year, namely the file directories "Month1", "Month2", …. The file directory "Month1" includes the primary files file 111, file 112, and file 113, and the file directory "Month2" includes the primary files file 121, file 122, and file 123. The file directory "Year2" has the same meaning as the file directory "Year1", and its description is not repeated.
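The tree structure of FIG. 3 can be written down as a nested Python dictionary in which a directory maps to a dictionary of its children and a file maps to `None`. This representation and the helper names `primary_subdirectories` and `primary_files` are assumptions for illustration; only the "Year1" subtree enumerated above is included, and "Year2" is omitted.

```python
# Directories map to dicts of their children; files map to None.
# Only the "Year1" subtree listed in the description is modeled here.
FIG3_TREE = {
    "Root": {
        "Year1": {
            "Month1": {"file 111": None, "file 112": None, "file 113": None},
            "Month2": {"file 121": None, "file 122": None, "file 123": None},
        },
    }
}


def primary_subdirectories(directory):
    """Names of directories with no other directory between them and `directory`."""
    return [name for name, child in directory.items() if isinstance(child, dict)]


def primary_files(directory):
    """Names of files located directly in `directory`."""
    return [name for name, child in directory.items() if child is None]


year1 = FIG3_TREE["Root"]["Year1"]
print(primary_subdirectories(year1))      # directories directly under "Year1"
print(primary_files(year1["Month1"]))     # files directly under "Month1"
```

Note that file 111 is a primary file of "Month1" but not of "Year1", because the file directory "Month1" lies between them.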
Each file directory in the source file system 102 (including the root directory and the subfile directories) has a handle, and the handle of a file directory uniquely identifies that file directory. The host 101 may use the handle of a file directory to directly obtain, from the source file system 102, the metadata of each primary file included in the file directory and/or the handle and metadata of each primary subfile directory of the file directory.
In some embodiments, the handle of a file directory is an identifier for identifying the file directory. In this embodiment of the present application, the source file system is configured to provide the host 101 with the handle of a file directory, so that the host 101 can obtain the handle of the file directory.
For example, referring to FIG. 3, the host 101 may use the handle of the root directory "Root" to directly obtain, from the source file system 102, the handle and/or metadata of each primary subfile directory of the root directory "Root", that is, the handle and/or metadata of the file directory "Year1", the handle and/or metadata of the file directory "Year2", …. As another example, the host 101 uses the handle of the file directory "Year1" to directly obtain, from the source file system 102, the handle and/or metadata of each primary subfile directory of the file directory "Year1", that is, the handle and/or metadata of the file directory "Month1", the handle and/or metadata of the file directory "Month2", …. As a further example, the host 101 uses the acquired handle of the file directory "Month1" to directly obtain, from the source file system 102, the metadata of each file included in the file directory "Month1", that is, the metadata of file 111, the metadata of file 112, and the metadata of file 113. The handles of the other file directories and the metadata of the other files shown in FIG. 3 are acquired in the same manner.
For each file in the source file system 102, the metadata for the file includes a file identification and file attributes for the file, the file identification for the file being used to uniquely identify the file. For each file directory in the source file system 102, the metadata for the file directory includes a directory identification and directory attributes for the file directory, and the directory identification for the file directory is used to uniquely identify the file directory.
In some embodiments, the file attributes of the file include one or more of the following: the file content modification time of the file, the file attribute modification time of the file, the access authority of the file, the file type of the file, the directory identification of the file directory where the file is located and the like.
In some embodiments, the directory attributes of the file directory include one or more of the following: the modification time of the file directory, the access rights of the file directory, etc.
In some embodiments, the file identification of the file is information in the form of a numerical value.
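One possible in-memory shape for the metadata described above is sketched below in Python. The class and field names (`FileAttributes`, `FileMetadata`, `DirectoryMetadata`, `content_mtime`, and so on) and the string values used for access rights and file type are assumptions for illustration; the description only lists which kinds of information the attributes may include.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttributes:
    content_mtime: float   # file content modification time
    attr_mtime: float      # file attribute modification time
    access: str            # access rights, e.g. "read-only" (hypothetical encoding)
    file_type: str         # file type (hypothetical encoding)
    parent_dir_id: int     # directory identification of the containing directory


@dataclass(frozen=True)
class FileMetadata:
    file_id: int           # numeric file identification, unique per file
    attrs: FileAttributes


@dataclass(frozen=True)
class DirectoryMetadata:
    dir_id: int            # directory identification, unique per directory
    mtime: float           # modification time of the file directory
    access: str            # access rights of the file directory


def as_metadata_sequence(items):
    """Order file metadata by ascending file identification."""
    return sorted(items, key=lambda m: m.file_id)


sample = [
    FileMetadata(113, FileAttributes(30.0, 30.0, "read-only", "regular", 3)),
    FileMetadata(111, FileAttributes(10.0, 10.0, "read-write", "regular", 3)),
    FileMetadata(112, FileAttributes(20.0, 25.0, "read-only", "regular", 3)),
]
sequence = as_metadata_sequence(sample)
```

Because the file identification is numeric, metadata can be ordered by it, and because the dataclasses compare by value, two attribute records can be checked for equality with `==`.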
It should be noted that: during application execution, an application may modify a file saved in the source file system 102, add a new file to the source file system 102, and/or delete a file in the source file system 102. Wherein an application modifies a file stored in the source file system 102, causing the file to change.
In some embodiments, the application modifies files in the source file system 102 in the following two modes. The application may also modify files in the source file system 102 in other modes, which are not listed here.
In a first modification mode, the application modifies the file content of a file in the source file system 102. That is, for a file stored in the source file system 102, the application can modify the file content of the file.
When the application modifies the file content of the file, the file attribute of the file is also modified, so that both the file content and the file attribute of the file change. For example, when the application modifies the file content of the file, the current time is taken as the modification time, and the file content modification time and the file attribute modification time included in the file attribute of the file are both updated to that modification time.
In the second modification mode, the application modifies the file attribute of the file in the source file system 102, that is, for a certain file stored in the source file system 102, the application may modify the file attribute of the file, so that the file attribute of the file is changed.
When the application modifies the file attribute of the file, the current time is taken as the modification time for modifying the file attribute, and the file attribute modification time included in the file attribute of the file is updated to the modification time. For example, assuming that the access right of a certain file stored in the source file system 102 is "read only", the application modifies the access right "read only" included in the file attribute of the file into "readable and writable", takes the current time as the modification time for modifying the file attribute, and updates the file attribute modification time included in the file attribute of the file to the modification time.
In the above two modification modes, the modified file is a primary file of a certain file directory, and when an application modifies the file, the source file system 102 also modifies the directory attribute of the file directory. For example, the modification time included in the directory attribute of the file directory is updated to the modification time for modifying the file.
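The effect of the two modification modes on the recorded times can be sketched as follows. This Python sketch is illustrative only: the names `FileEntry`, `DirEntry`, `modify_content`, and `modify_access` are hypothetical, times are plain numbers, and the directory update follows the description above rather than the behavior of any particular real file system.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    content: str
    access: str
    content_mtime: float   # file content modification time
    attr_mtime: float      # file attribute modification time


@dataclass
class DirEntry:
    mtime: float                                # directory modification time
    files: dict = field(default_factory=dict)   # file_id -> FileEntry


def modify_content(directory, file_id, new_content, now):
    """First modification mode: content changes, so both times are updated."""
    entry = directory.files[file_id]
    entry.content = new_content
    entry.content_mtime = now
    entry.attr_mtime = now
    directory.mtime = now   # directory attribute is updated as well


def modify_access(directory, file_id, new_access, now):
    """Second modification mode: only the attribute modification time changes."""
    entry = directory.files[file_id]
    entry.access = new_access
    entry.attr_mtime = now
    directory.mtime = now   # directory attribute is updated as well


month1 = DirEntry(mtime=0.0, files={111: FileEntry("v1", "read-only", 0.0, 0.0)})

modify_content(month1, 111, "v2", now=100.0)
after_content = (month1.files[111].content_mtime,
                 month1.files[111].attr_mtime,
                 month1.mtime)

modify_access(month1, 111, "read-write", now=200.0)
after_access = (month1.files[111].content_mtime,
                month1.files[111].attr_mtime,
                month1.mtime)
```

In both modes the file attribute differs from what was recorded before the modification, which is what allows a later metadata comparison to detect the change.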
In some embodiments, the application adds a new file, that is, a new file generated by the application, to the source file system 102 in the following two modes. The application may also add new files to the source file system 102 in other modes, which are not listed here.
In a first addition mode, when the application generates a new file, the application saves the new file to a certain file directory in the source file system 102.
In a first addition, the new file is a primary file of the certain file directory. When the application saves the new file to the certain file directory, the source file system 102 also modifies directory attributes of the certain file directory. For example, the modification time included in the directory attribute of the certain file directory is updated to the save time for saving the new file.
For example, referring to FIG. 4, the application generates a new file 224 and saves the file 224 to the file directory "Month2" in the source file system 102, where the file directory "Month2" is a primary subfile directory of the file directory "Year2". The source file system 102 also updates the modification time included in the directory attribute of the file directory "Month2" to the save time at which the file 224 was saved.
In the second addition mode, when the application generates a new file, a new file directory is added under a certain file directory in the source file system 102, where the file directory may be a root directory or another file directory except the root directory, and the new file is saved in the new file directory.
In a second addition, the new file directory is a primary subfile directory of the certain file directory, and when a new file directory is added under the file directory, the source file system 102 also modifies directory attributes of the file directory. For example, the source file system 102 also updates the modification time included in the directory attributes of the file directory to the time of addition of the new file directory.
For example, referring to FIG. 5, the application generates a new file 231, adds a new file directory "Month3" to the file directory "Year2" shown in FIG. 3, and saves the new file 231 to the new file directory "Month3". The source file system 102 also updates the modification time included in the directory attribute of the file directory "Year2" to the time at which the new file directory "Month3" was added.
In some embodiments, when an application generates a new file, the file identification assigned to the new file is greater than the file identification of each file in the source file system 102.
In some embodiments, the application may also delete a primary file included in a certain file directory in the source file system 102 while running. For example, referring to FIG. 3, the application deletes the file 222 included in the file directory "Month2" in the source file system 102, where the file directory "Month2" is a primary subfile directory of the file directory "Year2". Referring to FIG. 5, after file 222 is deleted, the file directory "Month2" includes file 221 and file 223.
The source file system 102 also modifies the directory attributes of the file directory after the primary file is deleted from the file directory. For example, the modification time included in the directory attribute of the file directory is updated to the deletion time for deleting the primary file.
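The addition and deletion behavior described above can be sketched with a small in-memory model. This Python sketch is hypothetical throughout: the class `MockSourceFS`, its methods, and the path strings are assumptions, and the file identifications it assigns (1, 2, 3, …) are illustrative values, not the reference numerals of the files in the figures. A monotonically increasing counter is one simple way to make every new file identification greater than all existing ones.

```python
class MockSourceFS:
    """In-memory model of directory updates on add and delete (hypothetical)."""

    def __init__(self):
        # path -> {"mtime": float, "files": set of file ids, "subdirs": set of paths}
        self.dirs = {}
        self._next_id = 1   # increasing counter: new ids exceed all earlier ones

    def add_directory(self, path, now, parent=None):
        self.dirs[path] = {"mtime": now, "files": set(), "subdirs": set()}
        if parent is not None:
            self.dirs[parent]["subdirs"].add(path)
            self.dirs[parent]["mtime"] = now   # parent directory attribute changes

    def add_file(self, path, now):
        file_id = self._next_id
        self._next_id += 1
        self.dirs[path]["files"].add(file_id)
        self.dirs[path]["mtime"] = now         # updated to the save time
        return file_id

    def delete_file(self, path, file_id, now):
        self.dirs[path]["files"].remove(file_id)
        self.dirs[path]["mtime"] = now         # updated to the deletion time


fs = MockSourceFS()
fs.add_directory("Year2", now=0.0)
fs.add_directory("Year2/Month2", now=0.0, parent="Year2")
initial_ids = [fs.add_file("Year2/Month2", now=1.0) for _ in range(3)]

# First addition mode: save a new file into an existing directory.
new_id = fs.add_file("Year2/Month2", now=10.0)

# Deletion of a primary file.
fs.delete_file("Year2/Month2", initial_ids[1], now=20.0)

# Second addition mode: add a new directory, then save the new file into it.
fs.add_directory("Year2/Month3", now=30.0, parent="Year2")
month3_id = fs.add_file("Year2/Month3", now=31.0)
```

In this sketch the identification of a deleted file is never reused, so a later file always receives a larger identification than any file that exists or has existed.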
After an application modifies a file in the source filesystem 102 and/or adds a new file in the source filesystem 102, the host 101 needs to migrate the modified file and/or the newly added file to the destination filesystem 103. In the embodiment of the present application, the host 101 may migrate a file through any of the following embodiments.
Referring to fig. 6, an embodiment of the present application provides a method 600 for migrating a file, where the method 600 is applied to the network architecture 100 shown in fig. 1, and the method 600 may be executed by a host 101 in the network architecture 100. The method 600 is used to migrate a changed file and/or a newly added file in a source filesystem to a destination filesystem after each file in the source filesystem is first migrated to the destination filesystem.
In some embodiments, the process for the host to first migrate each file in the source filesystem to the destination filesystem is: the host computer obtains a directory identifier of a root directory of the source file system, obtains a file identifier of each file in the source file system based on the directory identifier of the root directory, and migrates each file from the source file system to the destination file system based on the file identifier of each file.
After the first migration, the host executes the method 600 periodically or when triggered by operation and maintenance personnel. The method 600 includes:
Step 601: the host obtains a handle to each file directory included in the source file system, each file directory including at least one file, the at least one file being located in the source file system.
In step 601, the host first acquires the handle of the root directory, and then acquires, from the source file system and based on the handle of the root directory, the handle of each primary subfile directory included in the root directory. For each file directory whose handle has been acquired, the host acquires, from the source file system and based on that handle, the handle of each primary subfile directory included in the file directory. The host repeats this process until it has obtained the handle of each file directory in the source file system.
In some embodiments, when acquiring the handle of the primary sub-file directory included in the file directory based on the handle of the file directory, if the file directory also includes the primary file of the file directory, the host also acquires the metadata of the primary file included in the file directory based on the handle of the file directory, that is, the operation of this step 601 is performed simultaneously with the operation of acquiring the metadata of the file in the following step 602.
In implementation, step 601 may be carried out through the following operations 6011 to 6012:
6011: the host obtains a handle to the root directory.
In operation 6011, the host obtains a directory identifier of a root directory of a source file system, and sends a query request message to the source file system, where the query request message includes the directory identifier of the root directory. The source file system receives the query request message, acquires a handle of the root directory based on the directory identifier of the root directory included in the query request message, and sends a query response message to the host, wherein the query response message includes the handle of the root directory. The host receives the query response message and extracts the handle of the root directory from the query response message.
In some embodiments, when the host first migrates each file in the source file system to the destination file system, the host obtains and saves the directory identifier of the root directory, so in operation 6011 the host obtains the saved directory identifier of the root directory. Alternatively, the host receives the directory identifier of the root directory input by operation and maintenance personnel.
In some embodiments, the host saves the handle of the root directory after obtaining the handle of the root directory, so that the saved handle of the root directory is directly obtained when the file is migrated to the destination file system next time.
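Operation 6011 together with the saving of the root handle can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the classes `SourceFileSystem` and `Host`, the dictionary-shaped messages, and the values `"root-id"` and `"h-Root"` are all hypothetical stand-ins for the query request and query response messages described above.

```python
class SourceFileSystem:
    """Hypothetical stand-in for the source file system's query handling."""

    def __init__(self, handles_by_directory_id):
        self._handles = dict(handles_by_directory_id)
        self.queries_served = 0

    def handle_query(self, request):
        # Resolve the directory identifier carried in the query request
        # message to a handle and return it in a query response message.
        self.queries_served += 1
        return {"handle": self._handles[request["directory_id"]]}


class Host:
    """Hypothetical host that saves the root handle after the first query."""

    def __init__(self, source_fs, root_directory_id):
        self._source_fs = source_fs
        self._root_directory_id = root_directory_id
        self._saved_root_handle = None

    def root_handle(self):
        # Query the source file system only if no handle has been saved yet.
        if self._saved_root_handle is None:
            response = self._source_fs.handle_query(
                {"directory_id": self._root_directory_id}
            )
            self._saved_root_handle = response["handle"]
        return self._saved_root_handle


source_fs = SourceFileSystem({"root-id": "h-Root"})
host = Host(source_fs, "root-id")
first_lookup = host.root_handle()
second_lookup = host.root_handle()  # served from the saved handle
```

In this sketch the second call does not send another query, which mirrors the embodiment in which the saved handle is reused at the next migration.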
In some embodiments, the communication protocol used by the host and the source file system is a Remote Procedure Call (RPC) protocol, and the query request message may be a Lookup message defined by the RPC protocol.
6012: the host acquires, from the source file system and based on the handle of a first file directory, the handle of each primary subfile directory included in the first file directory, where the first file directory is a file directory whose handle the host has already acquired.
The first file directory may be a root directory of the source file system or may be a file directory other than the root directory.
It should be noted that the first file directory includes a primary subfile directory and/or a primary file. In the case where the first file directory includes a primary subfile directory but does not include a primary file, the handle of the primary subfile directory included in the first file directory is acquired based on the handle of the first file directory.
In the case that the first file directory includes the primary subfile directory and the primary file, based on the handle of the first file directory, the handle of the primary subfile directory and the metadata of the primary file included in the first file directory may be simultaneously obtained, that is, the operations of step 601 and the following step 602 are simultaneously performed; alternatively, the handle of the primary sub-file directory included in the first file directory may be obtained first, and then the operation of step 602 is performed, that is, the metadata of the primary file included in the first file directory is obtained based on the handle of the first file directory.
In the case where the first file directory does not include a primary subfile directory but includes a primary file, the operation of step 602 is performed, in which the host obtains the metadata of the primary file included in the first file directory based on the handle of the first file directory.
In some embodiments, the host, in obtaining the handle to the primary subfile directory comprised by the first file directory, also obtains metadata for the primary subfile directory comprised by the first file directory.
In 6012, the host may obtain the handle of each primary subfile directory included in the first file directory as follows:
The host sends a first read request message to the source file system, the first read request message including the handle of the first file directory. The source file system receives the first read request message, acquires the handle of each primary subfile directory included in the first file directory based on the handle of the first file directory, and sends a first read response message to the host, the first read response message including the handle of each primary subfile directory. The host receives the first read response message sent by the source file system and extracts from it the handle of each primary subfile directory included in the first file directory.
In some embodiments, the first file directory may include primary files, the source file system may be capable of obtaining metadata for each primary file in the first file directory based on a handle to the first file directory, and the first read response message further includes the metadata for each primary file. In this case, the operations of step 601 and the following step 602 are performed simultaneously, the first read request message in step 601 and the following second read request message in step 602 are the same message, and the first read response message in step 601 and the following second read response message in step 602 are the same message. Of course, the metadata of each primary file included in the first file directory may not be acquired first, i.e., the operations of step 601 and step 602 below are not performed simultaneously.
In some embodiments, the source file system further obtains metadata for each of the primary subfile directories comprised by the first file directory, and the first read response message further comprises the metadata for each of the primary subfile directories.
In some embodiments, the first read request message may be a Readdirplus message defined for the RPC protocol, or the like.
The host may repeatedly perform 6012, obtain a handle to each file directory in the source file system, or further obtain metadata of each file directory in the source file system.
For example, taking the multi-level file directory in the source file system shown in fig. 5 as an example, the host sends a first read request message to the source file system, the first read request message including the handle of the root directory "Root". The source file system receives the first read request message, acquires the handle of the primary subfile directory "Year1" and the handle of the primary subfile directory "Year2" included in the root directory "Root" based on the handle of the root directory "Root" included in the first read request message, and sends a first read response message to the host, the first read response message including the handle of the primary subfile directory "Year1" and the handle of the primary subfile directory "Year2". The host receives the first read response message and extracts from it the handle of the primary subfile directory "Year1" and the handle of the primary subfile directory "Year2".
For the file directory "Year1", the host sends a first read request message to the source file system, the first read request message including the handle of the file directory "Year1". The source file system receives the first read request message, acquires the handle of the primary subfile directory "Month1" and the handle of the primary subfile directory "Month2" included in the file directory "Year1" based on the handle of the file directory "Year1" included in the first read request message, and sends a first read response message to the host, the first read response message including the handle of the primary subfile directory "Month1" and the handle of the primary subfile directory "Month2". The host receives the first read response message and extracts from it the handle of the primary subfile directory "Month1" and the handle of the primary subfile directory "Month2". The host repeats the above process to obtain the handle of every other file directory in the source file system.
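The repeated execution of 6012 amounts to a traversal of the directory tree by handle. The following is a minimal sketch of that traversal, assuming an in-memory dictionary in place of the first read request and response exchange. The handle strings and the function names are hypothetical, and the placement of the "Month" directories under "Year1" and "Year2" is an assumed reading of fig. 5.

```python
from collections import deque

# Hypothetical directory tree: each directory handle maps to the handles of
# its primary subfile directories (layout assumed from fig. 5).
CHILD_DIRECTORY_HANDLES = {
    "h-Root": ["h-Year1", "h-Year2"],
    "h-Year1": ["h-Year1-Month1", "h-Year1-Month2"],
    "h-Year2": ["h-Year2-Month1", "h-Year2-Month2", "h-Year2-Month3"],
}


def read_subdirectory_handles(directory_handle):
    """Stand-in for one first read request / first read response exchange."""
    return CHILD_DIRECTORY_HANDLES.get(directory_handle, [])


def collect_directory_handles(root_handle):
    """Repeat operation 6012 until every directory handle has been acquired."""
    acquired = [root_handle]
    pending = deque([root_handle])
    while pending:
        current = pending.popleft()
        for child in read_subdirectory_handles(current):
            acquired.append(child)
            pending.append(child)
    return acquired


all_handles = collect_directory_handles("h-Root")
```

The sketch visits directories breadth first; the description does not prescribe a visiting order, so a depth-first traversal would serve equally well.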
It should be noted that if the host first obtains the handle of each file directory in the source file system and only then obtains the metadata of the primary files included in the file directories, the host performs the following step 602 after completing step 601. If the host acquires the handles of the primary subfile directories and the metadata of the primary files included in a file directory at the same time, so that it has acquired the metadata of each file in the source file system by the time it has acquired the handle of each file directory, the host directly performs the following step 603.
Step 602: the host obtains metadata for each file in the source file system based on the handle for each file directory.
In step 602, the host sends a second read request message to the source file system, the second read request message including the handle of a second file directory, the second file directory being any file directory in the source file system. The source file system receives the second read request message, acquires the metadata of each primary file included in the second file directory based on the handle of the second file directory included in the second read request message, and sends a second read response message to the host, the second read response message including the metadata of each primary file. The host receives the second read response message and extracts the metadata of each primary file from it. The host repeats the above operations to obtain the metadata of each file in the source file system.
For example, taking the multi-level file directory in the source file system shown in fig. 5 as an example, the host sends a second read request message to the source file system, where the second read request message includes the handle of the file directory "Month1", and the file directory "Month1" is a primary subfile directory of the file directory "Year1". The source file system receives the second read request message, acquires the metadata of the file 111, the metadata of the file 112, and the metadata of the file 113 included in the file directory "Month1" based on the handle of the file directory "Month1" included in the second read request message, and sends a second read response message to the host, where the second read response message includes the metadata of the file 111, the metadata of the file 112, and the metadata of the file 113. The host receives the second read response message and extracts from it the metadata of the file 111, the metadata of the file 112, and the metadata of the file 113 included in the file directory "Month1". In the same manner, the host also obtains the metadata of each primary file included in the other file directories.
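Step 602 can be sketched as one read per directory handle, with the results accumulated on the host. The snippet below is illustrative only: the dictionary `PRIMARY_FILES` and both function names are hypothetical, metadata is modelled as a `(file identifier, file attributes)` pair, and only files whose attribute values are quoted in this description are included.

```python
# Hypothetical per-directory contents: directory handle -> metadata of the
# primary files in that directory.
PRIMARY_FILES = {
    "h-Year1-Month1": [(111, "T11, read only"), (112, "T2, read only")],
    "h-Year2-Month2": [(223, "T12, readable and writable")],
    "h-Year2-Month3": [(231, "T13, readable and writable")],
}


def read_primary_file_metadata(directory_handle):
    """Stand-in for one second read request / second read response exchange."""
    return list(PRIMARY_FILES.get(directory_handle, []))


def collect_file_metadata(directory_handles):
    """Gather the metadata of every primary file under the given handles."""
    collected = []
    for handle in directory_handles:
        collected.extend(read_primary_file_metadata(handle))
    return collected


metadata = collect_file_metadata(
    ["h-Root", "h-Year1-Month1", "h-Year2-Month2", "h-Year2-Month3"]
)
```

A directory with no primary files, such as the hypothetical "h-Root" here, simply contributes nothing to the result.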
It should be noted that, when the handle of each primary subfile directory and/or the metadata of each primary file included in a file directory is acquired based on the handle of the file directory, there is no need to search level by level starting from the root directory of the source file system. The file directory can be located in the source file system directly based on its handle, and the handles of the primary subfile directories and/or the metadata of the primary files included in the file directory can be obtained from the file directory in a batch. The metadata of the files can therefore be acquired quickly, which improves the efficiency of acquiring file metadata.
In some embodiments, the second read request message may be a Readdirplus message defined for the RPC protocol, or the like.
In step 602, the metadata of each file in the currently acquired source file system is organized into a first metadata sequence, and the metadata of each file in the first metadata sequence is arranged according to the file identification order of each file.
In some embodiments, the metadata of each file in the first metadata sequence is arranged in ascending order of file identifier, or the metadata of each file in the first metadata sequence is arranged in descending order of file identifier.
For example, taking the multilevel file directory shown in fig. 5 as an example, assume that the source file system includes each file of the multilevel file directory shown in fig. 5, that is, includes twelve files including files 111, 112, 113, 121, 122, 123, 211, 212, 213, 221, 223, and 231. The host obtains the metadata of the twelve files, which constitute the first metadata sequence shown in table 1 below.
The metadata of the twelve files in the first metadata sequence is arranged in the order of the file identifications of the twelve files from small to large, see table 1 below, or the metadata of the twelve files in the first metadata sequence is arranged in the order of the file identifications of the twelve files from large to small, see table 2 below.
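TABLE 1
[Table image not reproduced: first metadata sequence, arranged in ascending order of file identifier]
TABLE 2
[Table image not reproduced: first metadata sequence, arranged in descending order of file identifier]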
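Organizing the acquired metadata into the first metadata sequence is a sort on the file identifier. The following is a minimal sketch, assuming metadata is held as `(file identifier, file attributes)` pairs; the function name `build_metadata_sequence` is hypothetical, and only a subset of the files of fig. 5 is used because the full table contents are not reproduced here.

```python
def build_metadata_sequence(records, descending=False):
    """Order (file_id, attributes) records by file identifier."""
    return sorted(records, key=lambda record: record[0], reverse=descending)


# Metadata in the order it happened to be collected from the directories.
collected = [
    (231, "T13, readable and writable"),
    (111, "T11, read only"),
    (223, "T12, readable and writable"),
    (112, "T2, read only"),
]

ascending_sequence = build_metadata_sequence(collected)
descending_sequence = build_metadata_sequence(collected, descending=True)
```

Whichever direction is chosen, the same direction has to be used for the sequence saved at the previous migration, since the later comparison relies on both sequences sharing one order.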
The host also stores a second metadata sequence, which is the metadata sequence obtained at the first time; that is, the second metadata sequence includes the metadata of each file in the source file system at the first time. The metadata of each file in the second metadata sequence is arranged according to the order of the file identifiers, and the file identifiers are ordered in the second metadata sequence in the same way as in the first metadata sequence.
In some embodiments, the metadata of each file in the first metadata sequence is arranged in the order of the file identifier of each file from small to large, and the metadata of each file in the second metadata sequence is also arranged in the order of the file identifier of each file from small to large. Or, the metadata of each file in the first metadata sequence is arranged according to the order of the file identifier of each file from large to small, and the metadata of each file in the second metadata sequence is also arranged according to the order of the file identifier of each file from large to small.
For example, assume at a first time that the source file system includes a multi-level file directory as shown in FIG. 3, the source file system includes each file of the multi-level file directory shown in FIG. 3, i.e., including files 111, 112, 113, 121, 122, 123, 211, 212, 213, 221, 222, and 223, for a total of twelve files. The host obtains the metadata for the twelve files at a first time and composes the metadata for the twelve files into a second sequence of metadata at the first time as shown in table 3 below.
Referring to table 3 below, the metadata of the twelve files in the second metadata sequence are arranged in the order of the file identifications of the twelve files from small to large, or, referring to table 4 below, the metadata of the twelve files in the second metadata sequence are arranged in the order of the file identifications of the twelve files from large to small.
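TABLE 3
[Table image not reproduced: second metadata sequence, arranged in ascending order of file identifier]
TABLE 4
[Table image not reproduced: second metadata sequence, arranged in descending order of file identifier]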
Step 603: the host migrates a first file to the destination file system based on the metadata of each file, where the first file is a file that was changed or newly added on the source file system after a first time, and the first time is the time at which files were last migrated to the destination file system.
In step 603, a file identifier of the first file is obtained based on the first metadata sequence and the second metadata sequence. The first file is migrated from the source filesystem to the destination filesystem based on the file identification of the first file.
In implementation, the file identifier of the first file may be obtained through the following operations 6031 to 6038:
6031: Acquire the first metadata indicated by the first indication information and the second metadata indicated by the second indication information.
When execution of step 603 starts, the first indication information is set to indicate the first piece of metadata in the first metadata sequence, the second indication information is set to indicate the first piece of metadata in the second metadata sequence, and then 6031 is executed.
In some embodiments, the first indication information and the second indication information are two different pointers.
For example, a record having a sequence number of 1 in the first metadata sequence shown in table 1 or table 2 is the first metadata in the first metadata sequence, and a record having a sequence number of 1 in the second metadata sequence shown in table 3 or table 4 is the first metadata in the second metadata sequence. The following examples 1 and 2 are listed next.
Example 1: the host sets the first indication information to indicate the first piece of metadata in the first metadata sequence shown in table 1, and sets the second indication information to indicate the first piece of metadata in the second metadata sequence shown in table 3. The host acquires the first metadata indicated by the first indication information, which includes the file identifier "111" and the file attributes "T11, read only", and acquires the second metadata indicated by the second indication information, which includes the file identifier "111" and the file attributes "T1, read only".
Example 2: the host sets the first indication information to indicate the first piece of metadata in the first metadata sequence shown in table 2, and sets the second indication information to indicate the first piece of metadata in the second metadata sequence shown in table 4. The host acquires the first metadata indicated by the first indication information, which includes the file identifier "231" and the file attributes "T13, readable and writable", and acquires the second metadata indicated by the second indication information, which includes the file identifier "223" and the file attributes "T12, readable and writable".
6032: the file identification included in the first metadata and the file identification included in the second metadata are compared, and 6033 is performed if the two are the same, and 6036 is performed if the two are different.
In example 1, the file identifier "111" included in the first metadata and the file identifier "111" included in the second metadata are compared and found to be the same, so 6033 is performed.
In example 2, the file identifier "231" included in the first metadata and the file identifier "223" included in the second metadata are compared and found to be different, so 6036 is performed.
6033: the file attribute included in the first metadata and the file attribute included in the second metadata are compared, and if the two are different, 6034 is executed, and if the two are the same, 6035 is executed.
In example 1, the file attributes "T11, read only" included in the first metadata and the file attributes "T1, read only" included in the second metadata are compared and found to be different, so 6034 is performed. The first metadata indicates that the file content and the file attributes of the file corresponding to the file identifier "111" were modified at time T11.
6034: Determine that the file identifier included in the first metadata is the file identifier of the first file, where the file corresponding to the file identifier included in the first metadata is a file that changed on the source file system after the first time. Then 6035 is executed.
In example 1, the first metadata includes a file identifier "111" as a file identifier of the first file, and the file corresponding to the file identifier "111" is a file that changes on the source file system after the first time.
6035: Set the first indication information to indicate the next metadata after the first metadata in the first metadata sequence, set the second indication information to indicate the next metadata after the second metadata in the second metadata sequence, and return to 6031.
In example 1, the first indication information is set to indicate the next metadata after the first metadata in the first metadata sequence shown in table 1, i.e., the record with sequence number 2 in table 1, the second indication information is set to indicate the next metadata after the second metadata in the second metadata sequence shown in table 3, i.e., the record with sequence number 2 in table 3, and execution returns to 6031.
That is, the host continues by acquiring the first metadata indicated by the first indication information, which is the record with sequence number 2 in the first metadata sequence shown in table 1 and includes the file identifier "112" and the file attributes "T2, read only". The host also acquires the second metadata indicated by the second indication information, which is the record with sequence number 2 in the second metadata sequence shown in table 3 and includes the file identifier "112" and the file attributes "T2, read only".
The file identifier "112" included in the first metadata and the file identifier "112" included in the second metadata are compared and found to be the same. Then the file attributes "T2, read only" included in the first metadata and the file attributes "T2, read only" included in the second metadata are compared and found to be the same. The first indication information is then set to indicate the next metadata after the first metadata in the first metadata sequence shown in table 1, i.e., the record with sequence number 3 in table 1, the second indication information is set to indicate the next metadata after the second metadata in the second metadata sequence shown in table 3, i.e., the record with sequence number 3 in table 3, and execution returns to 6031.
The above process is repeated until the first metadata indicated by the first indication information is the record with sequence number 11 in table 1 and the second metadata indicated by the second indication information is the record with sequence number 11 in table 3; that is, the first metadata includes the file identifier "223" and the file attributes "T12, readable and writable", and the second metadata includes the file identifier "222" and the file attributes "T11, readable and writable".
The file identifier "223" included in the first metadata and the file identifier "222" included in the second metadata are compared and found to be different, so 6036 is performed.
6036: Determine, based on the size relationship between the file identifier included in the first metadata and the file identifier included in the second metadata, whether a file has been newly added or deleted. If the file corresponding to the file identifier included in the first metadata is a newly added file, execute 6037; if the file corresponding to the file identifier included in the second metadata is a deleted file, execute 6038.
If the metadata of each file in the first metadata sequence is arranged in the order of the file identifier of each file from small to large, in 6036, when the file identifier included in the first metadata is smaller than the file identifier included in the second metadata, it is determined that the file identifier included in the first metadata is the file identifier of the first file, and the file corresponding to the file identifier in the first metadata is a file newly added in the source file system after the first time, and 6037 is executed. When the file identification included in the first metadata is larger than the file identification included in the second metadata, it is determined that the file corresponding to the file identification in the second metadata is the file deleted in the source file system after the first time, and 6038 is performed.
For example, in example 1, the metadata of each file in the first metadata sequence is arranged in the order of the file identifier of each file from small to large, the first metadata is a record with a sequence number of 11 shown in table 1 above, the second metadata is a record with a sequence number of 11 shown in table 3 above, the file identifier "223" included in the first metadata is greater than the file identifier "222" included in the second metadata, it is determined that the file 222 corresponding to the file identifier "222" in the second metadata is a file deleted in the source file system after the first time, and 6038 is performed.
If the metadata of each file in the first metadata sequence is arranged in the order of the file identifier of each file from large to small, in 6036, when the file identifier included in the first metadata is larger than the file identifier included in the second metadata, it is determined that the file identifier included in the first metadata is the file identifier of the first file, and the file corresponding to the file identifier in the first metadata is a file newly added in the source file system after the first time, and 6037 is executed. When the file identification included in the first metadata is smaller than the file identification included in the second metadata, it is determined that the file corresponding to the file identification in the second metadata is a file deleted in the source file system after the first time, and 6038 is performed.
For example, in example 2, the metadata of each file in the first metadata sequence is arranged in descending order of file identifier, and the file identifier "231" included in the first metadata is larger than the file identifier "223" included in the second metadata, so the file 231 corresponding to the file identifier "231" included in the first metadata is a newly added file, and 6037 is performed.
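The decision made in 6036 can be written as a small function. This is a sketch under assumptions: the function name `classify_mismatch` and its return format are hypothetical, and the function presumes that 6032 has already found the two file identifiers to be different.

```python
def classify_mismatch(first_id, second_id, descending=False):
    """Decide between 6037 (newly added) and 6038 (deleted).

    first_id comes from the first (current) metadata sequence and second_id
    from the second (previous) metadata sequence.
    """
    if first_id == second_id:
        raise ValueError("identifiers match; compare file attributes instead")
    # The identifier that sorts earlier in the shared order is the one that is
    # missing from the other sequence.
    if descending:
        first_sorts_earlier = first_id > second_id
    else:
        first_sorts_earlier = first_id < second_id
    if first_sorts_earlier:
        return ("added", first_id)      # continue with 6037
    return ("deleted", second_id)       # continue with 6038


ascending_case = classify_mismatch(223, 222)                    # example 1
descending_case = classify_mismatch(231, 223, descending=True)  # example 2
```

The two calls reproduce the outcomes described for example 1 (file 222 deleted) and example 2 (file 231 newly added).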
6037: Determine that the file identifier included in the first metadata is the file identifier of the first file, where the file corresponding to that file identifier is a file newly added in the source file system after the first time; set the first indication information to indicate the next metadata after the first metadata in the first metadata sequence, and return to 6031.
The metadata indicated by the second indication information remains unchanged in 6037.
For example, in example 2, it is determined that the file 231 corresponding to the file identification "231" included in the first metadata is a file newly added in the source file system after the first time. Setting the first indication information in the first metadata sequence as shown in table 2 indicates the next metadata after the first metadata, i.e. indicates the record with sequence number 2 in table 2, while the second metadata indicated by the second indication information remains unchanged, i.e. the second metadata still includes the file identification "223" and the file attributes "T12, readable and writable". And acquiring first metadata indicated by the first indication information, wherein the first metadata comprises a file identifier '223' and file attributes 'T12, readable and writable'.
The file identification "223" included in the first metadata and the file identification "223" included in the second metadata are compared, and the two are compared to be the same. Then, the file attributes "T12, readable and writable" included in the first metadata and the file attributes "T12, readable and writable" included in the second metadata are compared, and the two are compared to be the same. Then, the first indication information is set in the first metadata sequence as shown in table 2 to indicate the next metadata after the first metadata, i.e., to indicate the record with sequence number 3 in table 2, and the second indication information is set in the second metadata sequence as shown in table 4 to indicate the next metadata after the second metadata, i.e., to indicate the record with sequence number 2 in table 4.
The host acquires first metadata indicated by the first indication information, wherein the first metadata comprises a file identifier '221' and file attributes 'T10, read only'. And acquiring second metadata indicated by the second indication information, wherein the second metadata comprises a file identifier '222' and file attributes 'T11, readable and writable'. Comparing the file identifier '221' included in the first metadata with the file identifier '222' included in the second metadata, determining that the file 222 corresponding to the file identifier '222' included in the second metadata is a file deleted in the source file system after the first time. Setting the second indication information in the second metadata sequence as shown in table 4 to indicate the next metadata in the second metadata, i.e., the metadata indicating the sequence number of 3, returns to execute 6031, and repeats the above process until the first indication information indicates the record of the sequence number of 12 in table 2 and the second indication information indicates the record of the sequence number of 12 in table 4.
The host acquires the first metadata indicated by the first indication information, where the first metadata includes the file identifier "111" and the file attributes "T11, read only", and acquires the second metadata indicated by the second indication information, where the second metadata includes the file identifier "111" and the file attributes "T1, read only". The file identifier "111" included in the first metadata is compared with the file identifier "111" included in the second metadata, and the two are the same. The file attributes "T11, read only" included in the first metadata are then compared with the file attributes "T1, read only" included in the second metadata, and the two are different. The file 111 corresponding to the file identifier "111" included in the first metadata is therefore determined to be a file that has changed after the first time.
6038: Determine that the file corresponding to the file identifier in the second metadata is a file deleted in the source file system after the first time, set the second indication information in the second metadata sequence to indicate the next metadata after the second metadata, and return to 6031.
The metadata indicated by the first indication information remains unchanged in 6037.
In the case that the metadata of each file in the first metadata sequence is arranged in ascending order of file identifier, and the metadata of each file in the second metadata sequence is also arranged in ascending order of file identifier, the remaining metadata is handled as follows. When the first metadata indicated by the first indication information is not the last metadata in the first metadata sequence and the metadata indicated by the second indication information is the last metadata in the second metadata sequence, each metadata after the metadata indicated by the first indication information in the first metadata sequence is acquired, and the file corresponding to the file identifier included in each acquired metadata is taken as a file newly added in the source file system after the first time. When the first metadata indicated by the first indication information is the last metadata in the first metadata sequence and the metadata indicated by the second indication information is not the last metadata in the second metadata sequence, each metadata after the metadata indicated by the second indication information in the second metadata sequence is acquired, and the file corresponding to the file identifier included in each acquired metadata is taken as a file deleted in the source file system after the first time.
For example, in example 1, the file 222 corresponding to the file identifier "222" included in the second metadata is determined to be a file deleted in the source file system after the first time. The metadata indicated by the first indication information remains unchanged, i.e., the first indication information still indicates the record with sequence number 11 in the first metadata sequence shown in table 1. The second indication information is set, in the second metadata sequence shown in table 3, to indicate the next metadata after the second metadata, i.e., the record with sequence number 12 in table 3. The host acquires the second metadata indicated by the second indication information, where the second metadata includes the file identifier "223" and the file attributes "T12, readable and writable"; the first metadata likewise includes the file identifier "223" and the file attributes "T12, readable and writable".
The comparison shows that the file identifier "223" included in the first metadata is the same as the file identifier "223" included in the second metadata, and that the file attributes "T12, readable and writable" included in the first metadata are the same as the file attributes "T12, readable and writable" included in the second metadata. At this point, in the first metadata sequence shown in table 1, the first metadata indicated by the first indication information is the record with sequence number 11, while the second indication information already indicates the last metadata in the second metadata sequence. The host therefore acquires each metadata after the metadata indicated by the first indication information in the first metadata sequence. The only metadata acquired is the record with sequence number 12 in table 1, which is the last metadata in the first metadata sequence shown in table 1 and includes the file identifier "231" and the file attributes "T13, readable and writable". Therefore, the file 231 corresponding to the file identifier "231" included in this metadata is a file newly added in the source file system after the first time.
In the case that the metadata of each file in the first metadata sequence is arranged in descending order of file identifier, and the metadata of each file in the second metadata sequence is also arranged in descending order of file identifier, when the first metadata indicated by the first indication information is the last metadata in the first metadata sequence and the metadata indicated by the second indication information is not the last metadata in the second metadata sequence, each metadata after the metadata indicated by the second indication information in the second metadata sequence is acquired, and the file corresponding to the file identifier included in each acquired metadata is taken as a file deleted in the source file system after the first time.
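The comparison loop and the handling of the remaining records described above can be sketched as a two-pointer pass over the two sorted sequences. This is only an illustrative sketch, not the implementation of the present application: the names `Meta` and `diff_sequences` are hypothetical, file identifiers are assumed to be strings that compare consistently with the sort order of the sequences, and the sample records are shortened, made-up values loosely modelled on example 2 rather than the full tables.

```python
from typing import List, NamedTuple, Tuple


class Meta(NamedTuple):
    file_id: str            # file identifier, e.g. "223"
    attrs: Tuple[str, ...]  # file attributes, e.g. ("T12", "readable and writable")


def diff_sequences(first: List[Meta], second: List[Meta], descending: bool = False):
    """Compare the current (first) and saved (second) metadata sequences.

    Both sequences must be sorted by file_id in the same direction.
    Returns three lists of file identifiers: (added, changed, deleted).
    """
    added, changed, deleted = [], [], []
    i = j = 0  # play the role of the first and second indication information
    while i < len(first) and j < len(second):
        a, b = first[i], second[j]
        if a.file_id == b.file_id:
            if a.attrs != b.attrs:
                changed.append(a.file_id)  # same file, different attributes
            i += 1
            j += 1
        elif (a.file_id < b.file_id) != descending:
            # a sorts before b, so a has no counterpart in the saved sequence
            added.append(a.file_id)
            i += 1
        else:
            # b sorts before a, so b no longer exists in the source
            deleted.append(b.file_id)
            j += 1
    # Whatever remains in one sequence has no counterpart in the other.
    added.extend(m.file_id for m in first[i:])
    deleted.extend(m.file_id for m in second[j:])
    return added, changed, deleted


# Hypothetical, shortened records in descending order of file identifier.
current = [
    Meta("231", ("T13", "readable and writable")),
    Meta("223", ("T12", "readable and writable")),
    Meta("221", ("T10", "read only")),
    Meta("111", ("T11", "read only")),
]
saved = [
    Meta("223", ("T12", "readable and writable")),
    Meta("222", ("T11", "readable and writable")),
    Meta("221", ("T10", "read only")),
    Meta("111", ("T1", "read only")),
]
added, changed, deleted = diff_sequences(current, saved, descending=True)
```

In this sketch the files to migrate (the "first file" of the method) would be the union of `added` and `changed`. Because each record is visited once, the pass is linear in the combined length of the two sequences.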
In step 603, the host acquires the first file from the source file system based on the file identifier of the first file, and stores the first file in the destination file system, so as to migrate the first file from the source file system to the destination file system.
In some embodiments, the host also updates the saved second metadata sequence to the first metadata sequence.
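One way the saved second metadata sequence might be replaced by the first metadata sequence is sketched below. The description does not say where the host keeps the saved sequence; persisting it as a JSON file, and the helper names `save_snapshot` and `load_snapshot`, are assumptions made purely for illustration.

```python
import json
import os
import tempfile


def save_snapshot(path, sequence):
    """Replace the saved (second) sequence with the current (first) one.

    The data is written to a temporary file in the same directory and then
    renamed over the old file, so an interrupted run leaves the old
    snapshot intact.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(sequence, handle)
    os.replace(tmp_path, path)


def load_snapshot(path):
    """Return the saved sequence, or an empty list before the first migration."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

With an empty saved sequence, every file in the current sequence appears as newly added, which corresponds to a first, full migration.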
In some embodiments, the file attribute of the first file includes a directory identifier of a file directory in which the first file is located, and the host may save the first file to the file directory in the destination file system based on the directory identifier of the file directory in which the first file is located.
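Saving a file into the matching directory of the destination file system can be sketched as follows. The sketch assumes, only for illustration, that both file systems are mounted locally and that the directory identifier can be used as a relative path under each root; the function name `migrate_file` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def migrate_file(src_root, dst_root, directory_id, file_name):
    """Copy one file from the source root to the same directory under the destination root."""
    source = Path(src_root) / directory_id / file_name
    target_dir = Path(dst_root) / directory_id
    # Create the destination directory if an earlier migration has not done so.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 copies the file contents and also tries to preserve file metadata.
    return Path(shutil.copy2(source, target_dir / file_name))
```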
In some embodiments, the host also retrieves metadata for each file directory currently in the source file system, and the host also retrieves metadata for each file directory in the source file system at the first time. The host acquires a third file directory based on the metadata of each current file directory of the source file system and the metadata of each file directory in the source file system at a first time, wherein the metadata of the third file directory at the current time is different from the metadata of the third file directory at the first time, namely the metadata of the third file directory is modified after the first time. The host may also update the metadata of a third file directory in the destination file system to the metadata of the current third file directory.
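The directory comparison described above can be sketched with two mappings from directory identifier to directory metadata. The mapping layout and the names `changed_directories` and `apply_directory_updates` are hypothetical; how directories that are newly created or deleted after the first time are treated is not stated here, so the sketch leaves them out.

```python
from typing import Dict, List


def changed_directories(current: Dict[str, dict], previous: Dict[str, dict]) -> List[str]:
    """Return identifiers of directories whose metadata differs from the first time.

    Only directories present in both snapshots are considered.
    """
    return sorted(
        directory_id
        for directory_id, metadata in current.items()
        if directory_id in previous and previous[directory_id] != metadata
    )


def apply_directory_updates(destination: Dict[str, dict],
                            current: Dict[str, dict],
                            changed: List[str]) -> None:
    """Overwrite the destination's directory metadata with the current source metadata."""
    for directory_id in changed:
        destination[directory_id] = dict(current[directory_id])
```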
When the host needs to migrate the file again, the operations of the steps 601-603 are repeatedly executed.
In the embodiment of the present application, the host obtains a handle of each file directory in the source file system, and obtains metadata of each file in the source file system based on the handle of each file directory. For the handle of any file directory, the host locates the file directory directly based on the handle and obtains the metadata of each first-level file in the file directory in batches. The host therefore does not need to exchange POSIX interface call messages multiple times for each file to obtain its metadata, which improves the efficiency of obtaining file metadata and, in turn, the efficiency of migrating files. The host also stores the second metadata sequence acquired at the first time, so the first file, i.e., a file that has changed or been newly added in the source file system after the first time, is identified directly from the currently acquired first metadata sequence and the stored second metadata sequence. The host therefore does not need to acquire the second metadata sequence from the destination file system, which further improves the file migration efficiency.
Referring to fig. 7, an embodiment of the present application provides an apparatus 700 for migrating a file, where the apparatus 700 is deployed in a host 101 of the network architecture 100 shown in fig. 1 or a host of the method 600 shown in fig. 6, and the apparatus 700 includes:
a processing unit 701, configured to obtain a handle of each file directory included in a source file system, where each file directory includes at least one file, and the at least one file is located in the source file system;
the processing unit 701 is further configured to obtain metadata of each file in the source file system based on the handle of each file directory;
the processing unit 701 is further configured to migrate, to the destination file system, a first file based on the metadata of each file, where the first file is a file that changes or is newly added to the source file system after a first time, and the first time is a time when the file was migrated to the destination file system last time.
Optionally, the detailed process of the processing unit 701 obtaining the handle of each file directory is described in reference to the related content in step 601 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the detailed process of the processing unit 701 for obtaining the metadata of each file is described in reference to the related content in step 602 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the detailed process of migrating the file by the processing unit 701 is shown in step 603 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the apparatus 700 further includes: a sending unit 702 and a receiving unit 703,
a sending unit 702, configured to send an inquiry request message to a source file system, where the inquiry request message includes a directory identifier of a root directory of the source file system;
the receiving unit 703 is configured to receive an inquiry response message sent by the source file system, where the inquiry response message includes a handle of the root directory.
Wherein, the handle of the root directory is obtained by the source file system based on the directory identification.
Optionally, for the detailed process of the sending unit 702 sending the query request message, see the relevant content in step 6011 of the method 600 shown in fig. 6, which is not described in detail here.
Optionally, the detailed process of the receiving unit 703 receiving the query response message refers to relevant contents in step 6011 of the method 600 shown in fig. 6, and is not described in detail here.
Optionally, the sending unit 702 is further configured to send a first read request message to the source file system, where the first read request message includes a handle of a first file directory, the first file directory is a file directory where the handle is obtained, and the first file directory is a file directory in the source file system;
the receiving unit 703 is further configured to receive a first read response message sent by the source file system, where the first read response message includes a handle of a primary subfile directory in the first file directory, and there is no other file directory between the first file directory and the primary subfile directory.
Wherein the handle of the primary subfile directory is obtained by the source file system based on the handle of the first file directory.
Optionally, for the detailed process of the sending unit 702 sending the first read request message, see the relevant content in step 6012 of the method 600 shown in fig. 6, which is not described in detail here.
Optionally, for the detailed process of the receiving unit 703 receiving the first read response message, see the relevant content in step 6012 of the method 600 shown in fig. 6, which is not described in detail here.
Optionally, the sending unit 702 is further configured to send a second read request message to the source file system, where the second read request message includes a handle of a second file directory, and the second file directory is a file directory in the source file system;
the receiving unit 703 is further configured to receive a second read response message sent by the source file system, where the second read response message includes metadata of a primary file in a second file directory, and there is no other file directory between the second file directory and the primary file.
Wherein, the metadata of the primary file is obtained by the source file system based on the handle of the second file directory.
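The three message exchanges described for the sending unit 702 and the receiving unit 703 can be sketched with an in-memory stand-in for the source file system. Everything in this sketch is hypothetical: the class `MockSourceFS`, its method names, the handle format and the sample tree are illustrative assumptions and do not reflect the actual message formats of the present application.

```python
from collections import deque


class MockSourceFS:
    """In-memory stand-in for the source file system (illustration only)."""

    def __init__(self, tree):
        # tree maps a directory identifier to
        # {"dirs": [child directory identifiers], "files": [(file_id, attrs), ...]}
        self._tree = tree
        self._by_handle = {"h:" + d: d for d in tree}

    def query(self, directory_id):
        """Query request/response: directory identifier -> handle."""
        return "h:" + directory_id

    def read_subdir_handles(self, handle):
        """First read request/response: handles of the first-level subdirectories."""
        return ["h:" + child for child in self._tree[self._by_handle[handle]]["dirs"]]

    def read_file_metadata(self, handle):
        """Second read request/response: metadata of all first-level files, in one batch."""
        return list(self._tree[self._by_handle[handle]]["files"])


def collect_metadata(fs, root_id):
    """Gather the handle of every directory, then batch-read file metadata per directory."""
    root = fs.query(root_id)
    handles, pending = [root], deque([root])
    while pending:  # breadth-first walk over directories whose handle is known
        children = fs.read_subdir_handles(pending.popleft())
        handles.extend(children)
        pending.extend(children)
    metadata = []
    for handle in handles:
        metadata.extend(fs.read_file_metadata(handle))
    # Arrange the result by file identifier, in ascending order.
    return sorted(metadata, key=lambda record: record[0])


tree = {
    "root": {"dirs": ["d1", "d2"], "files": [("111", "T11, read only")]},
    "d1": {"dirs": ["d3"], "files": [("223", "T12, readable and writable")]},
    "d2": {"dirs": [], "files": []},
    "d3": {"dirs": [], "files": [("221", "T10, read only")]},
}
sequence = collect_metadata(MockSourceFS(tree), "root")
```

The point of the sketch is the request count: one read per directory returns the metadata of all its first-level files, instead of one or more calls per file.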
Optionally, the metadata of the first file includes a file identifier and a file attribute of the first file.
Optionally, the processing unit 701 is configured to:
acquiring a file identifier of a first file based on a first metadata sequence and a second metadata sequence, wherein the first metadata sequence comprises metadata of each file, and the second metadata sequence comprises metadata of each file in the source file system at a first time;
and migrating the first file from the source file system to the destination file system based on the file identification of the first file.
Optionally, the detailed process of the processing unit 701 acquiring the file identifier of the first file refers to relevant contents in step 603 of the method 600 shown in fig. 6, and is not described in detail here.
Optionally, the metadata of each file in the first metadata sequence is arranged according to the file identification order of each file.
Optionally, the processing unit 701 is configured to:
acquiring first metadata indicated by the first indication information and second metadata indicated by the second indication information, wherein the metadata initially indicated by the first indication information is the first piece of metadata in the first metadata sequence, and the metadata initially indicated by the second indication information is the first piece of metadata in the second metadata sequence;
when the file identifier included in the first metadata is the same as the file identifier included in the second metadata and the file attribute included in the first metadata is different from the file attribute included in the second metadata, determining that the file identifier included in the first metadata is the file identifier of the first file, and the file corresponding to the file identifier included in the first metadata is a file which changes on a source file system after the first time.
Optionally, the detailed process of the processing unit 701 for acquiring the first metadata and the second metadata is described with reference to the relevant content in step 6031 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the detailed process of the processing unit 701 determining that the file identifier included in the first metadata is the file identifier of the first file is described in detail with reference to steps 6032-6034 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the processing unit 701 is further configured to:
when the file identification included in the first metadata and the file identification included in the second metadata are the same and the file attribute included in the first metadata and the file attribute included in the second metadata are different, setting the first indication information in the first metadata sequence to indicate the next metadata after the first metadata, and setting the second indication information in the second metadata sequence to indicate the next metadata after the second metadata.
Optionally, the metadata of each file in the first metadata sequence is arranged in order from small to large according to the file identifier of each file, and the processing unit 701 is further configured to:
when the file identifier included in the first metadata is smaller than the file identifier included in the second metadata, determining that the file identifier included in the first metadata is the file identifier of the first file, and the file corresponding to the file identifier in the first metadata is a file newly added in the source file system after the first time.
Optionally, the detailed process of the processing unit 701 determining that the file identifier included in the first metadata is the file identifier of the first file refers to relevant contents in step 6036 of the method 600 shown in fig. 6, and will not be described in detail here.
Optionally, the processing unit 701 is further configured to:
when the file identification included in the first metadata is smaller than the file identification included in the second metadata, the first indication information is set in the first metadata sequence to indicate the next metadata after the first metadata.
Optionally, the processing unit 701 is further configured to:
the second metadata sequence is updated to the first metadata sequence.
In an embodiment of the present application, the processing unit obtains a handle of each file directory included in the source file system, where each file directory includes at least one file, and the at least one file is located in the source file system. The processing unit obtains metadata of each file in the source file system based on the handle of each file directory, and migrates a first file to the destination file system based on the metadata of each file, where the first file is a file that has changed or been newly added in the source file system after the first time, and the first time is the time at which files were last migrated to the destination file system. Because the processing unit locates the first file directory in the source file system directly based on the handle of the first file directory and obtains the metadata of each first-level file in the first file directory in batches, POSIX interface call messages do not need to be exchanged multiple times for each file to obtain its metadata, which improves the efficiency of obtaining file metadata and, in turn, the efficiency of migrating files.
Referring to fig. 8, an embodiment of the present application provides a schematic diagram of an apparatus 800 for migrating a file. The apparatus 800 may be a host in the network architecture 100 shown in fig. 1, or may be a host in the method 600 of fig. 6. The apparatus 800 comprises at least one processor 801, an internal connection 802, a memory 803 and at least one transceiver 804.
The apparatus 800 is a hardware structure apparatus, and can be used to implement the functional modules in the apparatus 700 described in fig. 7. For example, one skilled in the art will appreciate that the processing unit 701 in the apparatus 700 shown in fig. 7 may be implemented by the at least one processor 801 invoking code in the memory 803. The transmitting unit 702 and the receiving unit 703 in the apparatus 700 shown in fig. 7 may be implemented by the transceiver 804.
Optionally, the apparatus 800 may also be used to implement the functions of the host in any of the embodiments described above.
Optionally, the processor 801 may be a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs according to the present application.
The internal connections 802 may include a path for passing information between the components. Optionally, the internal connection 802 is a single board or a bus.
The transceiver 804 is used for communicating with other devices or communication networks.
The memory 803 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be self-contained and coupled to the processor via a bus. The memory may also be integrated with the processor.
The memory 803 is used for storing application program codes for executing the scheme of the application, and the processor 801 controls the execution. The processor 801 is configured to execute the application code stored in the memory 803 and cooperate with the at least one transceiver 804 to cause the apparatus 800 to perform functions of the method of the present patent.
In particular implementations, processor 801 may include one or more CPUs, such as CPU0 and CPU1 in fig. 8, as one embodiment.
In particular implementations, as an example, the apparatus 800 may include multiple processors, such as the processor 801 and the processor 807 in fig. 8. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores that process data (e.g., computer program instructions).
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk.
The above description is intended only to illustrate the alternative embodiments of the present application, and not to limit the present application, and any modifications, equivalents, improvements, etc. made within the principle of the present application should be included in the scope of the present application.

Claims (26)

1. A method of migrating a file, the method comprising:
acquiring a handle of each file directory included in a source file system, wherein each file directory includes at least one file, and the at least one file is located in the source file system;
acquiring metadata of each file in the source file system based on the handle of each file directory;
and migrating a first file to a destination file system based on the metadata of each file, wherein the first file is a file which is changed or newly added on the source file system after a first time, and the first time is the time of migrating the file to the destination file system last time.
2. The method of claim 1, wherein obtaining a handle to each file directory included in the source file system comprises:
sending a query request message to the source file system, the query request message including a directory identification of a root directory of the source file system;
and receiving a query response message sent by the source file system, wherein the query response message comprises the handle of the root directory.
3. The method of claim 2, wherein obtaining a handle to each file directory included in the source file system further comprises:
sending a first read request message to the source file system, wherein the first read request message comprises handles of a first file directory, the first file directory is a file directory of which the handles are acquired, and the first file directory is a file directory in the source file system;
receiving a first read response message sent by the source file system, wherein the first read response message comprises a handle of a primary subfile directory in the first file directory, and no other file directories are arranged between the first file directory and the primary subfile directory.
4. The method of any of claims 1-3, wherein the obtaining metadata for each file in the source file system based on the handle to each file directory comprises:
sending a second read request message to the source file system, the second read request message including a handle to a second file directory, the second file directory being a file directory in the source file system;
receiving a second read response message sent by the source file system, wherein the second read response message includes metadata of a primary file in the second file directory, and no other file directories are arranged between the second file directory and the primary file.
5. The method of any of claims 1-4, wherein the metadata of the first file comprises a file identification and a file attribute of the first file.
6. The method of claim 5, wherein said migrating the first file to the destination file system based on the metadata of each file comprises:
obtaining a file identifier of the first file based on a first metadata sequence and a second metadata sequence, wherein the first metadata sequence comprises metadata of each file, and the second metadata sequence comprises metadata of each file in the source file system at the first time;
and migrating the first file from the source file system to the destination file system based on the file identification of the first file.
7. The method of claim 6, wherein the metadata of said each file in said first sequence of metadata is arranged in order of file identification of said each file.
8. The method of claim 7, wherein obtaining the file identification of the first file based on the first metadata sequence and the second metadata sequence comprises:
acquiring first metadata indicated by first indication information and second metadata indicated by second indication information, wherein the metadata initially indicated by the first indication information is the first piece of metadata in the first metadata sequence, and the metadata initially indicated by the second indication information is the first piece of metadata in the second metadata sequence;
when the file identifier included in the first metadata is the same as the file identifier included in the second metadata and the file attribute included in the first metadata is different from the file attribute included in the second metadata, determining that the file identifier included in the first metadata is the file identifier of the first file, and determining that the file corresponding to the file identifier included in the first metadata is a file which is changed on the source file system after the first time.
9. The method of claim 8, wherein the method further comprises:
setting the first indication information to indicate next metadata after the first metadata in the first metadata sequence and setting the second indication information to indicate next metadata after the second metadata in the second metadata sequence when a file identification included in the first metadata and a file identification included in the second metadata are the same and a file attribute included in the first metadata and a file attribute included in the second metadata are different.
10. The method according to claim 8 or 9, wherein the metadata of said each file in said first metadata sequence is arranged in order of file identification of said each file from small to large, said method further comprising:
when the file identifier included in the first metadata is smaller than the file identifier included in the second metadata, determining that the file identifier included in the first metadata is the file identifier of the first file, and the file corresponding to the file identifier in the first metadata is a file newly added in the source file system after the first time.
11. The method of claim 10, wherein the method further comprises:
when the file identification included in the first metadata is smaller than the file identification included in the second metadata, the first indication information is set in the first metadata sequence to indicate the next metadata after the first metadata.
12. The method of any of claims 6 to 11, further comprising:
updating the second metadata sequence to the first metadata sequence.
13. An apparatus for migrating files, the apparatus comprising:
the processing unit is used for acquiring a handle of each file directory included in a source file system, wherein each file directory includes at least one file, and the at least one file is located in the source file system;
the processing unit is further configured to obtain metadata of each file in the source file system based on the handle of each file directory;
the processing unit is further configured to migrate a first file to a destination file system based on the metadata of each file, where the first file is a file that changes or a file that is newly added to the source file system after a first time, and the first time is a time when the file was migrated to the destination file system last time.
14. The apparatus of claim 13, wherein the apparatus further comprises: a sending unit and a receiving unit, wherein the sending unit and the receiving unit are connected,
the sending unit is configured to send a query request message to the source file system, where the query request message includes a directory identifier of a root directory of the source file system;
the receiving unit is configured to receive a query response message sent by the source file system, where the query response message includes a handle of the root directory.
15. The apparatus of claim 14,
the sending unit is further configured to send a first read request message to the source file system, where the first read request message includes a handle of a first file directory, the first file directory is a file directory where the handle has been obtained, and the first file directory is a file directory in the source file system;
the receiving unit is further configured to receive a first read response message sent by the source file system, where the first read response message includes a handle of a primary subfile directory in the first file directory, and no other file directories are spaced between the first file directory and the primary subfile directory.
16. The apparatus of any one of claims 13-15,
the sending unit is further configured to send a second read request message to the source file system, where the second read request message includes a handle of a second file directory, and the second file directory is a file directory in the source file system;
the receiving unit is further configured to receive a second read response message sent by the source file system, where the second read response message includes metadata of a primary file in the second file directory, and there is no other file directory between the second file directory and the primary file.
17. The apparatus of any of claims 13-16, wherein the metadata of the first file comprises a file identification and a file attribute of the first file.
18. The apparatus as recited in claim 17, said processing unit to:
obtaining a file identifier of the first file based on a first metadata sequence and a second metadata sequence, wherein the first metadata sequence comprises metadata of each file, and the second metadata sequence comprises metadata of each file in the source file system at the first time;
and migrating the first file from the source file system to the destination file system based on the file identification of the first file.
19. The apparatus of claim 18, wherein the metadata for each of the files in the first sequence of metadata is arranged in order of file identification for each of the files.
20. The apparatus of claim 19, wherein the processing unit is configured to:
acquire first metadata indicated by first indication information and second metadata indicated by second indication information, wherein the first metadata indicated by the first indication information is metadata in the first metadata sequence, and the second metadata indicated by the second indication information is metadata in the second metadata sequence; and
when the file identifier included in the first metadata is the same as the file identifier included in the second metadata and the file attribute included in the first metadata is different from the file attribute included in the second metadata, determine that the file identifier included in the first metadata is the file identifier of the first file, and determine that the file corresponding to the file identifier included in the first metadata is a file that was changed on the source file system after the first time.
21. The apparatus of claim 20, wherein the processing unit is further configured to:
when the file identifier included in the first metadata is the same as the file identifier included in the second metadata and the file attribute included in the first metadata is different from the file attribute included in the second metadata, set the first indication information to indicate the next metadata after the first metadata in the first metadata sequence, and set the second indication information to indicate the next metadata after the second metadata in the second metadata sequence.
22. The apparatus of claim 20 or 21, wherein the metadata of each file in the first metadata sequence is arranged in ascending order of the file identifier of each file, and the processing unit is further configured to:
when the file identifier included in the first metadata is smaller than the file identifier included in the second metadata, determine that the file identifier included in the first metadata is the file identifier of the first file, and that the file corresponding to the file identifier included in the first metadata is a file newly added to the source file system after the first time.
23. The apparatus of claim 22, wherein the processing unit is further configured to:
when the file identifier included in the first metadata is smaller than the file identifier included in the second metadata, set the first indication information to indicate the next metadata after the first metadata in the first metadata sequence.
24. The apparatus of any of claims 18 to 23, wherein the processing unit is further configured to:
update the second metadata sequence to be the first metadata sequence.
25. A computer-readable storage medium, on which a computer program is stored, which, when executed by a computer, carries out the method according to any one of claims 1-12.
26. A computer program product, characterized in that the computer program product comprises a computer program stored in a computer-readable storage medium and that the computer program is loaded by a processor for implementing the method as claimed in any one of claims 1-12.
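
The comparison recited in claims 18 to 24 can be read as a two-pointer walk over two metadata sequences sorted by file identifier. The following Python sketch is an illustration under stated assumptions, not the implementation disclosed in the application: the names `Meta` and `files_to_migrate`, the use of integer file identifiers, and the tuple shape of the file attribute are all hypothetical. The branches for "identifier equal and attribute unchanged" and "identifier present only in the older sequence" are not recited in these claims and are filled in here as assumptions (skip the file, and treat it as deleted, respectively).

```python
from typing import List, NamedTuple, Tuple


class Meta(NamedTuple):
    """Metadata of one file (hypothetical shape)."""
    file_id: int   # file identifier; assumed comparable and unique
    attr: tuple    # file attribute, e.g. (size, mtime); assumed shape


def files_to_migrate(first_seq: List[Meta],
                     second_seq: List[Meta]) -> Tuple[List[int], List[int]]:
    """Compare two sequences sorted in ascending order of file_id.

    first_seq  : metadata collected at the later (second) time
    second_seq : metadata collected at the earlier (first) time
    Returns (changed_ids, added_ids).
    """
    i = j = 0  # i plays the role of the first indication, j the second
    changed: List[int] = []
    added: List[int] = []
    while i < len(first_seq):
        cur = first_seq[i]
        if j >= len(second_seq) or cur.file_id < second_seq[j].file_id:
            # Identifier appears only in the newer sequence: newly added file.
            added.append(cur.file_id)
            i += 1
        elif cur.file_id == second_seq[j].file_id:
            # Same identifier: the file changed if its attribute differs.
            if cur.attr != second_seq[j].attr:
                changed.append(cur.file_id)
            # Assumption: an unchanged file also advances both indications.
            i += 1
            j += 1
        else:
            # Assumption: identifier only in the older sequence means the
            # file was deleted from the source; it needs no migration.
            j += 1
    return changed, added


if __name__ == "__main__":
    old = [Meta(1, (10, 100)), Meta(2, (20, 100)), Meta(4, (5, 100))]
    new = [Meta(1, (10, 100)), Meta(2, (25, 200)),
           Meta(3, (7, 200)), Meta(5, (1, 200))]
    changed_ids, added_ids = files_to_migrate(new, old)
    print("changed:", changed_ids, "added:", added_ids)
    # After migrating, the newer sequence becomes the baseline (claim 24).
    old = new
```

Because both sequences are sorted, each metadata entry is visited at most once, so the comparison costs time linear in the combined length of the two sequences.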
CN202111044488.4A 2021-07-08 2021-09-07 Method and device for migrating files and storage medium Pending CN115599743A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110773762 2021-07-08
CN2021107737625 2021-07-08

Publications (1)

Publication Number Publication Date
CN115599743A (en) 2023-01-13

Family

ID=84842026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111044488.4A Pending CN115599743A (en) 2021-07-08 2021-09-07 Method and device for migrating files and storage medium

Country Status (1)

Country Link
CN (1) CN115599743A (en)

Similar Documents

Publication Publication Date Title
US10853242B2 (en) Deduplication and garbage collection across logical databases
CN109684282B (en) Method and device for constructing metadata cache
US10831720B2 (en) Cloud storage distributed file system
US8756199B2 (en) File level hierarchical storage management system, method, and apparatus
US20190370362A1 (en) Multi-protocol cloud storage for big data and analytics
US8266192B2 (en) File-sharing system and method for processing files, and program
JP5895099B2 (en) Destination file server and file system migration method
US6714949B1 (en) Dynamic file system configurations
CN106484820B (en) Renaming method, access method and device
WO2019231690A1 (en) Distributed transactions in cloud storage with hierarchical namespace
JP5375972B2 (en) Distributed file system, data selection method thereof, and program
US20180276267A1 (en) Methods and system for efficiently performing eventual and transactional edits on distributed metadata in an object storage system
US9696919B1 (en) Source/copy reference tracking with block pointer sets
US8380806B2 (en) System and method for absolute path discovery by a storage virtualization system
KR102208704B1 (en) Blockchain software capable of operation corresponding sql query, blockchain system, and method of thereof
US20220342888A1 (en) Object tagging
CN110958293B (en) File transmission method, system, server and storage medium based on cloud server
JP2006031608A (en) Computer, storage system, file management method which computer performs, and program
CN110109866B (en) Method and equipment for managing file system directory
CN115599743A (en) Method and device for migrating files and storage medium
CN113853778B (en) Cloning method and device of file system
US11016933B2 (en) Handling weakening of hash functions by using epochs
CN114443598A (en) Data writing method and device, computer equipment and storage medium
WO2024001280A1 (en) Data flow perception method and related apparatus
CN114546668B (en) Log collection method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication