CN117194337A

CN117194337A - Method, device, computer equipment and storage medium for selecting new source file

Info

Publication number: CN117194337A
Application number: CN202311243056.5A
Authority: CN
Inventors: 李伟; 刘洪栋; 李旭东
Original assignee: Jinan Inspur Data Technology Co Ltd
Current assignee: Jinan Inspur Data Technology Co Ltd
Priority date: 2023-09-22
Filing date: 2023-09-22
Publication date: 2023-12-08

Abstract

The invention relates to the technical field of distributed file systems, and discloses a method, an apparatus, a computer device and a storage medium for selecting a new source file. When the target deleted file is a source file, a first inode number of the source file is obtained from a directory entry of the source file. First metadata of the source file is acquired according to the first inode number. Hard link data of at least one hard link file corresponding to the source file is obtained from the first metadata. The directory entry corresponding to each hard link file is determined according to the respective hard link data. Based on the directory entry of each hard link file, it is determined whether there is a hard link file that is not occupied by a client. When there is at least one hard link file that is not occupied by a client, a new source file may be selected in one of several ways. The invention can avoid service processing failures caused by the inability to select a new source file.

Description

Method, device, computer equipment and storage medium for selecting new source file

Technical Field

The present invention relates to the field of distributed file system technologies, and in particular, to a method, an apparatus, a computer device, and a storage medium for selecting a new source file.

Background

In current distributed file systems, files can be simply divided into source files and hard-linked files. When a user creates a hard link file based on a source file, the distributed file system records an inode number to be linked in a directory entry corresponding to the created hard link file.

When a user accesses the hard link file, the distributed file system uses the inode number recorded in the directory entry of the hard link file to associate the file with the inode corresponding to the source file, and records the association information in the directory entry corresponding to the hard link file. When a user deletes the source file of a hard link, the distributed file system moves the inode corresponding to the deleted source file into the free directory. When deleting the source file, the distributed file system can select any hard link file that has the association information recorded and is not occupied by a client as the new source file, and move the inode into the directory where the new source file is located.

However, if the user has not accessed any of the hard link files, i.e., none of the hard link files is associated with the inode corresponding to the source file, then no new source file can be selected. After the source file is deleted, when the user tries to access a hard link file, the metadata of the inode can no longer be accessed because the inode corresponding to the source file is in the free directory, and the service fails.

Disclosure of Invention

In view of the above, the present invention provides a method, apparatus, computer device and storage medium for selecting a new source file, so as to solve the problem of service processing failure caused by the inability to select a new source file.

In a first aspect, the present invention provides a method of selecting a new source file, the method comprising:

after a deletion operation instruction is acquired, determining a target deletion file according to the deletion operation instruction;

when the target deleted file is a source file, a first inode number corresponding to the source file is obtained from a directory entry corresponding to the source file;

acquiring first metadata corresponding to the source file according to the first inode number;

obtaining hard link data of at least one hard link file corresponding to the source file from the first metadata;

determining a directory entry corresponding to each hard link file according to each hard link data;

determining, according to the directory entry corresponding to each hard link file, whether a hard link file which is not occupied by the client exists;

when at least one hard link file which is not occupied by the client exists, selecting a first hard link file from the at least one hard link file which is not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each hard link file in the at least one hard link file which is not occupied by the client;

or selecting the first hard link file from at least one hard link file not occupied by the client as a new source file according to the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client.

The method for selecting the new source file has the following advantages:

A user accessing a hard link file (which establishes the association between the hard link file and the inode corresponding to the source file) is a random event, and a user deleting the source file is also a random event, so the order of the two events cannot be controlled. As a result, when the source file is deleted, it may be that no association information is recorded in the directory entry of any hard link file; a new source file then cannot be selected, and subsequent service processing fails. The present method therefore selects the hard link file through the hard link data contained in the metadata, without regard to whether association information is recorded in the directory entry corresponding to the hard link file, which avoids the influence of these random events on the selection of the source file. After the source file is deleted, a new source file can be selected promptly from the hard link data recorded in the metadata; the selection is efficient, and the service processing problems caused by the uncontrollable order of operations are avoided.

In an optional implementation manner, the selecting, according to the directory entry and the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client, the first hard link file from the at least one hard link file not occupied by the client as the new source file includes:

extracting, from the hard link data corresponding to each of the at least one hard link file not occupied by the client, the directory to which that hard link file belongs;

determining the directory to which the source file belongs according to the directory entry corresponding to the source file;

determining, according to the directory of each of the at least one hard link file not occupied by the client and the directory of the source file, whether at least one hard link file belongs to the same directory as the source file;

and when one or more hard link files belong to the same directory as the source file, selecting any hard link file from the one or more hard link files as the new source file.

Specifically, after the new source file is finally determined, the first inode in the free directory needs to be moved to the directory where the new source file is located. Each movement of data in the distributed file system involves considerable synchronization, and moving the first inode back to the directory in which it was located before the source file was deleted requires less synchronization than moving it to a different directory. This reduces the processing resources occupied in the distributed file system and improves the efficiency of selecting the source file.

In an alternative embodiment, when no hard link file belongs to the same directory as the source file, selecting, according to the directory entry and the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client, a first hard link file from the at least one hard link file not occupied by the client as a new source file includes:

extracting, from the hard link data corresponding to each of the at least one hard link file not occupied by the client, the process in which the directory of that hard link file is located;

determining a second inode number corresponding to the directory of the source file according to the directory of the source file;

determining second metadata corresponding to the directory of the source file according to the second inode number;

acquiring, from the second metadata, the process in which the directory of the source file is located;

determining, according to the process of the directory of each hard link file not occupied by the client and the process of the directory of the source file, whether the directory of at least one hard link file is in the same process as the directory of the source file;

and when the directory of one or more hard link files is in the same process as the directory of the source file, selecting any hard link file from the one or more hard link files as the new source file.

Specifically, after the new source file is finally determined, the first inode in the free directory needs to be moved to the directory where the new source file is located. In a distributed file system, service processing is typically performed by multiple processes. Assume that the current operation of deleting the source file is handled in process 1. If the selected new source file is in process 2, moving the first inode from the free directory corresponding to process 1 to process 2 requires a cross-process operation. Cross-process handling is more complex, consumes more processing resources and is less efficient. Therefore, when a hard link file belonging to the same process as the source file can be selected, selecting it as the new source file is more efficient.
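The lookup chain described above (directory of the source file, second inode number, second metadata, process) can be sketched as follows. This is a minimal illustration only, assuming in-memory dictionaries stand in for the directory-entry and metadata stores; the names `dir_inode_numbers`, `dir_metadata` and the `process` field are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical stores: directory path -> inode number of that directory,
# and inode number -> metadata of that directory.
dir_inode_numbers: Dict[str, int] = {"/dir1": 100, "/dir2": 200}
dir_metadata: Dict[int, dict] = {100: {"process": 1}, 200: {"process": 2}}


def process_of_directory(directory: str) -> int:
    """Resolve the process of a directory via the directory's inode number."""
    second_inode_number = dir_inode_numbers[directory]
    second_metadata = dir_metadata[second_inode_number]
    return second_metadata["process"]


def same_process_links(source_dir: str, unoccupied: List[dict]) -> List[dict]:
    """Keep the hard links whose directory is in the source directory's process."""
    source_process = process_of_directory(source_dir)
    return [link for link in unoccupied if link["process"] == source_process]


# Each hard link record carries the process of its own directory (assumed field).
links = [
    {"name": "b", "directory": "/dir2", "process": 2},
    {"name": "c", "directory": "/dir3", "process": 1},
]
matches = same_process_links("/dir1", links)
```

Here `matches` holds only the record for `c`, because its directory is in process 1, the same process as `/dir1`.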

In an alternative embodiment, when the process of the directory of each hard link file of the at least one hard link file not occupied by the client is different from the process of the directory of the source file, selecting the first hard link file from the at least one hard link file not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each hard link file of the at least one hard link file not occupied by the client includes:

and selecting any hard link file from at least one hard link file which is not occupied by the client as the new source file.

Specifically, the final objective of this solution is to select a new source file. Even when no hard link file belongs to the same directory as the source file, and no hard link file's directory is in the same process as the source file's directory, the distributed file system still has to select a new source file for the first inode to link to. Thus, one of the hard link files not occupied by the client can be randomly selected as the new source file. This ensures that a new source file is selected whenever a hard link file exists, so that subsequent service processing can proceed normally.
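Taken together, the three alternatives form a fallback order: same directory first, then same process, then any unoccupied hard link file. A minimal sketch of that order is shown below, assuming a hypothetical `HardLinkData` record with `name`, `directory` and `process` fields; the record layout and function name are illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardLinkData:
    # Hypothetical record; "process" is the process of the link's directory.
    name: str
    directory: str
    process: int


def select_new_source(source_dir: str, source_process: int,
                      unoccupied: List[HardLinkData]) -> HardLinkData:
    """Prefer the same directory, then the same process, then any candidate."""
    if not unoccupied:
        raise ValueError("no hard link file is free of client occupation")
    same_dir = [l for l in unoccupied if l.directory == source_dir]
    if same_dir:
        return random.choice(same_dir)   # any same-directory link will do
    same_proc = [l for l in unoccupied if l.process == source_process]
    if same_proc:
        return random.choice(same_proc)  # avoids a cross-process move
    return random.choice(unoccupied)     # last resort: any unoccupied link
```

The earlier tiers only reduce the cost of moving the inode; the last tier is what guarantees a result whenever at least one unoccupied hard link file exists.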

In an optional implementation manner, the selecting, according to the hard link data corresponding to each of the at least one hard link file not occupied by the client, the first hard link file from the at least one hard link file not occupied by the client as the new source file includes:

extracting the creation time of each hard link file in at least one hard link file which is not occupied by the client from the hard link data corresponding to each hard link file in at least one hard link file which is not occupied by the client;

determining the priority of each hard link file according to the creation time of each hard link file;

and selecting the hard link file with the highest priority from at least one hard link file which is not occupied by the client as the new source file.

Specifically, the creation time of a file is an item of information already included in the metadata, so this scheme can be applied even when the content of the metadata has not been extended. In addition, in the distributed file system each process trims data periodically, and the trimming is based on how many times the data has been used. A file with an earlier creation time that still exists has been used frequently, that is, multiple services may use it, so the probability that a user deletes it is lower. The distributed file system may therefore determine the hard link file that was created earliest as the new source file. This reduces the number of times a new source file has to be selected and thus the processing resources consumed.
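As an illustration of the creation-time rule, the sketch below treats the earliest creation time as the highest priority. The `HardLinkData` record and its `created_at` field (seconds since the epoch) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardLinkData:
    # Hypothetical record; created_at is a timestamp in seconds.
    name: str
    created_at: float


def select_by_creation_time(unoccupied: List[HardLinkData]) -> HardLinkData:
    """Return the unoccupied hard link that was created earliest."""
    if not unoccupied:
        raise ValueError("no hard link file is free of client occupation")
    return min(unoccupied, key=lambda link: link.created_at)


candidates = [
    HardLinkData("new.txt", 1_700_000_300.0),
    HardLinkData("old.txt", 1_700_000_100.0),
]
chosen = select_by_creation_time(candidates)
```

In this example `chosen` is the record for `old.txt`, the earlier of the two.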

In an alternative embodiment, the method further comprises:

when a creation operation instruction is acquired, creating a second hard link file corresponding to a target source file according to the creation operation instruction, and generating a directory entry and hard link data corresponding to the second hard link file;

extracting a third inode number from a directory entry corresponding to the second hard link file;

acquiring third metadata corresponding to the target source file according to the third inode number;

and adding the hard link data corresponding to the second hard link file into the third metadata.

Specifically, the file name of the hard link file, the directory to which the hard link file belongs, the process in which that directory is located, and similar information are not part of the original metadata. In order to select a new source file, the distributed file system may therefore also store the corresponding hard link data as part of the metadata. In this way, the distributed file system does not need to select a new source file through the association information in the directory entry corresponding to the hard link file, and only needs to select the new source file according to the hard link data. As long as a hard link file not occupied by the client exists, a new source file can always be selected, which guarantees normal processing of subsequent services.
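The creation path can be sketched as follows: a directory entry is generated for the new hard link, the inode number is read back from it, and the hard link data is appended to the metadata found through that number. The `Dentry` and `Metadata` classes and the dictionary-based metadata table are hypothetical stand-ins, not the patent's data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dentry:
    name: str
    directory: str
    inode_number: int


@dataclass
class Metadata:
    inode_number: int
    hard_links: List[dict] = field(default_factory=list)


def create_hard_link(metadata_table: Dict[int, Metadata], source: Dentry,
                     name: str, directory: str, process: int) -> Dentry:
    """Create a hard link dentry and record its hard link data in the metadata."""
    # The new directory entry records the inode number it links to.
    dentry = Dentry(name, directory, source.inode_number)
    link_data = {"name": name, "directory": directory, "process": process}
    # Third inode number -> third metadata -> append the hard link data.
    third_metadata = metadata_table[dentry.inode_number]
    third_metadata.hard_links.append(link_data)
    return dentry


table = {7: Metadata(7)}
source = Dentry("a.txt", "/dir1", 7)
link = create_hard_link(table, source, "b.txt", "/dir2", process=2)
```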

In an alternative embodiment, the method further comprises:

after a renaming operation instruction is acquired, extracting identification information, a new name and a new directory of a hard link file to be subjected to renaming operation;

determining a third hard link file corresponding to the identification information according to the identification information;

extracting a fourth inode number corresponding to the third hard link file from a directory entry corresponding to the third hard link file;

acquiring fourth metadata according to the fourth inode number, wherein the fourth metadata comprises file names of a plurality of hard link files and the directory to which each of the plurality of hard link files belongs;

when the new directory is different from the directory of every hard link file included in the fourth metadata, changing the directory of the third hard link file in the fourth metadata to the new directory;

or,

when the new directory is the same as the directory to which any hard link file included in the fourth metadata belongs, determining whether the new name of the third hard link file is the same as the name of any hard link file in the new directory;

deleting the file name and the directory of the third hard link file from the fourth metadata when the new name of the third hard link file is determined to be the same as the name of a fourth hard link file in the new directory, wherein the fourth hard link file is any hard link file stored under the new directory;

or,

when the new name is determined to be different from the name of every hard link file in the new directory, changing the file name of the third hard link file in the fourth metadata to the new name.

Specifically, the distributed file system may modify the hard link data included in the metadata at any time according to the obtained renaming operation instruction, so that the hard link data matches with the actual situation of the file. If the hard link data is not modified in real time, there is a high probability that the selected new source file does not exist, resulting in problems for subsequent processing. Therefore, the hard link data included in the metadata is modified in real time through the scheme, so that the existence of the selected new source file can be ensured, and further, the normal processing of the follow-up business can be ensured.
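The three rename branches can be sketched as below. The sketch follows the text literally: the first branch updates only the directory and the last branch updates only the name. Hard link data is modelled as a list of dictionaries with hypothetical `name` and `directory` keys, and excluding the renamed file itself from the name comparison is an assumption of this example.

```python
from typing import Dict, List


def apply_rename(hard_links: List[Dict[str, str]], old_name: str, old_dir: str,
                 new_name: str, new_dir: str) -> None:
    """Update the hard link data in place for a rename of one hard link file."""
    # Locate the record of the file being renamed (the "third hard link file").
    target = next(l for l in hard_links
                  if l["name"] == old_name and l["directory"] == old_dir)
    in_new_dir = [l for l in hard_links if l["directory"] == new_dir]
    if not in_new_dir:
        # No recorded hard link lives in the new directory: update the directory.
        target["directory"] = new_dir
    elif any(l["name"] == new_name for l in in_new_dir if l is not target):
        # The new name collides with another hard link in the new directory:
        # drop the renamed file's record from the metadata.
        hard_links[:] = [l for l in hard_links if l is not target]
    else:
        # Same directory already recorded, no name collision: update the name.
        target["name"] = new_name


links = [{"name": "a", "directory": "/d1"}, {"name": "b", "directory": "/d2"}]
apply_rename(links, "a", "/d1", "a", "/d3")   # move to a directory with no links
apply_rename(links, "a", "/d3", "c", "/d3")   # rename in place, no collision
apply_rename(links, "c", "/d3", "b", "/d2")   # collides with "b" in /d2
```

After the three calls only the record for `b` in `/d2` remains in `links`.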

In an alternative embodiment, the method further comprises:

when the target deleted file is a target hard link file, a fifth inode number corresponding to the target hard link file is obtained from a directory entry corresponding to the target hard link file;

obtaining fifth metadata according to the fifth inode number;

obtaining hard link data corresponding to the target hard link file from the fifth metadata;

and deleting the hard link data corresponding to the target hard link file in the fifth metadata.

Specifically, the distributed file system may modify the hard link data included in the metadata at any time according to the obtained deletion operation instruction, so that the hard link data matches with the actual situation of the file. If the hard link data is not modified in real time, there is a high probability that the selected new source file does not exist, resulting in problems for subsequent processing. Therefore, the hard link data included in the metadata is modified in real time through the scheme, so that the existence of the selected new source file can be ensured, and further, the normal processing of the follow-up business can be ensured.
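Deleting a hard link file therefore also removes its record from the hard link data reached through the inode number in its directory entry. A minimal sketch, assuming dictionary-based directory entries and metadata with hypothetical `name`, `directory`, `inode_number` and `hard_links` keys:

```python
from typing import Dict


def remove_hard_link_data(metadata_table: Dict[int, dict], dentry: dict) -> None:
    """Drop the deleted hard link's record from the metadata it links to."""
    fifth_inode_number = dentry["inode_number"]
    metadata = metadata_table[fifth_inode_number]
    # Keep every record except the one matching the deleted hard link file.
    metadata["hard_links"] = [
        link for link in metadata["hard_links"]
        if (link["name"], link["directory"]) != (dentry["name"], dentry["directory"])
    ]


table = {7: {"hard_links": [{"name": "b", "directory": "/d2"},
                            {"name": "c", "directory": "/d3"}]}}
remove_hard_link_data(table, {"name": "b", "directory": "/d2", "inode_number": 7})
```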

In a second aspect, the present invention provides an apparatus for selecting a new source file, the apparatus comprising:

the determining module is used for determining a target deletion file according to the deletion operation instruction after the deletion operation instruction is acquired;

the acquisition module is used for acquiring a first inode number corresponding to the source file from a directory entry corresponding to the source file when the target deleted file is the source file; acquiring first metadata corresponding to the source file according to the first inode number; obtaining hard link data of at least one hard link file corresponding to the source file from the first metadata;

the determining module is further used for determining a directory entry corresponding to each hard link file according to each hard link data; determining whether a hard link file which is not occupied by the client exists or not according to the directory entry corresponding to each hard link file;

the selection module is used for selecting a first hard link file from at least one hard link file which is not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each hard link file in the at least one hard link file which is not occupied by the client when the at least one hard link file which is not occupied by the client exists; or selecting the first hard link file from at least one hard link file not occupied by the client as a new source file according to the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client.

In a third aspect, the present invention provides a computer device comprising a memory and a processor that are communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions so as to perform the method for selecting a new source file according to the first aspect or any implementation manner corresponding to the first aspect.

In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of selecting a new source file of the first aspect or any of its corresponding embodiments.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a file structure according to an embodiment of the present invention;

FIG. 2 is a flow diagram of a method of selecting a new source file according to an embodiment of the invention;

FIG. 3 is a flow diagram of another method of selecting a new source file according to an embodiment of the invention;

FIG. 4 is a flow diagram of yet another method of selecting a new source file according to an embodiment of the invention;

FIG. 5 is a flow diagram of a method of modifying metadata according to an embodiment of the present invention;

FIG. 6 is a flow chart of another method of modifying metadata according to an embodiment of the present invention;

FIG. 7 is a flow chart of yet another method of modifying metadata according to an embodiment of the present invention;

FIG. 8 is a block diagram of an apparatus for selecting a new source file according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Some terms involved in this scheme are explained below.

File Data Block (Data Block): a data structure in a distributed file system that records the data of a file itself.

Directory entry (Dentry): a data structure in a distributed file system that records information such as the name of a file, the inode number corresponding to the file, and the directory where the file is located.

Inode (Inode): a data structure in a distributed file system that records information such as the inode number, file size, creation time and modification time. This information may be collectively referred to as metadata (Metadata) and represents additional attributes of a file. In the distributed file system, the metadata of each file is managed by the Metadata Server (MDS).

File: in the distributed file system, each file may correspond to a plurality of directory entries, one inode and a file data block; the structure of a file may be as shown in FIG. 1. Every directory entry of the file links to the same inode. The metadata corresponding to the file can be obtained from the inode, the corresponding file data block is then determined according to the metadata, and the file data is accessed from the file data block.

Free directory: a special directory in the MDS, used to hold the metadata of files that have been deleted by the user but whose data on disk has not yet actually been deleted.
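The relationship among these terms can be illustrated with a small model in which two directory entries share one inode, and the inode leads to the file data block. The class and field names below are assumptions chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DataBlock:
    content: bytes  # the data of the file itself


@dataclass
class Inode:
    inode_number: int
    size: int
    created_at: float
    modified_at: float
    data_block: DataBlock


@dataclass
class Dentry:
    name: str
    directory: str
    inode_number: int  # link from the directory entry to the inode


def read_file(dentry: Dentry, inodes: Dict[int, Inode]) -> bytes:
    """Follow dentry -> inode -> data block and return the file data."""
    inode = inodes[dentry.inode_number]
    return inode.data_block.content


block = DataBlock(b"hello")
inodes = {7: Inode(7, len(block.content), 0.0, 0.0, block)}
source = Dentry("a.txt", "/dir1", 7)     # source file
hard_link = Dentry("b.txt", "/dir2", 7)  # hard link to the same inode
```

Reading through either `source` or `hard_link` returns the same bytes, since both directory entries resolve to inode 7.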

In a distributed file system, files can be simply divided into source files and hard-linked files, which correspond to the same inode. Metadata in the inode is accessible to a user either through a source file or a hard link file. The user can delete the source file or the hard link file at any time. After deleting the source file, the distributed file system needs to reassign a hard-linked file as the source file.

In accordance with an embodiment of the present invention, a method embodiment of selecting a new source file is provided, it being noted that the steps shown in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order other than that shown or described herein.

In this embodiment, a method for selecting a new source file is provided, which may be used in the distributed file system described above. FIG. 2 is a flowchart of a method for selecting a new source file according to an embodiment of the present invention. As shown in FIG. 2, the flow includes the following steps:

Step S201, when a deletion operation instruction is acquired, determining a target deletion file according to the deletion operation instruction.

Specifically, the user can delete any file in the distributed file system at any time, and after the distributed file system acquires the deletion operation instruction, the target deletion file can be determined according to the identification information of the file in the deletion operation instruction.

Step S202, when the target deletion file is a source file, a first inode number corresponding to the source file is obtained from a directory entry corresponding to the source file.

Specifically, after the target deletion file is determined, the directory entry corresponding to the target deletion file may be obtained. If a source file tag can be extracted from the directory entry, the target deletion file is determined to be the source file; if no source file tag can be extracted, the target deletion file is determined to be a hard link file.
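The tag check can be sketched in a few lines, assuming the directory entry is a dictionary and the source file tag is a hypothetical boolean key `source_tag`:

```python
from typing import Optional


def first_inode_number_if_source(dentry: dict) -> Optional[int]:
    """Return the inode number when the dentry carries the source file tag."""
    if dentry.get("source_tag", False):
        return dentry["inode_number"]  # target is the source file
    return None                        # target is a hard link file


source_dentry = {"name": "a.txt", "inode_number": 7, "source_tag": True}
link_dentry = {"name": "b.txt", "inode_number": 7}
```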

When the target deletion file is determined to be the source file, any hard link file in the hard link files corresponding to the source file needs to be designated as a new source file. Hard link data of the hard link file may be recorded in metadata. The distributed file system may obtain hard link data for a hard link file corresponding to the source file, and select the hard link file via the hard link data. Therefore, when the target deletion file is determined to be the source file, the first inode number corresponding to the source file can be determined from the directory entry corresponding to the source file.

Step S203, according to the first inode number, first metadata corresponding to the source file is acquired.

Specifically, the metadata may include an inode number, so the distributed file system may determine a first inode according to the first inode number, and extract the first metadata from the first inode. Also, the distributed file system may move the first inode into the free directory, i.e., move the first metadata into the free directory. The first metadata in the free directory is not accessible.

The first metadata is generally fetched first from the process in which the directory of the source file is located. In some cases, the first metadata may already have been removed by that process's periodic trimming, so it is no longer stored in the cache corresponding to the process and must be loaded from disk.
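A minimal sketch of this cache-then-disk lookup is given below. Both stores are modelled as dictionaries keyed by inode number, and re-populating the cache after a disk load is an assumption of the example rather than something the text specifies.

```python
from typing import Dict


def load_metadata(inode_number: int, cache: Dict[int, dict],
                  disk: Dict[int, dict]) -> dict:
    """Fetch metadata from the process cache, falling back to disk."""
    metadata = cache.get(inode_number)
    if metadata is None:
        # The entry was trimmed from the cache; load it from disk instead.
        metadata = disk[inode_number]
        cache[inode_number] = metadata  # assumed: keep it cached afterwards
    return metadata


disk = {7: {"inode_number": 7, "hard_links": []}}
cache: Dict[int, dict] = {}
first_metadata = load_metadata(7, cache, disk)
```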

Step S204, hard link data of at least one hard link file corresponding to the source file is obtained from the first metadata.

Specifically, the metadata may further include hard link data of a hard link file corresponding to the source file. The distributed file system may determine whether hard link data exists in the first metadata, and if so, may acquire the hard link data, and perform the processing of step S205, and if not, may delete the first metadata in the free directory.

Step S205, according to each hard link data, determining the directory entry corresponding to each hard link file.

Specifically, the hard link data may include identification information of the hard link file, and the distributed file system may determine a directory entry corresponding to the hard link file according to the identification information of the hard link file.

Step S206, determining whether a hard link file which is not occupied by the client exists according to the directory entry corresponding to each hard link file.

Specifically, the distributed file system may obtain the value of the service count parameter corresponding to each hard link file from the directory entry of that hard link file, and determine from that value whether the hard link file is occupied by a client: when the value of the service count parameter is zero, the hard link file is determined not to be occupied by a client; when the value is not zero, the hard link file is determined to be occupied by a client. The distributed file system may then count the hard link files that are not occupied by a client and record their identification information. When the count is zero, it is determined that no hard link file is unoccupied. In this case, the distributed file system may acquire the directory entry corresponding to each hard link file again after a preset time period and repeat the check, until the number of hard link files not occupied by a client is no longer zero. When the count is not zero, it is determined that a hard link file not occupied by a client exists, and the first hard link file may be selected from among the unoccupied hard link files in either a first mode or a second mode. The first mode may be as shown in step S207, and the second mode may be as shown in step S208.
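The occupancy check and the retry loop can be sketched as follows. Directory entries are modelled as dictionaries with a hypothetical `service_count` key; the `max_attempts` bound is an assumption added so the example terminates, whereas the text describes repeating until an unoccupied file is found.

```python
import time
from typing import Callable, List


def unoccupied(dentries: List[dict]) -> List[dict]:
    """A hard link file is unoccupied when its service count is zero."""
    return [d for d in dentries if d["service_count"] == 0]


def wait_for_unoccupied(fetch_dentries: Callable[[], List[dict]],
                        interval_s: float, max_attempts: int) -> List[dict]:
    """Re-read the dentries every interval_s seconds until one is unoccupied."""
    for attempt in range(max_attempts):
        free = unoccupied(fetch_dentries())
        if free:
            return free
        if attempt + 1 < max_attempts:
            time.sleep(interval_s)  # the preset time period
    return []


# Simulated snapshots: occupied on the first read, released on the second.
snapshots = iter([
    [{"name": "b", "service_count": 2}],
    [{"name": "b", "service_count": 0}],
])
result = wait_for_unoccupied(lambda: next(snapshots), 0.0, 3)
```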

Step S207, selecting a first hard link file from at least one hard link file not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client.

Specifically, the distributed file system may obtain the relevant information of the source file from the directory entry of the source file, and may obtain the relevant information of each hard link file not occupied by the client (i.e., the hard link data corresponding to that hard link file) from the hard link data according to the identification information of the unoccupied hard link files. It then compares the relevant information of the source file with that of each unoccupied hard link file, determines the similarity between them, and selects the hard link file with the highest similarity (i.e., the first hard link file) as the new source file.

Step S208, selecting a first hard link file from at least one hard link file not occupied by the client as a new source file according to the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client.

Specifically, when at least one hard link file not occupied by the client exists, the distributed file system may also sort the hard link files not occupied by the client using only their hard link data and a preset sorting rule, and then determine the hard link file ranked first (i.e., the first hard link file) as the new source file.

After a new source file is selected, a source file tag may be added to the directory entry corresponding to the new source file. The first index node can then be moved out of the free catalog and into the directory where the new source file is located; that is, the first metadata leaves the free catalog and becomes accessible. The directory in which the new source file is located may be obtained from the directory entry corresponding to the new source file.
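As a rough sketch of this promotion step, the following models the free catalog as a set of inode numbers and each directory as a set of inode numbers. These data structures, and the names `DirEntry`, `is_source` and `promote`, are assumptions made for illustration and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class DirEntry:
    name: str
    directory: str           # directory in which the file is located
    ino: int                 # inode number recorded in the directory entry
    is_source: bool = False  # hypothetical stand-in for the source file tag


def promote(entry: DirEntry, free_catalog: Set[int],
            directories: Dict[str, Set[int]]) -> None:
    """Tag the entry as the new source file and move its inode out of the
    free catalog into the directory named by the entry."""
    entry.is_source = True
    free_catalog.discard(entry.ino)
    directories.setdefault(entry.directory, set()).add(entry.ino)
```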

According to the method for selecting a new source file provided in this embodiment, a user accessing a hard link file (which establishes the association between the hard link file and the index node of the source file) is a random event, and a user deleting the source file is also a random event, so the order of the two events cannot be controlled. As a result, when the source file is deleted, it may be that no association information is recorded in the directory entry of any hard link file, in which case a new source file cannot be selected from the directory entries and subsequent business processing fails. In this scheme, the hard link file is instead selected through the hard link data contained in the metadata, so it does not matter whether association information is recorded in the directory entry of a hard link file, and the influence of these random events on the selection of the source file is avoided. After the source file is deleted, a new source file can be selected promptly from the hard link data recorded in the metadata, which is efficient and avoids the business processing problems caused by the uncontrollable order of operations.

In this embodiment, a method for selecting a new source file is provided, corresponding to the first mode in the foregoing embodiment, which may be used in the foregoing distributed file system, and fig. 3 is a flowchart of a method for selecting a new source file according to an embodiment of the present invention, as shown in fig. 3, where the flowchart includes the following steps:

Step S301, extracting, from the hard link data corresponding to each of the at least one hard link file not occupied by the client, the directory to which each of the at least one hard link file not occupied by the client belongs.

Specifically, the hard link data may include a directory to which the hard link file belongs. Thus, the distributed file system may extract the directory to which each hard link file is assigned from the hard link data corresponding to the hard link file that is not occupied by the client.

Step S302, determining the directory to which the source file belongs according to the directory entry corresponding to the source file.

Specifically, the directory entry may include a directory to which the file belongs. Thus, the distributed file system may extract the directory to which the source file belongs from the directory entry to which the source file corresponds.

Step S303, determining whether the belonging directory of at least one hard link file is identical to the belonging directory of the source file according to the belonging directory of each hard link file in the at least one hard link file not occupied by the client and the belonging directory of the source file.

Specifically, the distributed file system may compare the directory to which the source file belongs with the directory to which each hard link file not occupied by the client belongs, and determine whether there is a hard link file whose directory is the same as that of the source file. If so, the flow proceeds to step S304; if not, it proceeds to step S305.

Step S304, when the belonging catalog of one or more hard link files is the same as the belonging catalog of the source file, selecting any hard link file from the one or more hard link files as a new source file.

In step S305, when no hard link file belongs to the same directory as the source file, the process in which the directory of each hard link file is located is extracted from the hard link data corresponding to each of the at least one hard link file not occupied by the client.

Specifically, the hard link data may include the process in which the directory of the hard link file is located. Therefore, when it is determined that no hard link file belongs to the same directory as the source file, the distributed file system can extract, from the hard link data of each hard link file not occupied by the client, the process in which the directory of that hard link file is located.

Step S306, determining a second inode number corresponding to the source file belonging directory according to the source file belonging directory.

Specifically, the Directory (DIR) may include an inode number corresponding to the directory. Thus, the distributed file system may extract a second inode number corresponding to the source file's belonging directory from the source file's belonging directory.

Step S307, determining the second metadata corresponding to the catalog of the source file according to the second inode number.

Specifically, the distributed file system may determine a second inode according to the second inode number, and extract second metadata corresponding to the directory to which the source file belongs from the second inode.

Step S308, obtaining, from the second metadata, the process in which the directory of the source file is located.

Specifically, the metadata may include a process in which the directory is located. Thus, the distributed file system can extract the process of the directory of the source file from the second metadata.

Step S309, determining whether the process of the belonging directory of the at least one hard link file is the same as the process of the belonging directory of the source file according to the process of the belonging directory of each hard link file of the at least one hard link file not occupied by the client and the process of the corresponding belonging directory of the source file.

Specifically, the distributed file system may compare the process of the directory of the source file with the process of the directory of each hard link file not occupied by the client, and determine whether there is a hard link file whose directory is located in the same process as the directory of the source file. If so, the flow proceeds to step S310; if not, it proceeds to step S311.

In step S310, when the process of the directory to which one or more hard link files belong is the same as the process of the directory to which the source file belongs, any hard link file is selected from the one or more hard link files as a new source file.

In step S311, when the process of the belonging directory of each hard link file in the at least one hard link file not occupied by the client is different from the process of the belonging directory of the source file, any hard link file is selected as a new source file from the at least one hard link file not occupied by the client.

After selecting a new source file, the distributed file system may add a source file tag to the directory entry corresponding to the new source file. The first index node can then be moved out of the free catalog and into the directory where the new source file is located; that is, the first metadata leaves the free catalog and becomes accessible. The directory in which the new source file is located may be obtained from the directory entry corresponding to the new source file.

According to the method for selecting a new source file provided in this embodiment, the new source file is selected by rules applied layer by layer. When there is a hard link file whose directory is the same as that of the source file, any such hard link file can be selected as the new source file, so that subsequent processing is efficient and occupies few resources. Even in the worst case, a new source file can still be selected from the hard link files corresponding to the source file, which ensures normal processing of subsequent business.
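The layer-by-layer rule of steps S301 to S311 can be sketched as below. The `HardLink` type and the function name are hypothetical, and where the embodiment allows "any" matching hard link file to be chosen, this sketch simply takes the first one, which is an arbitrary choice made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardLink:
    name: str
    directory: str  # directory to which the hard link file belongs
    process: str    # process in which that directory is located


def select_by_locality(source_dir: str, source_proc: str,
                       candidates: List[HardLink]) -> HardLink:
    """Pick a new source file from the unoccupied hard link files."""
    if not candidates:
        raise ValueError("no hard link file is unoccupied")
    # S303/S304: prefer a hard link file in the same directory as the source file.
    same_dir = [c for c in candidates if c.directory == source_dir]
    if same_dir:
        return same_dir[0]
    # S309/S310: otherwise prefer one whose directory is in the same process.
    same_proc = [c for c in candidates if c.process == source_proc]
    if same_proc:
        return same_proc[0]
    # S311: otherwise any unoccupied hard link file will do.
    return candidates[0]
```

The three branches correspond to the three outcomes of the flow in fig. 3: same directory, same process, and the fallback case.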

In this embodiment, a method for selecting a new source file is provided, corresponding to the second mode in the foregoing embodiment, which may be used in the foregoing distributed file system, and fig. 4 is a flowchart of a method for selecting a new source file according to an embodiment of the present invention, as shown in fig. 4, where the flowchart includes the following steps:

In step S401, the creation time of each of the at least one hard link file not occupied by the client is extracted from the hard link data corresponding to each of the at least one hard link file not occupied by the client.

Specifically, the hard link data may include a creation time of the hard link file. Thus, the distributed file system may extract the creation time of each hard link file not occupied by the client from the hard link data corresponding to each hard link file not occupied by the client.

Step S402, determining the priority of each hard link file according to the creation time of each hard link file.

Specifically, the distributed file system may assign the highest priority to the hard link file with the earliest creation time and the lowest priority to the hard link file with the latest creation time, and so on, to obtain the priority of each hard link file.

In step S403, among at least one hard link file not occupied by the client, the hard link file with the highest priority is selected as the new source file.

Specifically, after selecting a new source file, the distributed file system may add a source file tag to the directory entry corresponding to the new source file. The first index node can then be moved out of the free catalog and into the directory where the new source file is located; that is, the first metadata leaves the free catalog and becomes accessible. The directory in which the new source file is located may be obtained from the directory entry corresponding to the new source file.

In the method for selecting a new source file provided in this embodiment, the creation time of a file is an item of information already included in metadata. Therefore, this scheme can also be applied without further changes to the content of the metadata. In addition, in the distributed file system, each process may periodically trim data, and the trimming is performed according to how many times the data has been used. If a file with an earlier creation time still exists, the file is frequently used, that is, multiple businesses may use it, and the probability that a user deletes it is therefore lower. Thus, the distributed file system may determine the hard link file that was created earliest as the new source file. This reduces the number of times a new source file has to be selected and the processing resources consumed.
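Steps S401 to S403 amount to choosing the candidate with the earliest creation time. A minimal sketch, assuming a hypothetical `HardLink` record whose `created_at` field holds the creation time as a numeric timestamp:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardLink:
    name: str
    created_at: float  # creation time taken from the hard link data


def select_oldest(candidates: List[HardLink]) -> HardLink:
    """Earliest creation time means highest priority (S402), so the
    highest-priority file (S403) is the one with the smallest timestamp."""
    if not candidates:
        raise ValueError("no hard link file is unoccupied")
    return min(candidates, key=lambda c: c.created_at)
```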

In this embodiment, a metadata modification method is provided, which may be used in the distributed file system described above, and fig. 5 is a flowchart of a metadata modification method according to an embodiment of the present invention, as shown in fig. 5, where the flowchart includes the following steps:

In step S501, when a creation operation instruction is acquired, a second hard link file corresponding to the target source file is created according to the creation operation instruction, and directory entries and hard link data corresponding to the second hard link file are generated.

Specifically, the distributed file system may add a set member to the index node, where the set member is used to record hard link data, and the hard link data may be a new component of metadata. And when the distributed file system detects the creation operation of the user, creating a second hard link file corresponding to the target source file according to the acquired creation operation instruction, and generating a catalog item and hard link data corresponding to the second hard link file. The hard link data may include a file name of the second hard link file, a belonging directory, a process in which the belonging directory is located, a creation time, and the like.

Step S502, extracting a third inode number from the directory entry corresponding to the second hard link file.

Specifically, the directory entry includes an inode number. Thus, the distributed file system may extract the third inode number from the directory entry corresponding to the second hard-linked file.

Step S503, according to the third inode number, third metadata corresponding to the target source file is obtained.

Specifically, the distributed file system may determine, according to the third inode number, a third inode corresponding to the third inode number. And acquiring third metadata corresponding to the target source file from the third index node. The inode number in the directory entry of the target source file is the third inode number.

In step S504, the hard link data corresponding to the second hard link file is added to the third metadata.

Specifically, the distributed file system may add the hard link data corresponding to the second hard link file to a set member of the third metadata, where the hard link data in the set member may be used to select a new source file. Since the distributed file system performs service processing through multiple processes, copies of the third metadata may also exist on other processes than the process where the directory of the target source file belongs. Therefore, after the third metadata is updated, the distributed file system can update the copies of the third metadata on other processes to complete the synchronization process of the third metadata updating content. In this way, the contents of the master and the copy of the third metadata are the same, and business processing can be performed on any process through the latest third metadata. In addition, the problems of service processing failure and the like caused by different contents of the master copy and the slave copy of the third metadata can be avoided.
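The update and synchronization in step S504 might look roughly like the following. The `InodeTable` class, its `primary` and `replicas` dictionaries, and the field names are all assumptions for illustration; a real system would propagate the update between processes rather than copy objects in memory.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HardLinkData:
    name: str          # file name of the hard link file
    directory: str     # directory to which it belongs
    process: str       # process in which that directory is located
    created_at: float  # creation time


@dataclass
class Metadata:
    # Hypothetical "set member" added to the inode to record hard link data.
    hard_links: List[HardLinkData] = field(default_factory=list)


class InodeTable:
    def __init__(self) -> None:
        self.primary: Dict[int, Metadata] = {}
        # process name -> that process's copy of the metadata, keyed by inode number
        self.replicas: Dict[str, Dict[int, Metadata]] = {}

    def add_hard_link(self, ino: int, data: HardLinkData) -> None:
        """Append the hard link data to the master copy, then refresh every replica."""
        meta = self.primary.setdefault(ino, Metadata())
        meta.hard_links.append(data)
        for replica_table in self.replicas.values():
            replica_table[ino] = copy.deepcopy(meta)
```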

In addition, since more content is included in the metadata, a larger storage space may be occupied. Therefore, the distributed storage system can compress the metadata to obtain the compressed file corresponding to the metadata. And decompressing the compressed file to obtain the metadata when the content in the metadata is modified or used each time.
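One possible way to realize this compression is sketched below using JSON serialization and zlib. The embodiment does not name a serialization format or compression algorithm, so both are assumptions, as are the function names.

```python
import json
import zlib
from typing import Any, Dict


def pack_metadata(meta: Dict[str, Any]) -> bytes:
    """Serialize the metadata and compress it for storage."""
    return zlib.compress(json.dumps(meta, sort_keys=True).encode("utf-8"))


def unpack_metadata(blob: bytes) -> Dict[str, Any]:
    """Decompress and parse the metadata before it is modified or used."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Each modification would then follow an unpack, modify, pack cycle, trading some processing time for reduced storage.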

In the metadata modification method provided in this embodiment, since the file name of the hard link file, the directory to which the hard link file belongs, the process in which the directory to which the hard link file belongs, and other hard link data are not original components in metadata, in order to select a new source file, the distributed file system may also use the corresponding hard link data as components of metadata. In this way, the distributed file system does not need to select a new source file through the associated information in the directory entry corresponding to the hard link file, and only needs to select the new source file according to the hard link data. Further, on the basis of the hard link file which is not occupied by the client, a new source file can be selected certainly, and normal processing of subsequent business is guaranteed.

In this embodiment, a metadata modification method is provided, which may be used in the distributed file system described above, and fig. 6 is a flowchart of a metadata modification method according to an embodiment of the present invention, as shown in fig. 6, where the flowchart includes the following steps:

In step S601, after obtaining the renaming operation instruction, the identification information, the new name and the new directory of the hard link file to be subjected to the renaming operation are extracted.

Specifically, the user can rename the hard link file at any time, and the distributed file system can extract the identification information, the new name, the new directory and the like of the hard link file from the rename operation instruction after obtaining the rename operation instruction. Wherein, the identification information of the hard link file can be the original file name of the hard link file.

Step S602, determining a third hard link file corresponding to the identification information according to the identification information.

Specifically, the distributed file system may search for a third hard link file according to the identification information of the hard link file.

Step S603, extracting a fourth inode number corresponding to the third hard link file from the directory entry corresponding to the third hard link file.

Specifically, after the third hard link file is determined, the distributed file system may extract the fourth inode number from the directory entry corresponding to the third hard link file.

In step S604, fourth metadata is acquired according to the fourth inode number.

The fourth metadata comprises file names of a plurality of hard link files and a catalog of each hard link file in the plurality of hard link files.

For example, the fourth metadata includes hard link data of [(dirA, dent 1), (dirB, dent 2), (dirC, dent 3)], where the hard link data corresponding to the third hard link file is (dirA, dent 1).

Specifically, the distributed file system may determine the fourth inode according to the fourth inode number, and obtain fourth metadata from the fourth inode.

In step S605, when the new directory is different from the directory of any hard link file included in the fourth metadata, the directory of the third hard link file in the fourth metadata is changed to the new directory.

For example, the new directory is dirD, which differs from dirA, dirB and dirC in [(dirA, dent 1), (dirB, dent 2), (dirC, dent 3)], and thus the hard link data corresponding to the third hard link file is modified to (dirD, dent 1).

In step S606, when the new directory is the same as the directory to which any of the hard link files included in the fourth metadata belongs, it is determined whether the new name of the third hard link file is the same as the name of any of the hard link files corresponding to the new directory.

In step S607, when it is determined that the new name of the third hard link file is the same as the name of the fourth hard link file corresponding to the new directory, the file name and the affiliated directory of the third hard link file in the fourth metadata are deleted.

The fourth hard link file is any hard link file stored in the new catalogue.

For example, the new directory is dirB and the new name is dent 2, so the hard link data (dirA, dent 1) corresponding to the third hard link file is deleted. The current fourth metadata then includes hard link data of [(dirB, dent 2), (dirC, dent 3)].

In step S608, when it is determined that the new name is different from the name of any hard link file in the new directory, the file name of the third hard link file in the fourth metadata is changed to the new name.

Example 1, the new directory is dirB and the new name is dent 2-1; the hard link data (dirA, dent 1) corresponding to the third hard link file is modified to (dirB, dent 2-1), and the current fourth metadata includes hard link data of [(dirB, dent 2-1), (dirB, dent 2), (dirC, dent 3)].

Example 2, the new directory is dirA and the new name is dent 1-1; the hard link data (dirA, dent 1) corresponding to the third hard link file is modified to (dirA, dent 1-1), and the current fourth metadata includes hard link data of [(dirA, dent 1-1), (dirB, dent 2), (dirC, dent 3)].
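The three rename cases in steps S605 to S608 can be sketched as a single function over a list of (directory, file name) pairs. The function name and the representation are assumptions. Following the examples in the text, the S605 branch changes only the directory, and the S608 branch records both the new directory and the new name. Excluding the renamed entry itself from the name-conflict check is an additional assumption.

```python
from typing import List, Tuple

Link = Tuple[str, str]  # (directory, file name)


def apply_rename(links: List[Link], old: Link,
                 new_dir: str, new_name: str) -> List[Link]:
    """Return the hard link data after renaming the entry `old`."""
    if old not in links:
        raise KeyError(old)
    others = [link for link in links if link != old]
    if all(directory != new_dir for directory, _ in links):
        # S605: the new directory is not yet present in the hard link data.
        updated = (new_dir, old[1])
    elif (new_dir, new_name) in others:
        # S607: the target name already exists, so the old entry is deleted.
        return others
    else:
        # S608: known directory, unused name.
        updated = (new_dir, new_name)
    return [updated if link == old else link for link in links]
```

Applied to [("dirA", "dent1"), ("dirB", "dent2"), ("dirC", "dent3")], the function reproduces the outcomes of the examples given for steps S605, S607 and S608.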

Specifically, since the distributed file system performs service processing through multiple processes, copies of fourth metadata may also exist on other processes than the process where the directory of the target source file belongs. Therefore, after the fourth metadata is updated, the distributed file system can update the copies of the fourth metadata on other processes to complete the synchronization process of the fourth metadata updating content. In this way, the content of the main copy and the copy of the fourth metadata is the same, and business processing can be performed on any process through the latest fourth metadata. And the problems of service processing failure and the like caused by different contents of the main copy and the duplicate copy of the fourth metadata can be avoided.

According to the metadata modification method provided by the embodiment, the distributed file system can modify the hard link data included in the metadata at any time according to the obtained renaming operation instruction, so that the hard link data accords with the actual condition of the file. If the hard link data is not modified in real time, there is a high probability that the selected new source file does not exist, resulting in problems for subsequent processing. Therefore, the hard link data included in the metadata is modified in real time through the scheme, so that the existence of the selected new source file can be ensured, and further, the normal processing of the follow-up business can be ensured.

In this embodiment, a metadata modification method is provided, which may be used in the distributed file system described above, and fig. 7 is a flowchart of a metadata modification method according to an embodiment of the present invention, as shown in fig. 7, where the flowchart includes the following steps:

In step S701, after the deletion operation instruction is acquired, the target deletion file is determined according to the deletion operation instruction.

The deleting operation instruction comprises identification information of the target deleting file.

For details, refer to step S201 shown in fig. 2; the description is not repeated here.

In step S702, when the target deletion file is the target hard link file, the fifth inode number corresponding to the target hard link file is obtained from the directory entry corresponding to the target hard link file.

Specifically, the directory entry includes an inode number. Therefore, when the target deleted file is determined to be the target hard link file, the distributed file system may acquire the fifth inode number from the directory entry corresponding to the target hard link file.

In step S703, fifth data is acquired according to the fifth inode number.

Specifically, the distributed file system may determine the fifth inode according to the fifth inode number, and obtain fifth data from the fifth inode.

In step S704, the hard link data corresponding to the target hard link file is obtained from the fifth data.

Specifically, the distributed file system may determine, from the fifth data, hard link data corresponding to the target hard link file according to the identification information of the target hard link file, where the hard link data is included in the fifth data.

In step S705, in the fifth data, the hard link data corresponding to the target hard link file is deleted.

Specifically, since the distributed file system performs service processing through multiple processes, copies of the fifth data may also exist on processes other than the process in which the directory of the target source file is located. Therefore, after updating the fifth data, the distributed file system can update the copies of the fifth data on the other processes to synchronize the updated content. In this way, the master copy and the replicas of the fifth data have the same content, and business processing can be performed on any process with the latest fifth data. This also avoids problems such as business processing failure caused by the master copy and the replicas of the fifth data having different content.
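Steps S702 to S705, together with the replica synchronization, can be sketched as follows. The dictionaries `primary` and `replicas` and the function name are hypothetical stand-ins for the master copy and the per-process copies of the fifth data.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (directory, file name)


def remove_hard_link(primary: Dict[int, List[Link]],
                     replicas: Dict[str, Dict[int, List[Link]]],
                     ino: int, target: Link) -> None:
    """Delete the target hard link's data from the master copy (S705),
    then refresh the copy held by every other process."""
    links = primary[ino]   # S703: look up the data by inode number
    links.remove(target)   # raises ValueError if the entry is absent
    for replica_table in replicas.values():
        replica_table[ino] = list(links)
```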

According to the metadata modification method provided by the embodiment, the distributed file system can modify the hard link data included in the metadata at any time according to the acquired deleting operation instruction, so that the hard link data accords with the actual condition of the file. If the hard link data is not modified in real time, there is a high probability that the selected new source file does not exist, resulting in problems for subsequent processing. Therefore, the hard link data included in the metadata is modified in real time through the scheme, so that the existence of the selected new source file can be ensured, and further, the normal processing of the follow-up business can be ensured.

In this embodiment, a device for selecting a new source file is further provided. The device is used to implement the foregoing embodiments and preferred implementations, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

The present embodiment provides a device for selecting a new source file, as shown in fig. 8, including:

A determining module 801, configured to determine, according to the deletion operation instruction, a target deletion file after the deletion operation instruction is acquired;

an obtaining module 802, configured to obtain, when the target deletion file is a source file, a first inode number corresponding to the source file from a directory entry corresponding to the source file; acquiring first metadata corresponding to a source file according to a first inode number; obtaining hard link data of at least one hard link file corresponding to a source file from the first metadata;

the determining module 801 is further configured to determine, according to each hard link data, the directory entry corresponding to each hard link file, and to determine, according to the directory entry corresponding to each hard link file, whether there is a hard link file that is not occupied by the client;

a selecting module 803, configured to, when there is at least one hard link file not occupied by the client, select, according to the directory entry and hard link data corresponding to each of the at least one hard link file not occupied by the client, a first hard link file from the at least one hard link file not occupied by the client as a new source file; or selecting the first hard link file from the at least one hard link file not occupied by the client as a new source file according to the hard link data corresponding to each hard link file in the at least one hard link file not occupied by the client.

In an alternative embodiment, the selecting module 803 is configured to:

determining the belonging catalog of the source file according to the catalog item corresponding to the source file;

determining whether the belonging directory of at least one hard link file is identical to the belonging directory of the source file according to the belonging directory of each hard link file in the at least one hard link file which is not occupied by the client and the belonging directory of the source file;

when the belonging directory of one or more hard link files is the same as the belonging directory of the source file, any hard link file is selected from the one or more hard link files as a new source file.

In an alternative embodiment, when no hard link file belongs to the same directory as the source file, the selecting module 803 is configured to:

Determining a second inode number corresponding to the belonging catalog of the source file according to the belonging catalog of the source file;

acquiring a process of a catalog of the source file from the second metadata;

determining whether the process of the belonging directory of at least one hard link file is the same as the process of the belonging directory of the source file according to the process of the belonging directory of each hard link file in at least one hard link file which is not occupied by the client and the process of the belonging directory of the source file;

when the process of the directory where one or more hard link files exist is the same as the process of the directory where the source file belongs, any hard link file is selected from the one or more hard link files as a new source file.

In an alternative embodiment, when the process of the directory to which the at least one hard link file belongs and the process of the directory to which the source file belongs are different, the selecting module 803 is configured to:

any hard link file is selected as a new source file from at least one hard link file not occupied by the client.

In an alternative embodiment, the selecting module 803 is configured to:

and selecting the hard link file with the highest priority from at least one hard link file which is not occupied by the client as a new source file.

In an alternative embodiment, the apparatus further comprises a creation module 804, an extraction module 805, and an addition module 806:

a creating module 804, configured to create, when a creating operation instruction is acquired, a second hard link file corresponding to the target source file according to the creating operation instruction, and generate a directory entry and hard link data corresponding to the second hard link file;

an extracting module 805, configured to extract a third inode number from a directory entry corresponding to the second hard link file;

the obtaining module 802 is further configured to obtain third metadata corresponding to the target source file according to the third inode number;

and an adding module 806, configured to add the hard link data corresponding to the second hard link file to the third metadata.

In an alternative embodiment, the apparatus further comprises a changing module 807 and a deleting module 808:

the extracting module 805 is further configured to extract, after obtaining the renaming operation instruction, identification information, a new name and a new directory of the hard link file to be subjected to the renaming operation;

the determining module 801 is further configured to determine, according to the identification information, a third hard link file corresponding to the identification information;

the extracting module 805 is further configured to extract a fourth inode number corresponding to the third hard link file from the directory entry corresponding to the third hard link file;

the obtaining module 802 is further configured to obtain fourth metadata according to the fourth inode number, where the fourth metadata includes file names of a plurality of hard link files and a directory to which each hard link file in the plurality of hard link files belongs;

a changing module 807, configured to change the directory of the third hard link file in the fourth metadata to a new directory when the new directory is different from the directory of any hard link file included in the fourth metadata;

or,

the determining module 801 is further configured to determine, when the new directory is the same as a directory to which any hard link file included in the fourth metadata belongs, whether a new name of the third hard link file is the same as a name of any hard link file corresponding to the new directory;

a deleting module 808, configured to delete, when it is determined that the new name of the third hard link file is the same as the name of a fourth hard link file corresponding to the new directory, the file name of the third hard link file and the directory to which the third hard link file belongs in the fourth metadata, where the fourth hard link file is any hard link file stored under the new directory;

or,

the changing module 807 is further configured to change the file name of the third hard link file in the fourth metadata to the new name when it is determined that the new name is different from the name of any hard link file in the new directory.
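The three rename branches handled by modules 801, 807 and 808 can be sketched as a single decision function. This is an illustrative sketch only: `LinkRecord` and `apply_rename` are hypothetical names, the fourth metadata is modeled as a plain list, and excluding the renamed file itself from the name comparison is an assumption not stated in the text.

```python
from dataclasses import dataclass


@dataclass
class LinkRecord:
    """Hypothetical entry of the fourth metadata: file name and directory."""
    name: str
    directory: str


def apply_rename(records, target, new_name, new_directory):
    """Update `records` (the fourth metadata) for a rename of `target`.

    Returns a label naming which branch was taken.
    """
    directories = {r.directory for r in records}

    # Branch 1: the new directory matches no directory in the metadata,
    # so only the directory of the renamed hard link file is changed.
    if new_directory not in directories:
        target.directory = new_directory
        return "directory_changed"

    # Names already present under the new directory (assumption: the
    # renamed file itself is not compared against its own name).
    names_in_new_dir = {
        r.name for r in records
        if r.directory == new_directory and r is not target
    }

    # Branch 2: the new name collides with another hard link file in the
    # new directory, so the renamed file's name and directory are deleted.
    if new_name in names_in_new_dir:
        records[:] = [r for r in records if r is not target]
        return "record_deleted"

    # Branch 3: no collision, so the file name is changed to the new name.
    target.name = new_name
    return "name_changed"
```

The sketch touches only the fields the text names in each branch; whether the remaining field is also rewritten in branches 1 and 3 is not specified in this section.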

In an alternative embodiment, the obtaining module 802 is further configured to: when the target deleted file is a target hard link file, obtain a fifth inode number corresponding to the target hard link file from the directory entry corresponding to the target hard link file; obtain fifth data according to the fifth inode number; and obtain the hard link data corresponding to the target hard link file from the fifth data;

and the deleting module 808 is further configured to delete, in the fifth data, the hard link data corresponding to the target hard link file.
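The deletion path for a hard link file can be sketched in a few lines. The function name `delete_hard_link`, the dictionary layout, and the `(name, directory)` tuple used as hard link data are hypothetical; removal of the directory entry itself is not described in this section and is therefore left out.

```python
def delete_hard_link(dentries, data_by_inode, directory, name):
    """Remove the hard link data of (directory, name) from its inode's data.

    dentries:      {(directory, name): inode number}
    data_by_inode: {inode number: {"hard_links": [(name, directory), ...]}}
    """
    # Obtaining module: the fifth inode number comes from the directory entry.
    inode = dentries[(directory, name)]
    # Obtaining module: the fifth data is looked up by that inode number.
    data = data_by_inode[inode]
    # Obtaining module: locate the hard link data of the target file.
    links = data["hard_links"]
    index = links.index((name, directory))  # ValueError if it is missing
    # Deleting module: delete that hard link data from the fifth data.
    del links[index]
    return inode
```

Because only the record of the deleted hard link file is removed, the source file and the remaining hard link records are left untouched, which matches the text: no new source file needs to be selected when the deleted file is itself a hard link.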

Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.

The device for selecting a new source file in this embodiment is presented in the form of functional units, where a unit refers to an ASIC (Application Specific Integrated Circuit), a processor and memory executing one or more software or firmware programs, and/or other devices that can provide the above functions.

The embodiment of the invention also provides a computer device provided with the device for selecting a new source file shown in fig. 8.

Referring to fig. 9, which is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 9.

The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a generic array logic, or any combination thereof.

The memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to implement the methods shown in the above embodiments.

The memory 20 may include a program storage area and a data storage area. The program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created through use of the computer device, and the like. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory located remotely from the processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.

The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.

The embodiments of the present invention also provide a computer-readable storage medium. The method according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code that can be recorded on a storage medium, or as computer code that is originally stored on a remote storage medium or a non-transitory machine-readable storage medium, downloaded through a network, and stored on a local storage medium, so that the method described herein can be processed by such software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk or the like; further, the storage medium may also comprise a combination of memories of the kinds described above. It will be appreciated that a computer, processor, microprocessor, controller or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.

Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims

1. A method of selecting a new source file, the method comprising:

2. The method according to claim 1, wherein selecting a first hard link file from among the at least one hard link file not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each of the at least one hard link file not occupied by the client comprises:

3. The method according to claim 2, wherein when no hard link file belongs to the same directory as the source file, the selecting a first hard link file from among the at least one hard link file not occupied by the client as a new source file according to the directory entry and the hard link data corresponding to each of the at least one hard link file not occupied by the client comprises:

acquiring the process of the directory of the source file from the second metadata;

4. The method according to claim 3, wherein when the process of the directory of each hard link file of the at least one hard link file not occupied by the client is different from the process of the directory of the source file, selecting the first hard link file from the at least one hard link file not occupied by the client as the new source file according to the directory entry and the hard link data corresponding to each hard link file of the at least one hard link file not occupied by the client comprises:

5. The method according to claim 1, wherein selecting a first hard link file from among the at least one hard link file not occupied by the client as a new source file according to the hard link data corresponding to each of the at least one hard link file not occupied by the client comprises:

6. The method according to any one of claims 1-5, further comprising:

7. The method according to any one of claims 1-5, further comprising:

or,

or,

8. The method according to claim 1, wherein the method further comprises:

obtaining fifth data according to the fifth inode number;

9. A device for selecting a new source file, the device comprising:

10. A computer device, comprising:

a memory and a processor in communication with each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of selecting a new source file as claimed in any one of claims 1 to 8.

11. A computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of selecting a new source file as claimed in any one of claims 1 to 8.