CN103688257A - Method and device for managing metadata - Google Patents

Method and device for managing metadata

Info

Publication number
CN103688257A
Authority
CN
China
Prior art keywords
directory
subtree
sub
metadata
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201280002998.8A
Other languages
Chinese (zh)
Other versions
CN103688257B (en)
Inventor
过晓春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN103688257A
Application granted
Publication of CN103688257B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices
    • G06F16/11: File system administration, e.g. details of archiving or snapshots
    • G06F16/122: File system administration, e.g. details of archiving or snapshots using management policies
    • G06F16/119: Details of migration of file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and a device for managing metadata. The method comprises: obtaining load information to be migrated; searching the directory attributes of the directories in the metadata, selecting as a target directory a directory whose directory attributes contain load information matching the load information to be migrated and a subtree mark, and determining all subtrees that have the target directory as their root directory as target subtrees to be migrated, the subtree mark being preset in the directory attribute of the root directory that it identifies; and migrating the target subtrees as a whole. The method and the device improve the metadata management efficiency of a file system.

Description

Metadata management method and device
Technical Field
The present invention relates to storage technologies, and in particular, to a method and an apparatus for managing metadata.
Background
A cluster file system uniformly manages the files of a plurality of machines in a cluster and provides those machines with a unified storage space, called a namespace, which stores the storage paths of the files on the machines in the cluster. For example, the storage path of file a is C/aa/C/a (where C/aa/C may be called a directory); the metadata includes the information indicating the file storage path. The namespace is divided into a plurality of shares, and each share is assigned to one machine in the cluster and stores the metadata of that machine. Assuming that two file paths, C/aa/C/a and C/bb/d, are stored in the space under directory C on a certain machine, C/aa/C/a may be referred to as a subtree, C/bb/d may also be referred to as a subtree, and aa/C/a may also be referred to as a subtree. That is, a subtree is actually metadata; "subtree" is simply the name given to it according to the hierarchical relationship between the directories or files in the metadata.
While the cluster is running, user applications access files. The access volume of the files can be expressed as load: the larger the access volume of the files on a machine, the higher the load of that machine. To balance the load of the machines in the cluster, in the prior art, when the load of a certain machine is heavy, part of its load is migrated to another machine (which is actually a migration of files). The metadata of the machine is updated as the load moves, and the subtree corresponding to the migrated load is also migrated to the other machine; that is, the load takes its corresponding metadata with it, and the distribution of the metadata changes.
However, the load migration and subtree migration performed for load balancing may leave the metadata scattered across the cluster. If a user then wants to move a file from one directory to another and the two directories are located on different machines, the metadata distribution management requires a cross-machine operation, that is, metadata must be sent from one machine to another. As a result, the file system manages metadata distribution inefficiently when the user performs file operations, and the access performance that the cluster offers to user applications suffers.
Disclosure of Invention
The invention provides a metadata management method and a metadata management device, which are used for improving the metadata management efficiency of a file system.
In a first aspect, a metadata management method is provided, including:
acquiring load information to be migrated;
searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with the load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and migrating the target subtree integrally.
With reference to the first aspect, in a first possible implementation manner, a sub-tree of a root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of a service access operation range.
With reference to the first aspect, in a second possible implementation manner, a sub-tree of the root directory identified by the sub-tree flag includes metadata whose access frequency reaches a preset threshold within a preset time period.
With reference to the first aspect, in a third possible implementation manner, before searching the directory attributes of the directories in the metadata, the method further includes: selecting, from at least two subtrees whose root directories have the subtree mark in their directory attributes, a plurality of subtrees with the same load change trend as a concurrency group, and setting a concurrency group mark in the directory attribute of the root directory of each subtree in the concurrency group; the selecting, as the target directory, of a directory whose directory attributes include load information matching the load information to be migrated and a subtree mark includes: selecting, as the target directory, a directory whose directory attributes include not only the load information and the subtree mark but also the concurrency group mark.
With reference to any one of the first to third possible implementation manners of the first aspect, in a fourth possible implementation manner, the selecting, as the target directory, a directory that includes not only the load information and the sub-tree flag but also the concurrent group flag in the directory attribute, includes: and selecting a directory which not only comprises the load information and the subtree mark but also comprises the concurrency group mark in directory attributes from the concurrency group comprising the largest number of subtrees as the target directory.
With reference to the third possible implementation manner of the first aspect, in a fifth possible implementation manner, the directory attribute is an attribute received from a user who sets it through an application programming interface (API), where the attribute includes the subtree mark and the concurrency group mark.
In a second aspect, there is provided a metadata management apparatus including:
a load determining unit configured to acquire load information to be migrated;
a sub-tree searching unit, configured to search directory attributes of directories in the metadata, select a directory including load information matching the load information to be migrated and having a sub-tree flag as a target directory, and determine all sub-trees taking the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and the subtree migration unit is used for migrating the whole target subtree.
With reference to the second aspect, in a first possible implementation manner, a sub-tree of the root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of a service access operation range.
With reference to the second aspect, in a second possible implementation manner, the sub-tree of the root directory identified by the sub-tree flag includes metadata whose access frequency reaches a preset threshold within a preset time period.
With reference to the second aspect, in a third possible implementation manner, the apparatus further includes an attribute setting unit, configured to: before the subtree searching unit searches the directory attributes of the directories in the metadata, select, from at least two subtrees whose root directories have subtree marks in their directory attributes, a plurality of subtrees with the same load change trend as a concurrency group, and set a concurrency group mark in the directory attribute of the root directory of each subtree in the concurrency group; the subtree searching unit is specifically configured to search the directory attributes of the directories in the metadata, select as the target directory a directory whose directory attributes include the load information, a subtree mark and the concurrency group mark, and determine all subtrees taking the target directory as the root directory as target subtrees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark.
With reference to any one of the first possible implementation manner to the third possible implementation manner of the second aspect, in a fourth possible implementation manner, the subtree searching unit is specifically configured to select, from a concurrency group including the largest number of subtrees, a directory including not only the load information and the subtree indicator but also the concurrency group indicator in directory attributes as the target directory.
With reference to the third possible implementation manner of the second aspect, in a fifth possible implementation manner, the apparatus further includes an attribute acquisition unit, configured to receive the attributes set by the user through an application programming interface (API), where the attributes include the subtree mark and the concurrency group mark.
In a third aspect, a compute node for metadata management is provided, the compute node comprising: a processor, a communication interface, a memory, and a bus; the processor, the communication interface and the memory complete mutual communication through the bus;
the communication interface is used by the metadata management compute node to receive the program;
the processor is used for executing programs;
the memory is used for storing programs;
the program includes a load determining unit, a subtree searching unit and a subtree migration unit;
the load determining unit is used for acquiring load information to be migrated;
the sub-tree searching unit is used for searching directory attributes of directories in the metadata, selecting the directory with the load information matched with the load information to be migrated and a sub-tree mark as a target directory, and determining all sub-trees taking the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and the subtree migration unit is used for migrating the whole target subtree.
With reference to the third aspect, in a first possible implementation manner, a sub-tree of the root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of a service access operation range.
With reference to the third aspect, in a second possible implementation manner, the sub-tree of the root directory identified by the sub-tree flag includes metadata whose access frequency reaches a preset threshold within a preset time period.
With reference to the third aspect, in a third possible implementation manner, the program further includes an attribute setting unit, configured to: before the subtree searching unit searches the directory attributes of the directories in the metadata, select, from at least two subtrees whose root directories have subtree marks in their directory attributes, a plurality of subtrees with the same load change trend as a concurrency group, and set a concurrency group mark in the directory attribute of the root directory of each subtree in the concurrency group; the subtree searching unit is specifically configured to search the directory attributes of the directories in the metadata, select as the target directory a directory whose directory attributes include the load information, a subtree mark and the concurrency group mark, and determine all subtrees taking the target directory as the root directory as target subtrees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark.
With reference to any one of the first possible implementation manner to the third possible implementation manner of the third aspect, in a fourth possible implementation manner, the sub-tree searching unit is specifically configured to select, from the concurrent group that includes the largest number of sub-trees, a directory that includes not only the load information and the sub-tree flag but also the concurrent group flag in directory attributes as the target directory.
With reference to the third possible implementation manner of the third aspect, in a fifth possible implementation manner, the program further includes an attribute acquisition unit, configured to receive the attributes set by the user through an application programming interface (API), where the attributes include the subtree mark and the concurrency group mark.
In a fourth aspect, a computer program product for metadata management is provided, comprising a computer readable storage medium storing program code;
the program code includes instructions for obtaining load information to be migrated; searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with the load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark; and migrating the target subtree integrally.
The metadata management method and the metadata management device have the following technical effects: compared with the scattered metadata migration of the prior art, metadata operations across MDSs are effectively reduced, the time spent transmitting information between different MDSs in cross-MDS operations is saved, and the efficiency of metadata distribution management is improved.
Drawings
FIG. 1 is a diagram illustrating a metadata distribution of a file system according to an embodiment of a metadata management method of the present invention;
FIG. 2 is a flowchart illustrating a metadata management method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a metadata management method according to another embodiment of the present invention;
FIG. 4 is a diagram illustrating a metadata distribution of a file system according to another embodiment of the metadata management method of the present invention;
FIG. 5 is a first chart illustrating comparison of overall performance of an MDS cluster according to an embodiment of the metadata management method of the present invention;
FIG. 6 is a second chart illustrating comparison of overall performance of an MDS cluster according to an embodiment of the metadata management method of the present invention;
FIG. 7 is a diagram illustrating a distribution of MDS cluster loads before improvement of an embodiment of a metadata management method according to the present invention;
FIG. 8 is a diagram illustrating an improved MDS cluster load distribution according to an embodiment of the metadata management method of the present invention;
FIG. 9 is a diagram illustrating an embodiment of a metadata management apparatus according to the present invention;
FIG. 10 is a schematic structural diagram of another embodiment of a metadata management apparatus according to the present invention;
FIG. 11 is a diagram illustrating a structure of a computing node for metadata management according to an embodiment of the present invention.
Detailed Description
In order to make the metadata management method according to the embodiment of the present invention easier to understand, first, some basic concepts related to the embodiment of the present invention are described with reference to fig. 1, where fig. 1 is a metadata distribution diagram of a file system applied by the embodiment of the metadata management method according to the present invention. The metadata management of the embodiment is actually to manage metadata in a file system, specifically to manage metadata distribution; the metadata is management information about a directory or a file, and the like, for example, a name, an attribute, a hierarchical relationship, and the like of the directory or the file.
Referring to fig. 1, a metadata server (MDS) cluster is taken as an example. The cluster includes a plurality of MDSs, and each MDS is responsible for managing the metadata of one machine; the metadata is the tree information composed of the interconnected blocks shown in fig. 1, for example, k1, bucket_1, etc. The ranges of three storage spaces, belonging to MDS1, MDS2 and MDS3 respectively, are outlined in fig. 1 with dashed boxes: dashed box p1 contains the metadata stored in MDS1, dashed box p2 contains the metadata stored in MDS2, and dashed box p3 contains the metadata stored in MDS3. Metadata that is not within a dashed box is metadata on other MDSs.
The metadata of the MDS cluster is stored in the tree structure shown in fig. 1, which involves the concepts below. It should be noted that the directories, files, subtrees, nodes and the like mentioned below all refer to metadata, since the embodiment of the present invention discusses the distribution structure of metadata; for example, the file k_e.avi mentioned in fig. 1 is actually the metadata of the file k_e.avi. In addition, the load mentioned in the embodiment of the present invention refers to the load of metadata access, although updates to the metadata load are caused by actual access to the file or directory.
Directory: such as kobe, james, bucket_1, etc. shown in FIG. 1;
File: for example, k_e.avi, k_d.avi, etc. shown in fig. 1. Files are located under directories; for example, k_e.avi is located under directory k3, the upper-level directory of k3 is k2, the upper-level directory of k2 is k1_1, and so on. The directory "/" at the top of the tree structure is the overall root directory;
The above files and directories and the hierarchical relationship between them (i.e. a file is located under a certain directory) constitute the storage path of the file; for example, the storage path of the file k_b.avi is /kobe/bucket_1/k1/k_b.avi. As another example, no file has yet been placed under directory bucket_2 in fig. 1.
Subtree: if the directory "/" at the top of the tree structure in FIG. 1 is likened to the root of a big tree, any branch of the big tree can be called a subtree;
for example, the directory kobe together with all its subordinate metadata (all directories and files) is called a subtree, the directory k1 together with its subordinate metadata is called a subtree (i.e. k1/k_b.avi), and the directory k2 together with its subordinate metadata is called a subtree (which includes k2, k3, k_e.avi, k_c.avi). In other words, a subtree takes a directory as its root and includes that directory and all branches connected to it; this whole is called a subtree.
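The subtree definition above can be illustrated with a minimal Python sketch. The `Node` class and the `subtree_nodes` helper are hypothetical names introduced only for illustration; the fragment of FIG. 1 is built from the relationships stated in the text (k2 contains k3 and k_c.avi, and k3 contains k_e.avi).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One block of the metadata tree: a directory or a file (hypothetical model)."""
    name: str
    is_dir: bool = True
    children: list = field(default_factory=list)


def subtree_nodes(root):
    """Return the names of the root and of everything beneath it, depth-first."""
    names = [root.name]
    for child in root.children:
        names.extend(subtree_nodes(child))
    return names


# Fragment of FIG. 1 as described in the text.
k3 = Node("k3", children=[Node("k_e.avi", is_dir=False)])
k2 = Node("k2", children=[k3, Node("k_c.avi", is_dir=False)])

print(subtree_nodes(k2))  # ['k2', 'k3', 'k_e.avi', 'k_c.avi']
```

Calling `subtree_nodes(k3)` instead yields only `k3` and `k_e.avi`, which shows that every directory is the root of its own, smaller subtree.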
In the following migration of metadata, the migration of the subtree is described, because the migration of metadata is in units of subtrees.
Node: each block in fig. 1 is referred to as a node, which may be a directory or a file; for example, k_e.avi, bucket_1, kobe, etc. are each a node;
Node attributes and load: the attributes of a node include, for example, creation time and load value. Each directory or file has a creation time, which is an attribute of the node; the load value is the number marked at the upper left corner of each node in fig. 1, such as 31 at the upper left corner of bucket_1 and 71 at the upper left corner of kobe. The load value of a node represents the current load of the node and is a parameter representing the access volume of the files under the node; generally, the larger the access volume of the files, the higher the load. The load value of a node is therefore the load that the node places on the MDS where it is located, and the load of the entire MDS is the sum of the loads carried by the nodes located on that MDS.
For example, the load value of bucket_1 is 31, and the load value of kobe is 71. As can be seen from fig. 1, the file k_e.avi is under directory k3 and its load value is 2 (the load value is a characteristic value calculated by some algorithm from the file access volume, the access frequency, and the like), so the load value of directory k3 is the same as the load value of the file k_e.avi; the load value of directory k2 is the sum of the load values of the files k_e.avi and k_c.avi under it, that is, the load value of k2 equals the load value of k3 plus the load value of k_c.avi.
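The aggregation rule in this example (a directory's load value is the sum of the load values beneath it) can be sketched as follows. This is a simplified illustration: it ignores accesses made directly to a directory, the `load_of` helper is a hypothetical name, and the load value 3 for k_c.avi is an assumed number, since the text only gives the value 2 for k_e.avi.

```python
def load_of(node):
    """Load of a file is its own value; load of a directory is the sum of its children."""
    if isinstance(node, dict):  # a directory, modeled as {name: child}
        return sum(load_of(child) for child in node.values())
    return node  # a file, modeled as its numeric load value


k3 = {"k_e.avi": 2}            # load value 2 is taken from the text
k2 = {"k3": k3, "k_c.avi": 3}  # load value 3 is hypothetical

print(load_of(k3))  # 2, the same as k_e.avi
print(load_of(k2))  # 5, i.e. load of k3 plus load of k_c.avi
```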
Subtree migration: FIG. 1 includes many subtrees that belong to different MDSs. For example, the subtree bucket_1/k1/k_b.avi is located on MDS1; that is, the metadata representing file k_b.avi, directory k1 and directory bucket_1 is managed and served by MDS1, and a larger access volume to the metadata of file k_b.avi, directory k1 and directory bucket_1 causes a larger load on MDS1. The subtree k2_1/k_d.avi is stored on MDS2; that is, the metadata representing file k_d.avi and directory k2_1 is stored on MDS2, and a larger access volume to file k_d.avi or directory k2_1 causes a larger load on MDS2. The load of an MDS refers to the load of the metadata on the MDS, which is in turn caused by access to the actual directory or file to which the metadata corresponds. For example, if the file k_b.avi is actually accessed, the load value of the corresponding metadata of k_b.avi stored on MDS1 increases, and the load values of its upper-level directories increase as well, for example the metadata load value of directory bucket_1/k1; alternatively, if directory k2_1 is accessed, the load value of the corresponding metadata of k2_1 stored on MDS2 increases.
To maintain load balance among the multiple MDSs, part of the load is usually migrated to another MDS when the load of the current MDS is large. In effect, the metadata of certain files and directories is migrated to another MDS, so that the other MDS serves accesses to the migrated files and directories, which reduces the load of the current MDS. During load migration, the storage location of the metadata of the files and directories changes, for example from MDS2 to MDS3. It should be noted that metadata migration is performed in units of subtrees; for example, the subtree k2_1/k_d.avi can only be migrated as a whole. Migrating a subtree changes its storage location, which is equivalent to enclosing the subtree k2_1/k_d.avi in dashed box p3 in fig. 1, but the connection relationship of the subtree within the whole tree structure is unchanged; for example, the subtree k2_1/k_d.avi is still connected under directory k1_1.
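The point that migration changes only the storage location, not the tree structure, can be sketched by keeping the parent links and the MDS ownership in two separate tables. Both tables and the `migrate_subtree` function are hypothetical and exist only for illustration; the node names and the move from MDS2 to MDS3 follow the example in the text.

```python
def migrate_subtree(root, target_mds, parent, owner):
    """Reassign root and all of its descendants to target_mds; parent links are untouched."""
    moved = [root]
    frontier = [root]
    while frontier:
        current = frontier.pop()
        kids = [name for name, up in parent.items() if up == current]
        moved.extend(kids)
        frontier.extend(kids)
    for name in moved:
        owner[name] = target_mds
    return moved


# Tree structure: child -> parent directory.
parent = {"k2_1": "k1_1", "k_d.avi": "k2_1"}
# Storage location: node -> MDS that currently holds its metadata.
owner = {"k2_1": "MDS2", "k_d.avi": "MDS2"}

migrate_subtree("k2_1", "MDS3", parent, owner)
print(owner)           # both nodes are now held by MDS3
print(parent["k2_1"])  # still 'k1_1'
```

Because the whole subtree is reassigned in one call, no node of k2_1/k_d.avi is left behind on MDS2, which is what migration "in units of subtrees" requires.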
On the basis of the above description, the metadata management method of the embodiment of the present invention is described in detail below:
Embodiment One
Fig. 2 is a flowchart illustrating an embodiment of a metadata management method according to the present invention, as shown in fig. 2, the method may include:
201. acquiring load information to be migrated;
The load information refers to a load value, such as the number marked at the upper left corner of each node shown in fig. 1, for example, 31 at the upper left corner of bucket_1.
In this embodiment, the load information to be migrated is acquired, for example, as follows: the cluster includes three MDSs, and each MDS updates the load values in the corresponding metadata according to access information such as the access volume of the files and directories it is responsible for. Each MDS can periodically check the difference between the total load value it carries and the load values of the other MDSs; when the difference exceeds a certain threshold, the load distribution among the MDSs is unbalanced, and the MDS with the heavier load starts load balancing and transfers part of its load to other MDSs. Load migration is actually the migration of a subtree in the metadata: the accesses corresponding to the subtree are moved to another MDS, which serves them, so the load carried by the heavily loaded MDS is reduced.
In this embodiment, for example, the load information to be migrated acquired by a certain MDS is 7 load values, that is, 7 load values need to be migrated to other MDSs.
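The imbalance check in step 201 can be sketched as below. The text does not specify how the amount to migrate is derived, so this sketch makes two loudly stated assumptions: the MDS compares itself with the average load of the other MDSs, and it migrates half of the gap. The function name, the threshold and all load totals are hypothetical; they are chosen so that the result is the 7 load values used in the example.

```python
def load_to_migrate(own_load, other_loads, threshold):
    """Return how much load to move away, or 0 if the cluster is balanced enough.

    Assumption (not from the source): compare against the average of the
    other MDSs and migrate half of the gap.
    """
    average_other = sum(other_loads) / len(other_loads)
    gap = own_load - average_other
    if gap <= threshold:
        return 0
    return gap / 2


# Hypothetical totals for a three-MDS cluster.
print(load_to_migrate(40, [26, 26], threshold=10))  # 7.0
print(load_to_migrate(30, [26, 26], threshold=10))  # 0
```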
202. Searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
After the load information to be migrated is determined in 201, the directory attributes of the directories in the metadata are searched, and a directory whose directory attributes contain load information matching the load information to be migrated and a subtree mark is selected as the target directory; the load information in the directory attributes of the target directory indicates the load value of the largest subtree that has the target directory as its root directory. It should be noted that matching means that the load value of the subtree need not be equal to the load information and may merely be close to it; how close the values must be may be determined by a fuzzy matching algorithm commonly used in load balancing, which is not described in detail in the embodiments of the present invention.
For example, referring to fig. 1, assume that MDS2 needs to migrate 7 of its load values. It can be seen in fig. 1 that the load of the subtree k2_1/k_d.avi is 7 (the load value marked at the upper left corner of directory k2_1 is 7; the load value is also one of the attributes of directory k2_1 and may be referred to as a load attribute); in this case, the load value of the subtree is equal to the load information. If MDS2 wants to migrate 8 of its load values and there is no subtree with an exactly equal load value, the subtree k2_1/k_d.avi with load value 7 can also be migrated: the values 8 and 7 are close, so the load value of the subtree can be considered to match the load information.
It should be noted that, in this embodiment, the subtree selected for migration must not only have a load value matching the load information, but must also have a subtree mark attribute among the attributes of its root directory, i.e. directory k2_1.
The concept of a subtree has been described with reference to fig. 1: a directory serves as the root, and the directory together with all branches connected to it is, as a whole, called a subtree; the directory serving as the root is referred to here as the root directory. For example, the root directory of the subtree k2_1/k_d.avi is directory k2_1; the root directory of the subtree consisting of the two branches k2/k3/k_e.avi and k2/k_c.avi is directory k2. In this embodiment, the subtree selected for migration must have a subtree mark on its root directory.
The subtree mark indicates that all subtrees having the target directory as the root directory can only be migrated as a whole, and this embodiment determines all subtrees having the target directory as the root directory as the target subtrees to be migrated.
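The selection rule of step 202 can be sketched as follows. The source leaves the fuzzy matching algorithm unspecified, so this sketch assumes a simple rule: a directory matches if it carries the subtree mark and its load value lies within a tolerance of the requested load, and the closest such directory wins. The `select_target` function, the `tolerance` parameter and the directory named `unmarked_dir` are hypothetical; k2_1 with load value 7 follows the example in the text.

```python
def select_target(directories, wanted_load, tolerance):
    """Pick the marked directory whose load is closest to wanted_load, within tolerance."""
    candidates = [
        d for d in directories
        if d["subtree_mark"] and abs(d["load"] - wanted_load) <= tolerance
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda d: abs(d["load"] - wanted_load))
    return best["name"]


directories = [
    {"name": "k2_1", "load": 7, "subtree_mark": True},
    # Hypothetical directory: exact load match, but no subtree mark, so it is skipped.
    {"name": "unmarked_dir", "load": 8, "subtree_mark": False},
]

print(select_target(directories, wanted_load=7, tolerance=1))   # k2_1 (exact match)
print(select_target(directories, wanted_load=8, tolerance=1))   # k2_1 (7 is close to 8)
print(select_target(directories, wanted_load=20, tolerance=1))  # None
```

The second call mirrors the case in the text where 8 load values are requested and the subtree with load value 7 is accepted as a match.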
Optionally, the target sub-tree with the target directory as the root directory in this embodiment may have the following characteristics: the target sub-tree includes metadata corresponding to the smallest unit of the operation scope of the service access, for example, when a user executes a service, an application (i.e., an application program used by the user) is used, the access of the application to the directory or file corresponding to the sub-tree is basically the access inside the sub-tree, and the target sub-tree includes all metadata corresponding to the application access operation (the all metadata may be equivalent to the smallest unit of the operation scope of the service access). Or, when the user executes a service, two applications a and B are used, where a relates to a change of a part of metadata, B relates to a change of another part of metadata, and the two parts of metadata have an intersection, the metadata related to a and the metadata related to B can be taken as a whole (the whole may be equivalent to the minimum unit of the service access operation range), and the whole is completely included in the target sub-tree. Or, there is no intersection between the metadata related to the two applications a and B, the metadata related to the application a may be taken as a whole (the whole may be equivalent to the minimum unit of the service access operation range), and the target sub-tree completely includes the whole. The above are merely examples, and the present invention is not limited thereto.
What it means for the target subtree to include this metadata can be illustrated with the subtree in Fig. 1. Take the subtree comprising the two branches k2/k3/k_e.avi and k2/k_c.avi: if its root directory k2 has the subtree flag attribute set, the subtree is a whole. Accesses to the subtree by the user application include, for example, moving file k_e.avi from directory k3 to directory k2, moving file k_c.avi from directory k2 to directory k3, or creating a new file under directory k3. These operations all take place inside the subtree and involve nothing outside it; the subtree already includes all metadata corresponding to the application access operations. For example, moving k_e.avi from directory k3 to directory k2 involves the metadata k_e.avi, k3 and k2, all of which are inside the subtree. If instead directory k3 were given the subtree flag, the corresponding subtree k3/k_e.avi would not include metadata k2, which the application access operation involves; it would contain only part of the metadata, and this is not the kind of metadata coverage that the target subtree described in this embodiment provides.
Optionally, the target subtree rooted at the target directory in this embodiment may further have the following characteristic: the metadata included in the target subtree conforms to an empirical migration rule. That is, the subtree flag may be set according to how the metadata in each MDS changes. For example, if a certain portion of metadata is found to be frequently accessed together over a period of time, and the access frequency of that portion reaches a preset threshold within a preset time period (for example, the preset threshold is 50 and the metadata is accessed 50 times within the preset time period), the subtree containing that metadata may be treated as a whole and taken as a target subtree.
The meaning of migrating as a whole the subtree of a directory on which the subtree flag is set is explained below, taking the subtree comprising the two branches k2/k3/k_e.avi and k2/k_c.avi as an example. If the root directory k2 of the subtree has no subtree flag attribute set, the subtree can be split along its branches; for example, the branch k3/k_e.avi (which is itself a subtree) can be migrated separately, that is, 2 load values are migrated. If, however, the root directory k2 has the subtree flag attribute set, the subtree comprising the two branches k2/k3/k_e.avi and k2/k_c.avi can no longer be split: it can only be migrated as a whole, and migrating only part of it is not allowed.
In a specific implementation, the subtree to be migrated is found by searching the directories in the metadata. If the attributes of a directory include a subtree flag and the load attribute of the directory matches the load information, the load value of the subtree rooted at that directory is the load value to be migrated, and that subtree, which can only be migrated as a whole, is the subtree to be migrated. This embodiment refers to the root directory of the subtree found in this way as the target directory.
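The search just described can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Directory` class, the helper names, and the concrete load values are assumptions introduced here, and "matches" is interpreted as equality of load values.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Directory:
    """Hypothetical in-memory model of a directory's metadata."""
    name: str
    load: int = 0                 # load value attribute of the directory
    subtree_flag: bool = False    # True: the subtree must migrate as a whole
    children: List["Directory"] = field(default_factory=list)


def iter_directories(root: Directory) -> Iterator[Directory]:
    """Yield every directory of the tree rooted at `root` in pre-order."""
    yield root
    for child in root.children:
        yield from iter_directories(child)


def find_target_directory(root: Directory, load_to_migrate: int) -> Optional[Directory]:
    """Return a directory that carries the subtree flag and whose load
    equals the load to be migrated, or None if no such directory exists."""
    for directory in iter_directories(root):
        if directory.subtree_flag and directory.load == load_to_migrate:
            return directory
    return None


# Assumed load values, chosen only for illustration.
k1 = Directory("k1", load=8)
k1_1 = Directory("k1_1", load=15, subtree_flag=True)
bucket_1 = Directory("bucket_1", load=23, children=[k1, k1_1])
kobe = Directory("kobe", load=23, children=[bucket_1])

target = find_target_directory(kobe, 15)
print(target.name if target else None)   # k1_1
print(find_target_directory(kobe, 8))    # None: k1 has load 8 but no subtree flag
```

In this sketch a directory without the subtree flag is never returned, even when its load matches, because only flagged directories qualify as target directories in this step.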
203. Migrate the target subtree as a whole.
The metadata management method of this embodiment improves the metadata management efficiency of the file system as follows: compared with the prior-art approach in which metadata is migrated in a scattered manner, the metadata included in the target subtree is always migrated as a whole, which effectively reduces cross-MDS metadata operations. For example, when the metadata affected by a change all lies inside the target subtree, the change only needs to be made inside that subtree and involves no cross-MDS operation, so the time spent transferring information between different MDSs is saved and the efficiency of metadata distribution management is improved.
That is, on one hand, the distribution of metadata is managed by the MDS itself, which is responsible for load migration when its load is heavy. When metadata is distributed among the MDSs, the subtree whose root directory has the subtree flag attribute is placed as a whole on the same MDS, rather than having its parts placed on different MDSs. On the other hand, during load balancing, if a subtree rooted at a target directory is to be migrated, it is migrated as a whole rather than split so that only part of it moves; the whole subtree therefore always resides on a single MDS and is never spread across different MDSs.
The effect of this embodiment is illustrated by the case in which the metadata included in the target subtree is all the metadata related to a certain application. Suppose an access operation of a user application moves file k_e.avi from directory k3 to directory k2_1. Because the storage path of the file changes, the corresponding metadata (which can be understood as representing the storage path of the file) changes accordingly: in the tree structure of Fig. 1, the link from k_e.avi to directory k3 is changed to a link to directory k2_1. If the subtree k2_1/k_d.avi is located on MDS3 and the subtree k3/k_e.avi is located on MDS2, then when the metadata distribution changes, MDS2 sends the metadata of file k_e.avi to MDS3 through a cross-MDS operation, MDS3 stores the updated metadata, i.e., the subtree k2_1/k_e.avi, and MDS3 provides the file storage path service for subsequent accesses to file k_e.avi, so that those accesses become load on MDS3. If the subtree k2_1/k_d.avi and the subtree k3/k_e.avi are both located on MDS2, the metadata distribution change is carried out inside that MDS, which saves the time spent transferring information between different MDSs and improves the efficiency of metadata distribution updates.
In the following second and third embodiments, the implementation of the metadata distribution management method according to the embodiment of the present invention will be described in detail by using two optional specific examples.
Example two
FIG. 3 is a flowchart illustrating another embodiment of a metadata management method according to the present invention, which mainly describes how to migrate a certain load value from a certain MDS; fig. 4 is a file system metadata distribution diagram of another embodiment of the metadata management method of the present invention, and fig. 4 shows a metadata distribution structure on one of the MDSs in the cluster. As shown in fig. 3 and 4, the method includes:
301. Set a subtree flag attribute for a directory;
The cluster file system, which manages the cluster's metadata in a unified manner, comprises a plurality of MDSs. Before the cluster metadata is distributed among the MDSs, the attributes of the metadata can be set first; this task is performed by the cluster's metadata control module, which is responsible for setting attributes for the metadata and distributing the metadata to the MDSs in the cluster. After the initial metadata distribution is completed, subsequent metadata distribution processing during cluster operation, such as load balancing, is performed by the MDSs themselves: each MDS manages the metadata it stores and manages metadata distribution according to load.
In this embodiment, the metadata control module sets a subtree flag attribute for a directory in the metadata. The principle for setting the subtree flag attribute is as follows: the metadata involved in an application's accesses is taken as a whole, and the subtree flag attribute is set on the root directory of the subtree corresponding to that whole, so that the application's accesses correspond to metadata changes inside the subtree rooted at the target directory.
Examples are as follows: when a user launches a certain application program, the access operations corresponding to the application program include moving file k_e.avi from directory k3 to directory k2_1, moving file k_c.avi from directory k2 to directory k3, creating a new file under directory k3, and the like. Given these access characteristics, if the metadata comprising k1_1, k2_1, k2, k3, k_e.avi, k_c.avi and k_d.avi is taken as a whole, the application's accesses correspond to metadata changes inside that whole. The whole is a subtree, and the subtree flag attribute is set on its root directory k1_1; the subtree flag indicates that the subtree corresponding to root directory k1_1 can only be migrated as a whole.
As can be seen from the above, the subtree flag attribute is in fact set according to the access characteristics of the application. On this basis, this embodiment provides the following two ways of setting the attribute:
One way is to provide the user with an Application Programming Interface (API), through which the user directly sets the directory attribute according to the access characteristics of the user's application; that is, if the application has the access characteristics described above, the user can set the subtree flag in the attributes of the directory accordingly.
The other way is for the metadata control module to set the directory attribute automatically. For example, a user may configure an attribute setting policy for the metadata control module, such as "treat as a whole any subtree for which the metadata changes caused by the corresponding application's accesses are changes inside the subtree". Once the policy is set, the metadata control module monitors the access characteristics of application access operations that occur while the cluster is running, and if the policy is satisfied, it sets the subtree flag in the attributes of the root directory of that subtree according to the policy.
In a specific implementation, referring to Fig. 4, a subtree queue may be set up to make it easy to look up later which directories have the subtree flag attribute. Each MDS internally establishes a subtree queue for the metadata it stores, and the queue contains all directories on which the subtree flag attribute has been set; alternatively, the MDS maintains a subtree queue covering all of its metadata and inserts into the queue a pointer to each directory on which the subtree flag attribute is set. In the present embodiment, for example, directories k1_1, j1 and j2 have the subtree flag attribute set.
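One way to picture the per-MDS subtree queue is the sketch below. The `MetadataServer` class and its method names are hypothetical and only illustrate the first variant, in which the queue holds references to the flagged directories.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict


@dataclass
class Directory:
    """Hypothetical directory metadata record."""
    name: str
    subtree_flag: bool = False


class MetadataServer:
    """Hypothetical MDS that keeps a queue of its flagged directories."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.directories: Dict[str, Directory] = {}
        self.subtree_queue: Deque[Directory] = deque()

    def add_directory(self, directory: Directory) -> None:
        """Store a directory and enqueue it if it already carries the flag."""
        self.directories[directory.name] = directory
        if directory.subtree_flag:
            self.subtree_queue.append(directory)

    def set_subtree_flag(self, dir_name: str) -> None:
        """Set the flag later (e.g. via an API call) and enqueue the directory once."""
        directory = self.directories[dir_name]
        if not directory.subtree_flag:
            directory.subtree_flag = True
            self.subtree_queue.append(directory)


mds1 = MetadataServer("MDS1")
for d in (Directory("kobe"), Directory("bucket_1"), Directory("k1"),
          Directory("k1_1", subtree_flag=True), Directory("james"),
          Directory("j1", subtree_flag=True), Directory("j2")):
    mds1.add_directory(d)
mds1.set_subtree_flag("j2")

print([d.name for d in mds1.subtree_queue])  # ['k1_1', 'j1', 'j2']
```

Keeping the flagged directories in a separate queue means the preferential search in the later steps only has to scan the queue rather than the whole directory tree.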
The metadata control module also sets other attributes for the metadata, such as creation time, load value and the like of a directory or a file; after setting attributes for the cluster metadata according to the rules, the metadata control module distributes the metadata to each MDS, wherein the subtrees corresponding to the root directories with the subtree flag attributes set are distributed in the same MDS as a whole.
It should be noted that this embodiment describes setting the subtree flag attribute at the outset only as an example, and specific implementations are not limited to this. The time at which metadata attributes, including the subtree flag attribute, are set is not restricted; the subtree flag attribute may also be set at any time while the MDS system is running, according to the characteristics of the application. When the subtree flag attribute is set during system operation and the subtree rooted at the flagged directory is spread across different MDSs, the system can migrate the whole subtree onto the same MDS according to the subtree flag attribute. The subtree flag attribute is set during system operation in the same ways as described above, for example, through the API or automatically.
302. Acquire load information to be migrated;
While the cluster is running, each MDS manages the distribution of the metadata it stores and performs load balancing among the MDSs.
Taking MDS1 shown in Fig. 4 as an example, MDS1 updates the load value attribute of the metadata it stores according to application accesses. For example, when an application launched by a user accesses file k_b.avi several times, MDS1 provides the path service for these accesses (that is, the application accesses file k_b.avi according to the stored file path) and updates the load values of the metadata involved. Specifically, the load value of file k_b.avi is increased from 10 to 12 according to the file accesses, and the load values of the directories above file k_b.avi, such as directories k1, bucket_1 and kobe, are likewise updated, each increasing by two load values.
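The load update along the path can be sketched as a walk from the accessed file up to the subtree root. The `Node` class and the initial directory load values below are assumptions for illustration; only the change of k_b.avi from 10 to 12 comes from the example above.

```python
from typing import Optional


class Node:
    """Hypothetical metadata node (file or directory) with a parent link."""

    def __init__(self, name: str, load: int = 0, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.load = load
        self.parent = parent


def record_access(node: Node, delta: int) -> None:
    """Add `delta` load values to the node and to every ancestor above it."""
    current: Optional[Node] = node
    while current is not None:
        current.load += delta
        current = current.parent


# Initial directory loads are assumed values.
kobe = Node("kobe", load=30)
bucket_1 = Node("bucket_1", load=20, parent=kobe)
k1 = Node("k1", load=10, parent=bucket_1)
k_b = Node("k_b.avi", load=10, parent=k1)

record_access(k_b, 2)
print(k_b.load, k1.load, bucket_1.load, kobe.load)  # 12 12 22 32
```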
In addition to updating load values, MDS1 periodically checks the load differences between MDSs. In a cluster comprising multiple MDSs, the MDSs communicate with one another and can exchange their respective load information and the like, so MDS1 can obtain the load information of the other MDSs, compare its own load with theirs, and determine whether the trigger condition for load balancing has been reached. For example, it may be specified that load balancing is triggered to even out the load distribution among the MDSs when the difference between the load of MDS1 and that of some other MDS reaches 20 load values. In this embodiment, it is assumed that MDS1 determines through this check that its load is too heavy and that 8 load values need to be migrated to other MDSs; that is, the acquired load information to be migrated is 8 load values.
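The trigger condition can be written as a simple comparison. The sketch below assumes that only an excess of the local load over a peer's load counts toward the trigger; the function name and the peer load values are hypothetical, and how the amount to migrate (8 load values in the example) is derived is not specified in the text, so it is not modeled here.

```python
from typing import Dict


def should_rebalance(own_load: int, peer_loads: Dict[str, int], threshold: int = 20) -> bool:
    """Return True if the local load exceeds some peer's load by at least `threshold`."""
    return any(own_load - peer >= threshold for peer in peer_loads.values())


# Assumed load values exchanged between the MDSs.
print(should_rebalance(50, {"MDS2": 30, "MDS3": 45}))  # True: 50 - 30 reaches 20
print(should_rebalance(50, {"MDS2": 35, "MDS3": 45}))  # False: largest gap is 15
```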
303. Search for a target directory whose attributes include a subtree flag and whose load attribute matches the load information;
Specifically, this step searches the metadata for a target directory, i.e., a directory whose attributes include a subtree flag and whose load attribute matches the load information.
Still taking the MDS1 in fig. 4 as an example, after determining load information to be migrated, the MDS1 preferentially searches a directory in the metadata stored therein, in which the sub-tree flag attribute is set, and determines whether the load attribute (i.e., the load value) of the directory matches with the load information to be migrated. In this embodiment, a subtree with a load value equal to the load information will be searched preferentially.
Examples are as follows: MDS1 will look preferentially in directories k1_1, j1, and j2 because all three directories have the subtree flag attribute set; and judging whether the load attribute of the three directories is matched with the load information to be migrated. It is judged that the load values of these three directories are not 8, and therefore, there is no suitable directory.
In this step, if the search finds that the target directory exists, that is, a directory whose attributes include a subtree flag and whose load attribute equals the load information, execution continues at 305, and the subtree rooted at the target directory is migrated as a whole; otherwise, if the search finds no target directory, execution continues at 304.
304. Traverse from the dynamic subtree roots of the MDS to find a suitable directory, entering a recursion;
The dynamic subtree roots of the MDS are the directories kobe and james in MDS1. As can be seen from Fig. 4, these two directories are the starting root directories of all metadata in MDS1, and the other directories and files branch out from them, so they are called subtree roots; furthermore, the subtrees in this embodiment are divisible (for example, a directory can be divided into two subdirectories), so they are called dynamic subtree roots.
Traversing to find a suitable directory and entering a recursion means searching level by level along the directory hierarchy shown in Fig. 4 for a directory whose load value matches the load information; when a directory with the subtree flag attribute is encountered during the search, the search stops there and returns to the parent directory.
Examples are as follows: when the load information to be migrated is 8 load values, MDS1 searches downward from directory kobe and checks whether each directory's load value is 8, following paths such as kobe-bucket_1-k1-k_b.avi, kobe-bucket_1-k1_1, and so on. On the path kobe-bucket_1-k1-k_b.avi, the load value of directory k1 is found to be 8, and execution proceeds to 305.
Suppose MDS1 searches the path kobe-bucket_1-k1_1 first. On reaching directory k1_1, it finds that the directory has the subtree flag attribute, so it does not continue to the subdirectories or files under k1_1 (such as directory k2 or file k_e.avi): because the subtree of a root directory with the subtree flag attribute is a whole, there is no need to search inside it. MDS1 stops at directory k1_1, returns to the parent directory bucket_1, and continues the search from bucket_1 along the path bucket_1-k1-k_b.avi.
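The recursive traversal of step 304 can be sketched as a depth-first search that refuses to descend into flagged subtrees. The `Node` class, the `visited` list (added only to make the search order observable), and all load values except the 8 on directory k1 are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical metadata node; files are modeled as nodes without children."""
    name: str
    load: int
    subtree_flag: bool = False
    children: List["Node"] = field(default_factory=list)


def find_matching(node: Node, load_to_migrate: int, visited: List[str]) -> Optional[Node]:
    """Depth-first search for a node whose load equals `load_to_migrate`.

    A node with the subtree flag ends the search on that branch: its subtree
    is a whole, so nothing inside it may be picked on its own.
    """
    visited.append(node.name)
    if node.subtree_flag:
        return None  # stop here and return to the parent directory
    if node.load == load_to_migrate:
        return node
    for child in node.children:
        found = find_matching(child, load_to_migrate, visited)
        if found is not None:
            return found
    return None


# k2 also has load 8 but sits inside the flagged subtree of k1_1.
k2 = Node("k2", load=8)
k1_1 = Node("k1_1", load=15, subtree_flag=True, children=[k2])
k1 = Node("k1", load=8, children=[Node("k_b.avi", load=8)])
bucket_1 = Node("bucket_1", load=23, children=[k1_1, k1])
kobe = Node("kobe", load=23, children=[bucket_1])

visited: List[str] = []
match = find_matching(kobe, 8, visited)
print(match.name if match else None)  # k1
print(visited)                        # ['kobe', 'bucket_1', 'k1_1', 'k1']
```

In this sketch directory k2 is never visited even though its load matches, which mirrors the rule that a flagged subtree is not searched internally.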
305. Select the subtree matching the load information to be migrated, and migrate it.
As described above, the load value of directory k1 is 8, so the subtree k1-k_b.avi is determined to be the subtree to be migrated.
As the above flow shows, when a certain load value is to be migrated from an MDS, the directories with the subtree flag attribute set are searched first. Moreover, no subtree inside the subtree of a flagged root directory is selected; that is, the subtree of a root directory with the subtree flag attribute is a whole and is migrated as a whole, which effectively makes it a "static subtree" because it is never divided further. This avoids excessive fragmentation of the metadata partitioning, helps keep the metadata corresponding to an application's access operations within the same subtree so that the metadata changes caused by the application's accesses take place within one MDS, avoids cross-MDS operations, and effectively reduces how often subtrees are migrated.
Example three
This embodiment also describes how to migrate a certain load value from a certain MDS. Its main difference from the second embodiment is that it additionally introduces the concept of a concurrency group, which further improves the load balancing effect on top of the improved efficiency of metadata distribution management.
First, the concept of concurrency groups is explained: selecting a plurality of subtrees with the same load change trend from at least two subtrees with the subtree flag attributes set as a concurrency group, and setting the concurrency group attributes of the root directory of each subtree in the concurrency group. In one aspect, the concurrency group includes a plurality of subtrees, and the root directory of each subtree is a directory having a subtree flag attribute set. On the other hand, the multiple subtrees in the concurrency group are characterized by the same load change trend.
The same load change trend means, for example, that if there are two subtrees in the concurrency group, the load values of both subtrees increase substantially within a certain time, or both decrease substantially within a certain time. For example, within a certain one-hour interval, the load values of both subtrees increase by 10, or one subtree increases by 9 load values and the other by 10; the point is that the load values of both subtrees increase, while the specific amounts may differ somewhat and the load values of other subtrees remain essentially unchanged. Alternatively, the load values of both subtrees decrease by 8 load values within a certain time.
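A rough way to test two subtrees for the same load change trend is to compare their load changes over the same interval. The sketch below is only one possible reading: the `min_change` cutoff that decides what counts as a substantial change is an assumption, since the text gives examples but no numeric criterion.

```python
def same_load_trend(delta_a: int, delta_b: int, min_change: int = 5) -> bool:
    """Return True if both load changes over the same interval are substantial
    and point in the same direction (both rise or both fall).

    `min_change` is an assumed cutoff for "substantial".
    """
    both_up = delta_a >= min_change and delta_b >= min_change
    both_down = delta_a <= -min_change and delta_b <= -min_change
    return both_up or both_down


print(same_load_trend(10, 10))   # True: both subtrees rise by 10
print(same_load_trend(9, 10))    # True: amounts differ slightly, direction is the same
print(same_load_trend(-8, -8))   # True: both subtrees fall by 8
print(same_load_trend(10, 0))    # False: the second subtree is essentially unchanged
```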
The same load change trend generally indicates that the files corresponding to the two subtrees are usually accessed at the same time, so their loads change at the same time. Taking Fig. 4 as an example, the subtree rooted at directory k1_1 and the subtree rooted at directory j1 belong to one concurrency group: when a user runs an application, it accesses both file k_e.avi and file james.avi, which are associated, so the load values of the two subtrees always rise or fall together. This is what "concurrent" means: the user application accesses multiple subtrees in the same group at the same time.
If a concurrency group resides entirely on one MDS, the load of that MDS can fluctuate sharply. For example, if an MDS holds a concurrency group of 5 subtrees, the load values of all 5 subtrees may increase within the same time period, so that the load of the MDS rises sharply and the MDS is heavily burdened. This embodiment therefore sets a concurrency group attribute to identify such subtrees, with the rule that when a subtree is to be migrated, a subtree in a concurrency group is selected in preference wherever possible, so that the subtrees of a concurrency group do not all remain on the same MDS and aggravate its burden.
Taking as an example the case in Fig. 4 where a subtree with 15 load values is to be migrated, the process of searching for the subtree to be migrated is the same as in the second embodiment and is not repeated; only the steps related to concurrency groups are described. For example, in the course of first searching the directories with the subtree flag set, it is found that directories k1_1, j1 and j2 all have the subtree flag attribute set and that the load values of directory k1_1 and directory j2 are both 15; it must then be decided whether to select the subtree corresponding to directory k1_1 or the subtree corresponding to directory j2.
Specifically, it is determined whether directory k1_1 and directory j2 have a concurrency group attribute, and if so, it indicates that the subtree using the directory as the root directory is a subtree in the concurrency group, and the subtree is preferentially selected. For example, the directory k1_1 has a concurrency group attribute, and the sub-tree corresponding to the directory k1_1 and the sub-tree corresponding to the directory j1 belong to the same concurrency group; then the subtree with directory k1_1 as the root directory is preferably selected for migration, where directory k1_1 is also referred to as the target directory to be searched.
By setting the concurrency group attribute on directories and preferentially migrating subtrees in a concurrency group, the burden on the MDS can be divided. For example, if the subtree corresponding to directory k1_1 in Fig. 4 is migrated away from the MDS holding the subtree corresponding to directory j1, then even though the two subtrees have the same load change trend (for example, their load values both increase within a certain period), the increases occur on two different MDSs rather than only on MDS1, so that the load distribution among the MDSs is more balanced. The concurrency group attribute in this embodiment is set in the same ways as the subtree flag attribute, which are not described again.
Furthermore, an MDS may hold multiple concurrency groups. Suppose two concurrency groups each contain a suitable subtree, that is, one whose load attribute equals the load information to be migrated. To choose between them, the rule in this embodiment is that the selected subtree is taken from the concurrency group that contains the largest number of subtrees. For example, if one concurrency group includes 5 subtrees and the other includes 2, a subtree in the group of 5 is selected in preference, because the more subtrees a group contains, the greater the burden on the MDS when all of their loads rise together.
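The two preference rules of this embodiment (prefer a subtree that belongs to a concurrency group, and among groups prefer the largest one) can be combined in one selection function. The `Candidate` record, the group identifiers, and the `group_sizes` mapping are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """Hypothetical flagged root directory considered for migration."""
    name: str
    load: int
    group: Optional[str] = None  # concurrency group identifier, if any


def choose_subtree(candidates: List[Candidate], load_to_migrate: int,
                   group_sizes: Dict[str, int]) -> Optional[Candidate]:
    """Pick a flagged subtree whose load equals the load to migrate.

    Candidates in a concurrency group beat candidates outside any group,
    and a larger group beats a smaller one.
    """
    matching = [c for c in candidates if c.load == load_to_migrate]
    if not matching:
        return None
    # A candidate without a group scores 0; otherwise the score is its group size.
    return max(matching, key=lambda c: group_sizes.get(c.group, 0))


# Case from the text: k1_1 and j2 both have load 15, only k1_1 is in a group.
flagged = [Candidate("j2", 15), Candidate("k1_1", 15, group="g1"),
           Candidate("j1", 20, group="g1")]
print(choose_subtree(flagged, 15, {"g1": 2}).name)  # k1_1

# Two groups with 2 and 5 subtrees: the subtree from the larger group wins.
two_groups = [Candidate("b", 15, group="small"), Candidate("a", 15, group="large")]
print(choose_subtree(two_groups, 15, {"small": 2, "large": 5}).name)  # a
```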
The embodiment of the invention sets the sub-tree mark attribute and the concurrency group attribute for the directory, so that the management efficiency of metadata distribution is higher, the load balancing effect is better, and the overall performance of the MDS cluster is improved.
For example, by setting the subtree flag attribute, and setting the whole subtree corresponding to the root directory of the subtree flag attribute in the same MDS, operations across MDSs can be reduced, and metadata change is executed inside the MDS, so that the time for information transmission during operations across MDSs is saved, and the MDS can process more metadata distribution management tasks.
Referring to Figs. 5 and 6, Fig. 5 is a first overall performance comparison chart of the MDS cluster for an embodiment of the metadata management method of the present invention, and Fig. 6 is a second such chart; the ordinate of both charts represents the load value. The loads shown in the figures, such as the file creation, file deletion, directory creation, directory deletion, subtree creation (tree creation) and subtree deletion (tree deletion) loads, as well as the file start and directory start loads, are the loads incurred when the MDS cluster receives and processes the corresponding application access operations, for example, the load caused by processing file creation operations. The loads after improvement are higher than before, which indicates that the improved MDS cluster processes more application access operations than the cluster before improvement; for example, within a certain period only 2 file creation operations could be processed before improvement, whereas 5 can be processed after improvement, a clear improvement in MDS performance.
For another example, by setting the concurrency group attribute and preferentially selecting subtrees belonging to the same concurrency group for migration, the multiple subtrees of a concurrency group can be distributed across multiple MDSs, which avoids the excessive burden caused by concentrating them on one MDS. After load balancing, the burden on the MDS is relieved, its metadata processing efficiency improves, and it can handle more application access operations. For example, before the improvement, MDS1 in Fig. 4 is heavily loaded because of the concurrency group it holds, which lowers its processing efficiency; after the improvement, load balancing reduces the burden on MDS1, its metadata processing efficiency improves, and the number of tasks it processes increases.
Referring to Figs. 7 and 8, Fig. 7 is the MDS cluster load distribution diagram before improvement for an embodiment of the metadata management method of the present invention, and Fig. 8 is the corresponding diagram after improvement; the ordinate of both diagrams represents the load value. As the figures show, before improvement the load distribution of MDS0 and MDS1 is unbalanced, the bars are uneven, and the processing load value shown on the ordinate is low, indicating that few tasks are processed. After improvement, the load distribution of MDS0 and MDS1 is balanced, the load balancing effect is clearly better than before, and the processing load value shown on the ordinate is much higher than before, so the performance of the MDSs is improved.
It should be noted that, in the embodiment of the present invention, the method for managing metadata distribution is described with MDS of the clustered file system as an object, but in a specific implementation, the method is not limited to an MDS scenario, and the method of the embodiment of the present invention can be used in other systems that need to manage metadata service distribution of the file system.
Example four
Fig. 9 is a schematic structural diagram of an embodiment of a metadata management apparatus according to the present invention, which may execute the method according to any embodiment of the present invention, and as shown in fig. 9, the apparatus may include: a load determining unit 91, a subtree searching unit 92 and a subtree migrating unit 93; wherein,
a load determination unit 91 for acquiring load information to be migrated;
a sub-tree searching unit 92, configured to search directory attributes of directories in the metadata, select a directory having load information matching the load information to be migrated and a sub-tree mark as a target directory, and determine all sub-trees using the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and a subtree migration unit 93, configured to migrate the whole target subtree.
Further, the sub-tree of the root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of the service access operation range.
Further, the sub-tree of the root directory identified by the sub-tree flag includes metadata whose access frequency reaches a preset threshold within a preset time period.
Fig. 10 is a schematic structural diagram of another embodiment of a metadata management apparatus according to the present invention. In this embodiment, based on the structure shown in Fig. 9, the metadata management apparatus further includes an attribute setting unit 94, configured to, before the subtree searching unit 92 searches the directory attributes of directories in the metadata, select, from at least two subtrees whose root directories have subtree flags in their directory attributes, a plurality of subtrees having the same load change trend as a concurrency group, and set a concurrency group flag in the directory attributes of the root directory of each subtree in the concurrency group;
the subtree searching unit 92 is specifically configured to search the directory attributes of directories in the metadata, select as the target directory a directory whose directory attributes contain not only the matching load information and a subtree flag but also the concurrency group flag, and determine all subtrees rooted at the target directory as target subtrees to be migrated; the subtree flag is preset in the directory attributes of the root directory that it identifies.
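The grouping performed by the attribute setting unit can be sketched as follows. The embodiment does not define how "the same load change trend" is measured; the sketch assumes, purely for illustration, that two subtrees share a trend when their sampled loads rise and fall at the same sampling steps. The function names and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def trend(samples: List[int]) -> Tuple[int, ...]:
    """Sign of each successive load change: 1 rising, -1 falling, 0 flat."""
    return tuple((b > a) - (b < a) for a, b in zip(samples, samples[1:]))


def assign_concurrency_groups(load_history: Dict[str, List[int]]) -> Dict[str, int]:
    """Map each flagged subtree root to a concurrency group id.

    `load_history` holds load samples per subtree root (an assumed layout).
    Roots whose trend is shared with no other root receive no group flag.
    """
    by_trend: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for root, samples in load_history.items():
        by_trend[trend(samples)].append(root)

    flags: Dict[str, int] = {}
    group_id = 0
    for members in by_trend.values():
        if len(members) >= 2:          # a group needs a plurality of subtrees
            group_id += 1
            for root in members:
                flags[root] = group_id
    return flags


# Usage: /a and /b both rise at every step, so they form one group.
history = {"/a": [1, 2, 3], "/b": [10, 20, 25], "/c": [5, 4, 3], "/d": [2, 2, 2]}
group_flags = assign_concurrency_groups(history)
```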
Further, the sub-tree searching unit 92 is specifically configured to select, from the concurrency group including the largest number of sub-trees, a directory including not only the load information and the sub-tree flag but also the concurrency group flag in the directory attributes as the target directory.
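A minimal Python sketch of this selection rule follows. The `DirAttr` record and `pick_target` function are hypothetical, load matching is again assumed to be exact equality, and preferring the candidate whose group has the most members is one reading of "the concurrency group including the largest number of subtrees".

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DirAttr:
    """Hypothetical directory attribute record; field names are assumptions."""
    path: str
    load: int
    subtree_flag: bool
    group: Optional[int] = None   # concurrency group flag, if any


def pick_target(dirs: List[DirAttr], load_to_migrate: int) -> Optional[DirAttr]:
    """Pick a matching flagged directory from the largest concurrency group."""
    # Number of flagged subtree roots in each concurrency group.
    sizes = Counter(d.group for d in dirs
                    if d.subtree_flag and d.group is not None)
    candidates = [d for d in dirs
                  if d.subtree_flag and d.group is not None
                  and d.load == load_to_migrate]
    if not candidates:
        return None
    # Prefer the candidate whose group contains the most subtrees.
    return max(candidates, key=lambda d: sizes[d.group])


# Usage: group 1 has three subtrees and group 2 has two.
dirs = [
    DirAttr("/a", 30, True, 1),
    DirAttr("/b", 10, True, 1),
    DirAttr("/c", 20, True, 1),
    DirAttr("/d", 30, True, 2),
    DirAttr("/e", 5, True, 2),
]
chosen = pick_target(dirs, load_to_migrate=30)
```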
Further, the metadata management apparatus of the present embodiment further includes: an attribute obtaining unit 95, configured to receive an attribute set by a user through an application programming interface API, where the attribute includes the subtree flag and the concurrency group flag.
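The attribute obtaining unit could expose an interface along the following lines. The embodiment does not define the API, so the class name `AttributeStore`, its methods and the attribute keys are hypothetical. The check that a concurrency group flag requires a subtree flag is an assumption drawn from the statement that concurrency groups are selected from subtrees whose root directories carry subtree flags.

```python
from typing import Dict, Optional


class AttributeStore:
    """Hypothetical holder for user-set directory attributes."""

    def __init__(self) -> None:
        self._attrs: Dict[str, Dict[str, object]] = {}

    def set_attributes(self, path: str, subtree_flag: bool,
                       concurrency_group: Optional[int] = None) -> None:
        """Record the subtree flag and optional concurrency group flag for `path`."""
        if concurrency_group is not None and not subtree_flag:
            # Assumed rule: only flagged subtree roots may join a group.
            raise ValueError("a concurrency group flag requires a subtree flag")
        self._attrs[path] = {
            "subtree_flag": subtree_flag,
            "concurrency_group": concurrency_group,
        }

    def get_attributes(self, path: str) -> Dict[str, object]:
        """Return a copy of the attributes for `path`, or an empty dict."""
        return dict(self._attrs.get(path, {}))


# Usage: a user marks /proj as a subtree root in concurrency group 1.
store = AttributeStore()
store.set_attributes("/proj", subtree_flag=True, concurrency_group=1)
```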
EXAMPLE five
Fig. 11 is a schematic structural diagram of a computing node for metadata management according to an embodiment of the present invention, and as shown in fig. 11, this embodiment provides a schematic diagram of a computing node 700. The computing node 700 may be a host server containing computing capability, or a Personal Computer (PC), or a portable computer or a terminal, etc., and the specific implementation of the computing node is not limited by the embodiments of the present invention.
The computing node 700 includes: a processor (processor)710, a communication interface 720, a memory 730, and a bus 740. Processor 710, communication interface 720, and memory 730 communicate with each other via a bus 740.
A communication interface 720 for communicating with a network element to receive a program.
Processor 710 for executing program 732. In particular, the program 732 may include program code that includes computer operational instructions.
Processor 710 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement embodiments of the present invention.
A memory 730 for storing a program 732. The memory 730 may include a random access memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The program 732 may specifically include:
a load determination unit 91 for acquiring load information to be migrated;
a sub-tree searching unit 92, configured to search directory attributes of directories in the metadata, select a directory having load information matching the load information to be migrated and a sub-tree mark as a target directory, and determine all sub-trees using the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and a subtree migration unit 93, configured to migrate the whole target subtree.
For the specific implementation of each unit in the program 732, refer to the corresponding units in the embodiments shown in Figs. 9 and 10; details are not described herein again.
Further, the sub-tree of the root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of the service access operation range.
Further, the sub-tree of the root directory identified by the sub-tree flag includes metadata whose access frequency reaches a preset threshold within a preset time period.
Further, the program further includes:
an attribute setting unit, configured to, before the subtree searching unit searches the directory attributes of directories in the metadata, select, from at least two subtrees whose root directories have subtree flags in their directory attributes, a plurality of subtrees having the same load change trend as a concurrency group, and set a concurrency group flag in the directory attributes of the root directory of each subtree in the concurrency group;
the subtree searching unit is specifically configured to search the directory attributes of directories in the metadata, select as the target directory a directory whose directory attributes contain the load information, a subtree flag and the concurrency group flag, and determine all subtrees rooted at the target directory as target subtrees to be migrated; the subtree flag is preset in the directory attributes of the root directory identified by the subtree flag.
Further, the sub-tree searching unit is specifically configured to select, from the concurrency group including the largest number of sub-trees, a directory including not only the load information and the sub-tree flag but also the concurrency group flag in directory attributes as the target directory.
Further, the program further includes:
an attribute obtaining unit, configured to receive attributes set by a user through an application programming interface (API), where the attributes include the subtree flag and the concurrency group flag.
An embodiment of the present invention further provides a computer program product for repairing data, including a computer-readable storage medium storing a program code;
the program code includes instructions for obtaining load information to be migrated; searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with the load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark; and migrating the target subtree integrally.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (19)

1. A metadata management method, comprising:
acquiring load information to be migrated;
searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with the load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and migrating the target subtree integrally.
2. The metadata management method as claimed in claim 1, wherein the sub-tree of the root directory identified by the sub-tree flag includes metadata corresponding to a minimum unit of a service access operation scope.
3. The metadata management method according to claim 1, wherein the subtree of the root directory identified by the subtree flag includes metadata whose access frequency reaches a preset threshold value within a preset time period.
4. The metadata management method according to claim 1, further comprising, before searching for directory attributes of directories in the metadata:
selecting a plurality of subtrees with the same load change trend as a concurrency group from at least two subtrees with the subtree marks on the directory attributes of the root directory, and setting the concurrency group marks in the directory attributes of the root directory of each subtree in the concurrency group;
the selecting, as a target directory, a directory whose directory attributes contain a subtree flag and load information matching the load information to be migrated comprises: selecting, as the target directory, a directory whose directory attributes contain not only the load information and a subtree flag but also the concurrency group flag.
5. The metadata management method according to any one of claims 2 to 4, wherein the selecting a directory including not only the load information and the sub-tree flag but also the concurrent group flag in the directory attributes as the target directory comprises:
and selecting a directory which not only comprises the load information and the subtree mark but also comprises the concurrency group mark in directory attributes from the concurrency group comprising the largest number of subtrees as the target directory.
6. The metadata management method according to claim 4, wherein the directory attributes are attributes received from a user through an Application Programming Interface (API), and the attributes include the subtree flag and the concurrency group flag.
7. A metadata management apparatus, characterized by comprising:
a load determining unit configured to acquire load information to be migrated;
a sub-tree searching unit, configured to search directory attributes of directories in the metadata, select a directory including load information matching the load information to be migrated and having a sub-tree flag as a target directory, and determine all sub-trees taking the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and a subtree migration unit, configured to migrate the target subtree as a whole.
8. The metadata management apparatus as claimed in claim 7, wherein the subtree of the root directory identified by the subtree flag includes metadata corresponding to a minimum unit of a service access operation scope.
9. The metadata management apparatus as claimed in claim 7, wherein the sub-tree flag identifies a sub-tree of the root directory that includes metadata whose access frequency reaches a preset threshold within a preset time period.
10. The metadata management apparatus according to claim 7, further comprising:
an attribute setting unit, configured to, before the subtree searching unit searches the directory attributes of directories in the metadata, select, from at least two subtrees whose root directories have subtree flags in their directory attributes, a plurality of subtrees having the same load change trend as a concurrency group, and set a concurrency group flag in the directory attributes of the root directory of each subtree in the concurrency group;
the subtree searching unit is specifically configured to search the directory attributes of directories in the metadata, select as the target directory a directory whose directory attributes contain the load information, a subtree flag and the concurrency group flag, and determine all subtrees rooted at the target directory as target subtrees to be migrated; the subtree flag is preset in the directory attributes of the root directory identified by the subtree flag.
11. The metadata management apparatus according to any one of claims 8 to 10,
the subtree searching unit is specifically configured to select as the target directory, from the concurrency group including the largest number of subtrees, a directory whose directory attributes contain the load information, the subtree flag and the concurrency group flag.
12. The metadata management apparatus according to claim 10, further comprising:
an attribute obtaining unit, configured to receive attributes set by a user through an Application Programming Interface (API), where the attributes include the subtree flag and the concurrency group flag.
13. A metadata managed compute node, the compute node comprising: a processor, a communication interface, a memory, and a bus; the processor, the communication interface and the memory communicate with one another through the bus;
the communication interface is configured to enable the metadata managed compute node to receive a program;
the processor is used for executing programs;
the memory is used for storing programs;
the program includes: a load determining unit, a subtree searching unit and a subtree migrating unit;
the load determining unit is used for acquiring load information to be migrated;
the sub-tree searching unit is used for searching directory attributes of directories in the metadata, selecting the directory with the load information matched with the load information to be migrated and a sub-tree mark as a target directory, and determining all sub-trees taking the target directory as a root directory as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark;
and the subtree migrating unit is used for migrating the target subtree as a whole.
14. The metadata managed computing node of claim 13, wherein the subtree of the root directory identified by the subtree indicator includes metadata corresponding to a minimum unit of a business access scope of operation.
15. The metadata managed compute node of claim 13 in which the subtree of the root directory identified by the subtree flag includes metadata having a frequency of access reaching a preset threshold for a preset period of time.
16. The metadata managed compute node of claim 13, wherein the program further comprises:
an attribute setting unit, configured to, before the subtree searching unit searches the directory attributes of directories in the metadata, select, from at least two subtrees whose root directories have subtree flags in their directory attributes, a plurality of subtrees having the same load change trend as a concurrency group, and set a concurrency group flag in the directory attributes of the root directory of each subtree in the concurrency group;
the subtree searching unit is specifically configured to search the directory attributes of directories in the metadata, select as the target directory a directory whose directory attributes contain the load information, a subtree flag and the concurrency group flag, and determine all subtrees rooted at the target directory as target subtrees to be migrated; the subtree flag is preset in the directory attributes of the root directory identified by the subtree flag.
17. The metadata managed computing node of any of claims 14-16,
the subtree searching unit is specifically configured to select as the target directory, from the concurrency group including the largest number of subtrees, a directory whose directory attributes contain the load information, the subtree flag and the concurrency group flag.
18. The metadata managed computing node of claim 16, wherein the program further comprises:
an attribute obtaining unit, configured to receive attributes set by a user through an Application Programming Interface (API), where the attributes include the subtree flag and the concurrency group flag.
19. A computer program product for repairing data comprising a computer readable storage medium having program code stored thereon;
the program code includes instructions for obtaining load information to be migrated; searching directory attributes of directories in the metadata, selecting directories with sub-tree marks and load information matched with the load information to be migrated in the directory attributes as target directories, and determining all sub-trees taking the target directories as root directories as target sub-trees to be migrated; the subtree mark is preset in the directory attribute of the root directory identified by the subtree mark; and migrating the target subtree integrally.
CN201280002998.8A 2012-11-27 2012-11-27 Method and device for managing metadata Active CN103688257B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/085344 WO2014082203A1 (en) 2012-11-27 2012-11-27 Metadata management method and device

Publications (2)

Publication Number Publication Date
CN103688257A true CN103688257A (en) 2014-03-26
CN103688257B CN103688257B (en) 2017-04-26

Family

ID=50323329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280002998.8A Active CN103688257B (en) 2012-11-27 2012-11-27 Method and device for managing metadata

Country Status (2)

Country Link
CN (1) CN103688257B (en)
WO (1) WO2014082203A1 (en)

Also Published As

Publication number Publication date
CN103688257B (en) 2017-04-26
WO2014082203A1 (en) 2014-06-05

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant