CN114610680A - Method, device and equipment for managing metadata of distributed file system and storage medium - Google Patents


Info

Publication number
CN114610680A
Authority
CN
China
Prior art keywords
metadata
node
stored
index
directory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210224649.6A
Other languages
Chinese (zh)
Inventor
郑哲欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210224649.6A
Publication of CN114610680A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the field of cloud storage, and particularly discloses a method, a device, equipment and a storage medium for managing metadata of a distributed file system, wherein the distributed file system comprises at least one metadata node, and the method comprises the following steps: acquiring metadata to be stored and a data type of the metadata to be stored, and determining an index entry and a directory entry of the metadata to be stored; determining a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node; and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node.

Description

Method, device, equipment and storage medium for managing metadata of distributed file system
Technical Field
The present application relates to the field of data processing, and in particular, to a method, an apparatus, a device, and a storage medium for managing metadata of a distributed file system.
Background
Among distributed storage systems, the distributed file system is the most complex, because it must provide a wide range of metadata information and management interfaces, such as permissions, creation and update times, file sizes, and file locks. Most of the functions required by system calls are implemented through metadata, which makes metadata a hotspot of file system access; how well metadata management is implemented is therefore crucial to the success of the system.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for managing metadata of a distributed file system, so as to improve the convenience of access.
In a first aspect, the present application provides a method for managing metadata of a distributed file system, where the distributed file system includes at least one metadata node, the method includes:
acquiring metadata to be stored and a data type of the metadata to be stored, and determining an index entry and a directory entry of the metadata to be stored;
determining a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node;
and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node.
In a second aspect, the present application further provides an apparatus for managing metadata of a distributed file system, where the distributed file system includes at least one metadata node, and the apparatus includes:
an entry determining module, configured to acquire metadata to be stored and a data type of the metadata to be stored, and to determine an index entry and a directory entry of the metadata to be stored;
a directory saving module, configured to determine, from the at least one metadata node, a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located, and to save the directory entry in the target metadata node;
and an index saving module, configured to determine an index storage node among the at least one metadata node according to the data type of the metadata to be stored, and to save the index entry in the index storage node.
In a third aspect, the present application further provides a computer device comprising a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute the computer program and to implement the distributed file system metadata management method when executing the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium storing a computer program, which when executed by a processor causes the processor to implement the distributed file system metadata management method as described above.
The application discloses a method, a device, equipment and a storage medium for managing metadata of a distributed file system, which are characterized in that metadata to be stored and the data type of the metadata to be stored are obtained, and an index entry and a directory entry of the metadata to be stored are determined; then, determining a target metadata node where a parent index entry of a parent directory of metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node; and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node. The directory entries of the metadata to be stored are stored in the same metadata node, and the index entries of the data to be stored can be distributed to other metadata nodes for storage, so that the horizontal extension by taking the directory as a unit is formed, and the traversal operation is further facilitated.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic flowchart of a distributed file system metadata management method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of locating a metadata node where a directory is located according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating steps for determining an index storage node according to a data type of metadata to be stored according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating the separation of namespaces within a data instance provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of locating a metadata node where a file is located according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a distributed file system metadata management apparatus according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
The embodiments of the application provide a method and an apparatus for managing metadata of a distributed file system, a computer device, and a storage medium. The metadata management method uses a KV database as the underlying storage engine to build the metadata service cluster of a parallel file system. This effectively avoids the functional and performance limitations that a local file system imposes on the metadata operations of a distributed file system, while balancing the flexibility of the distributed file system against the performance of local metadata operations.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to FIG. 1, FIG. 1 is a schematic flowchart illustrating a method for managing metadata of a distributed file system according to an embodiment of the present application. The method splits the file system with the directory as the unit, so that metadata is stored separately across the distributed file system. This exploits the scalability of the distributed system while preserving the locality of operations such as traversal.
In one embodiment, the distributed file system includes at least one metadata node. In a specific implementation, three kinds of processes may be used to build the whole distributed file system: MON (monitor), MDS (metadata service), and CSS (chunk storage service). The MON is the registration center and coordinator of the entire cluster; it must be highly available, and every other process maintains a heartbeat connection with it. The MDS is the metadata service process, and the CSS is the process that stores the data fragments (chunks) of files.
The metadata managed by the MDS is stored in a local KV database and persisted in key-value form. In addition, a client kernel module is provided for mounting the file system on the client machine. The user can then operate on it directly as if it were a local file system, while the kernel module exchanges data with the back-end cluster over the network.
When an MDS starts, it actively sends a registration request to the MON. After receiving the registration request, the MON stores the state information of that MDS. When the cluster is first started, the first MDS to register is recorded as the metadata node that holds the metadata of the root directory '/' of the distributed file system.
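The registration behavior described above can be sketched as follows. This is a minimal in-memory illustration, not the patent's implementation: the class name `Monitor`, the `register` method, and the `"online"` state string are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Monitor:
    """Hypothetical in-memory stand-in for the MON registry."""
    mds_state: Dict[str, str] = field(default_factory=dict)
    root_mds: Optional[str] = None

    def register(self, mds_id: str) -> None:
        # Store the state information of the registering MDS.
        self.mds_state[mds_id] = "online"
        # The first MDS to register is recorded as the node that
        # holds the metadata of the root directory '/'.
        if self.root_mds is None:
            self.root_mds = mds_id


mon = Monitor()
mon.register("MDS1")
mon.register("MDS2")
```

In this sketch a later registration never displaces the root node, which matches the statement that only the first registered MDS is recorded for '/'.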
As shown in fig. 1, the method for managing metadata of a distributed file system specifically includes: step S101 to step S103.
S101, obtaining metadata to be stored and a data type of the metadata to be stored, and determining an index entry and a directory entry of the metadata to be stored.
Firstly, for metadata to be stored, which needs to be stored, the data type of the metadata to be stored can be determined, and the data type includes directory data and file data. The directory data refers to file directories in the distributed file system, such as root directories, parent directories, and child directories. And file data refers to file data within each directory in the distributed file system.
The metadata node (MDS) divides the metadata information of the metadata to be stored into two types, index entries and directory entries, and stores the directory entries and the index entries in different namespaces. The directory entry may be a dir-entry (dentry), and the index entry may be an inode.
The index entries include data such as attributes, permissions, and last update time of the files or directories, and the directory entries are used for recording location information of the files or directories and for indexing locations of the index entries of the files or directories.
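As a rough illustration of the two record types, the sketch below models an index entry and a directory entry as plain data classes. The field names (`ino`, `mode`, `mtime`, `index_node`) are assumptions chosen for the example; the source only states that index entries carry attributes, permissions, and update times, and that directory entries record where the index entry is located.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical inode record: attributes of a file or directory."""
    ino: int        # identifier of this index entry
    is_dir: bool    # directory data or file data
    mode: int       # permission bits
    mtime: float    # last update time


@dataclass
class DirectoryEntry:
    """Hypothetical dentry record: name plus location of the index entry."""
    parent_ino: int   # index entry of the parent directory
    name: str         # actual file or directory name
    ino: int          # index entry this name refers to
    index_node: str   # metadata node that stores the index entry


inode = IndexEntry(ino=8, is_dir=False, mode=0o644, mtime=0.0)
dentry = DirectoryEntry(parent_ino=1, name="a.txt", ino=inode.ino, index_node="MDS1")
```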
S102, determining a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node.
When the metadata to be stored is saved, the location of the parent directory entry of its parent directory is first determined from the metadata to be stored. That location is then queried until the metadata node holding the parent index entry is found; this metadata node is the target metadata node.
For example, FIG. 2 is a schematic flowchart of locating the metadata node where a directory is located.
As shown in FIG. 2, the location of the root directory '/' of the mounted file system can be obtained from the MON; that is, the metadata node holding the index entry of the root directory is MDS1. Because the directory entries of the files and subdirectories under a directory are stored on the same metadata node as the index entry of that directory, a query request can be sent to MDS1 to look up the directory entry of the parent directory under the root directory. That directory entry gives the metadata node where the parent index entry of the parent directory is located, namely MDS2, and MDS2 can then be used as the target metadata node.
After a target metadata node where a parent index entry of a parent directory of metadata to be stored is located is determined from at least one metadata node, the directory entry can be stored in the target metadata node.
In addition, before the directory entry of the metadata to be stored is saved, the directory entry can be named according to a preset naming rule. For example, the naming rule for the key of a directory entry may be:
the index entry of the parent directory to which the metadata to be stored belongs, followed by the actual file name or directory name.
Using the index entry of the parent directory as the key prefix makes it convenient to perform prefix traversal when implementing functions such as readdir. In a specific implementation, the index entry may be a snowflake-type UUID.
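The naming rule and the prefix traversal it enables can be sketched as below. The key layout (a zero-padded 20-digit parent ID, a `/` separator, then the name) and the use of a plain dict in place of the KV database are assumptions made for the example; the source does not specify an encoding.

```python
def dentry_key(parent_ino: int, name: str) -> str:
    """Key of a directory entry: parent index entry followed by the name.

    The fixed-width ID and the '/' separator are illustrative choices that
    keep one directory's keys contiguous and unambiguous under a prefix scan.
    """
    return f"{parent_ino:020d}/{name}"


def readdir(kv: dict, parent_ino: int) -> list:
    """List names under a directory by scanning keys with its prefix."""
    prefix = f"{parent_ino:020d}/"
    return [k[len(prefix):] for k in sorted(kv) if k.startswith(prefix)]


# A dict stands in for the dentries namespace of a KV database.
kv = {}
kv[dentry_key(1, "etc")] = {"ino": 7, "node": "MDS2"}
kv[dentry_key(1, "a.txt")] = {"ino": 8, "node": "MDS1"}
kv[dentry_key(12, "b.txt")] = {"ino": 9, "node": "MDS2"}
names = readdir(kv, 1)
```

The fixed-width prefix matters: without it, the keys of directory 12 would also match the prefix of directory 1.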
S103, determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node.
Before the index entries of the metadata to be stored are saved, the index entries of the metadata to be stored can also be named. In naming, naming can be performed according to a preset naming rule. For example, the naming rule for the key of the index entry may be:
if the data type of the metadata to be stored is directory data, the key of the index entry of the directory data is simply the index entry of the directory data itself; if the data type of the metadata to be stored is file data, the key of the index entry of the file data is the index entry of the parent directory followed by the index entry of the file data.
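A minimal sketch of this rule follows. The function name `inode_key`, the zero-padded formatting, and the `/` separator are hypothetical; only the composition of the key (own ID for directories, parent ID plus own ID for files) comes from the text.

```python
from typing import Optional


def inode_key(is_dir: bool, ino: int, parent_ino: Optional[int] = None) -> str:
    """Build the key of an index entry according to the data type."""
    if is_dir:
        # Directory data: the key is the directory's own index entry.
        return f"{ino:020d}"
    if parent_ino is None:
        raise ValueError("file data requires the parent directory's index entry")
    # File data: parent directory's index entry followed by the file's own.
    return f"{parent_ino:020d}/{ino:020d}"


dir_key = inode_key(True, 7)
file_key = inode_key(False, 8, parent_ino=1)
```

Because the file key starts with the parent directory's ID, the index entries of all files in one directory sort next to each other in the key space.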
And determining an index storage node according to the data type of the metadata to be stored, wherein the determined index storage node can be a target metadata node for storing a directory entry of the metadata to be stored, and can also be other metadata nodes in a distributed file system.
In an embodiment, as shown in FIG. 3, the step of determining an index storage node according to the data type of the metadata to be stored may include: S1031, if the data type of the metadata to be stored is directory data, determining the index storage node based on the node allowance (remaining capacity) of the at least one metadata node; S1032, if the data type of the metadata to be stored is file data, determining the target metadata node as the index storage node.
If the data type of the metadata to be stored is directory data, the index storage node can be determined according to the remaining capacity of each metadata node in the distributed file system, that is, the node allowance. In a specific implementation, one of the metadata nodes with larger remaining capacity can be selected at random as the index storage node, based on the differences in remaining capacity among the metadata nodes.
For example, the MON may receive the capacity information of all metadata nodes through heartbeats. Each metadata node periodically downloads this capacity information to its local storage and, based on the differences in remaining capacity, randomly selects one of the metadata nodes with larger remaining capacity as the index storage node.
If the data type of the metadata to be stored is file data, the index entry of the metadata to be stored can be directly saved in the determined target metadata node.
For file data, the index entry and directory entry of the file data are stored on the same metadata node as the parent directory of the metadata to be stored. And the index entries of the directory may be allocated to other metadata nodes for storage. That is, the index entries of a directory and all files stored under the directory are on the same metadata node, and the index entries of other sub-directories under the directory may be allocated to other metadata nodes for storage, so as to form a horizontal extension in units of directories.
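The placement decision can be sketched as follows. The source says only that a node with larger remaining capacity is chosen at random; picking uniformly among the `top_k` nodes with the most free space is one possible reading and is an assumption here, as are the function and parameter names.

```python
import random
from typing import Dict, Optional


def choose_index_node(
    is_dir: bool,
    target_node: str,
    free_bytes: Dict[str, int],
    top_k: int = 2,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick the metadata node that will store the index entry."""
    if not is_dir:
        # File data: the index entry stays on the target metadata node,
        # next to the parent directory's index entry.
        return target_node
    # Directory data: choose at random among the nodes with the most
    # remaining capacity (top_k is a hypothetical tuning knob).
    rng = rng or random.Random()
    candidates = sorted(free_bytes, key=free_bytes.get, reverse=True)[:top_k]
    return rng.choice(candidates)


capacity = {"MDS1": 900, "MDS2": 100, "MDS3": 800}
file_node = choose_index_node(False, "MDS2", capacity)
dir_node = choose_index_node(True, "MDS2", capacity, rng=random.Random(0))
```

A weighted random choice proportional to free capacity would be an equally plausible interpretation of the text.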
In an embodiment, when creating a new directory, a client sends a message directly to the metadata node where the parent directory is located. That metadata node then selects the metadata node on which the index entry of the new directory will be placed (the MON receives the capacity information of all MDSs through heartbeats, each metadata node periodically downloads the capacity information to its local storage, and one of the metadata nodes with larger remaining capacity is selected at random). It then sends a request to create the index entry to the selected metadata node. Once the creation succeeds and the metadata node where the parent directory is located receives the response, it creates the directory entry of the new directory locally and, when that is complete, returns success to the client.
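The ordering of the two writes in this flow (index entry first on the selected node, directory entry second on the parent's node) can be illustrated with a small in-process sketch. The class `MetadataNode` and the function `create_directory` are hypothetical, and network requests are replaced by direct dictionary updates.

```python
class MetadataNode:
    """Hypothetical metadata node with separate inode and dentry namespaces."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inodes = {}
        self.dentries = {}


def create_directory(parent_node, parent_ino, name, new_ino, index_node) -> bool:
    # Step 1: create the index entry on the selected metadata node.
    index_node.inodes[new_ino] = {"is_dir": True}
    # Step 2: after that succeeds, create the directory entry locally on
    # the node that holds the parent directory, pointing at the new inode.
    parent_node.dentries[(parent_ino, name)] = {"ino": new_ino, "node": index_node.name}
    # Step 3: report success back to the client.
    return True


mds1 = MetadataNode("MDS1")
mds2 = MetadataNode("MDS2")
ok = create_directory(mds1, parent_ino=1, name="home", new_ino=2, index_node=mds2)
```

Writing the index entry before the directory entry means a name never becomes visible while pointing at an index entry that does not yet exist; a real system would also need to handle a failure between the two steps, which this sketch omits.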
In an embodiment, a metadata node may include multiple data instances. When the index storage node includes multiple data instances, the step of saving the index entry in the index storage node may include: performing a hash calculation under a preset rule based on the data type of the metadata to be stored, determining a storage instance for the index entry from among the multiple data instances, and saving the index entry in that storage instance.
Multiple data instances, namely db instances, can be configured to be opened simultaneously on a single metadata node to improve the concurrency of write operations. Then a save location may also need to be selected from a plurality of data instances when saving the index entry for the metadata to be stored. In a specific implementation process, hash selection of different rules can be performed according to the data type of the data to be stored, so as to determine a storage instance, and thus, the index entry of the metadata to be stored is stored in the storage instance.
In an embodiment, the step of performing hash calculation of a preset rule based on the data type of the metadata to be stored may include: if the data type of the metadata to be stored is directory data, performing hash calculation according to the index entries of the metadata to be stored; and if the data type of the metadata to be stored is file data, performing hash calculation according to the index entry of the parent directory of the metadata to be stored.
If the data type of the metadata to be stored is directory data, the hash can be performed according to the index entry of the metadata to be stored, so that which data instance to save on is selected. If the data type of the metadata to be stored is file data, the hash can be performed according to the index entry of the parent directory of the metadata to be stored, so as to select which instance to save on. Therefore, all directory entries in the same parent directory can be ensured to fall on the same instance, and subsequent traversal operation is facilitated.
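A sketch of the instance selection follows. SHA-256 and the modulo reduction are illustrative choices, since the source does not name a hash function; `pick_instance` and its parameters are hypothetical.

```python
import hashlib


def pick_instance(is_dir: bool, ino: int, parent_ino: int, n_instances: int) -> int:
    """Select the data instance (db instance) that stores an index entry."""
    # Directory data hashes on its own index entry; file data hashes on
    # the index entry of its parent directory.
    basis = ino if is_dir else parent_ino
    digest = hashlib.sha256(str(basis).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_instances


# Two files in directory 1 and the directory itself.
a = pick_instance(False, ino=8, parent_ino=1, n_instances=4)
b = pick_instance(False, ino=9, parent_ino=1, n_instances=4)
d = pick_instance(True, ino=1, parent_ino=0, n_instances=4)
```

Under this rule, files in the same directory always hash to the same instance, and that instance is also the one chosen for the directory's own index entry, so a traversal only needs to touch one instance.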
In an embodiment, the distributed file system metadata management method may further include: and acquiring the extended attribute entries of the data to be stored, and storing the extended attribute entries in the index storage node.
For each extended attribute of file data or directory data, an extended attribute entry is added in a data instance and stored in an index storage node, and when the file data or directory data is deleted, the corresponding extended attribute entry needs to be deleted together.
The index entries and the directory entries can be stored separately by different namespaces when being saved. In order to realize the extended attribute feature of the posix file system, a plurality of namespaces may be separately provided for the index entries of the file data and the index entries of the directory data, so as to respectively store the extended attribute entries of the directory data and the extended attribute entries of the file data.
FIG. 4 is a diagram illustrating the separation of namespaces within a data instance. The dir-inodes represents index entries of the directory data, the dentries represents directory entries of the directory data, the file-inodes represents index entries of the file data, the dir-inode-xattrs represents extended attribute entries of the directory data, and the file-inodes-xattrs represents extended attribute entries of the file data.
Before the extended attribute entry of the metadata to be stored is saved, the extended attribute entry may be named according to a naming rule. For example, the naming rule may be: the index entry of the parent directory to which the metadata to be stored belongs, followed by the extended attribute name.
After the metadata has been stored in this partitioned manner, reading file data requires locating the metadata node where the file is located. FIG. 5 is a schematic flowchart of this process.
First, the client obtains the location of the root directory '/' of the file system from the MON, that is, the metadata node holding the index entry of the root directory, namely MDS1. Because the directory entries of the entries under a directory are stored on the same MDS as the index entry of that directory, the client sends a query request to MDS1, finds the directory entry of the parent directory there, and learns that the index entry of the parent directory is located on MDS2. The client then sends a query request to MDS2, finds the directory entry of the subdirectory on MDS2, and learns that the index entry of the subdirectory is located on MDS3. Because the index entries of a directory and of all files stored under it are on the same metadata node, the client can then communicate with MDS3 to obtain the metadata information of the target file under the subdirectory.
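The hop-by-hop lookup can be sketched with a small simulated cluster. The dictionary layout (each node mapping `(parent_ino, name)` to `(ino, node)`), the inode numbers, and the example path are all invented for illustration.

```python
ROOT_INO = 1

# Each node's dentries: (parent index entry, name) -> (index entry, node holding it).
cluster = {
    "MDS1": {(1, "home"): (2, "MDS2")},
    "MDS2": {(2, "alice"): (3, "MDS3")},
    "MDS3": {(3, "notes.txt"): (4, "MDS3")},
}


def resolve(path: str, root_node: str, cluster: dict):
    """Walk a path one component at a time, following dentries across nodes."""
    node, ino = root_node, ROOT_INO   # location of '/' as reported by the MON
    hops = [node]
    for name in (part for part in path.split("/") if part):
        # The dentry lives on the node that holds the parent's index entry
        # and tells us where the child's index entry is stored.
        ino, node = cluster[node][(ino, name)]
        hops.append(node)
    return ino, node, hops


ino, node, hops = resolve("/home/alice/notes.txt", "MDS1", cluster)
```

The last two hops land on the same node because a file's index entry is stored alongside its parent directory's index entry.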
When executing the operation of traversing a directory, such as executing ls in a command line or executing a readdir function in a program, the client finds the MDS where the target directory is located, and sends a traversal request to the MDS, and since directory entries of all files and directories under a directory are on the same data instance of the same MDS as the parent directory, prefix traversal is directly executed on the data instance.
To improve traversal efficiency, a certain number of entries are pre-read during each traversal request and returned to the client. The client hands these entries up to the user as they are consumed and, once they are used up, requests the MDS to continue the traversal. At each pre-read, the MDS records the offset of the last entry (a self-incrementing UUID). The next traversal request carries the offset returned by the previous one, and the MDS uses it to locate the point where the previous traversal ended and continues from there.
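A paged prefix traversal of this kind can be sketched as below. Here the offset is simply the last key returned, which is a simplification of the self-incrementing UUID mentioned in the text; the sorted list stands in for an ordered KV store, and `readdir_page` and `limit` are hypothetical names.

```python
import bisect
from typing import List, Optional, Tuple


def readdir_page(
    sorted_keys: List[str], prefix: str, after: Optional[str] = None, limit: int = 2
) -> Tuple[List[str], Optional[str]]:
    """Return up to `limit` keys with `prefix`, resuming after the given offset."""
    if after is None:
        start = bisect.bisect_left(sorted_keys, prefix)
    else:
        # Continue strictly after the last entry of the previous page.
        start = bisect.bisect_right(sorted_keys, after)
    page = []
    for key in sorted_keys[start:]:
        if len(page) == limit or not key.startswith(prefix):
            break
        page.append(key)
    next_offset = page[-1] if page else None
    return page, next_offset


keys = sorted(["1/a", "1/b", "1/c", "2/x"])
page1, off1 = readdir_page(keys, "1/")
page2, off2 = readdir_page(keys, "1/", after=off1)
page3, off3 = readdir_page(keys, "1/", after=off2)
```

The client keeps requesting pages until an empty page comes back, which signals the end of the directory.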
In an embodiment, the method for managing metadata of a distributed file system may further include the steps of: dividing at least one metadata node into a plurality of redundancy groups, and determining a main node in each redundancy group; and forwarding data to other metadata nodes in the redundancy group based on the main node for data processing.
At least one metadata node can be divided into a plurality of redundancy groups through manual setting or automatic setting, one metadata node is designated as a main node in each redundancy group, and then the rest metadata nodes in the redundancy groups are slave nodes. In a specific implementation process, the redundancy groups do not intersect with each other, that is, each metadata node can only be in one redundancy group, and each redundancy group may include two metadata nodes, one metadata node being a master node and one metadata node being a slave node.
And then forwarding data to other metadata nodes in the redundancy group for data processing based on the main node. The data processing includes receiving and updating data, and the like.
In a specific implementation, the client can periodically download the node state information from the MON. The client communicates with the master node; when an update message reaches the master node, the master node forwards it to the slave node, and after the slave node has finished processing the update message, the master node processes it locally. When the master node's heartbeat with the MON is lost, the MON marks the master node as disconnected and replaces the master node of the redundancy group; the client then synchronizes the state and switches to the new peer.
When a metadata node in a redundancy group is disconnected, the MON marks the data safety of the redundancy group as degraded. When the disconnected metadata node comes back online, the current online master node is responsible for synchronizing data to the newly online slave node. When all MDSs in a redundancy group are offline, the MON records the MDS that was online last and prohibits the group from providing service until that MDS is back online, which avoids losing authoritative data when every member of the redundancy group has gone offline.
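The state the MON keeps per redundancy group can be sketched as a small state machine. The class and attribute names are hypothetical, and promoting the alphabetically first online member is an arbitrary illustrative policy; the source does not say how the replacement master is chosen.

```python
class RedundancyGroup:
    """Hypothetical MON-side view of one redundancy group."""

    def __init__(self, master: str, slaves: list) -> None:
        self.master = master
        self.online = {master, *slaves}
        self.last_online = None   # last member seen online before a full outage
        self.serving = True

    def mark_offline(self, node: str) -> None:
        self.online.discard(node)
        if not self.online:
            # Whole group is down: remember who held the newest data
            # and stop serving until that member returns.
            self.last_online = node
            self.master = None
            self.serving = False
        elif node == self.master:
            # Master lost its heartbeat: promote another online member.
            self.master = sorted(self.online)[0]

    def mark_online(self, node: str) -> None:
        self.online.add(node)
        if not self.serving and node == self.last_online:
            # The authoritative member is back: resume service with it as master.
            self.master = node
            self.serving = True


group = RedundancyGroup("MDS1", ["MDS2"])
group.mark_offline("MDS1")          # failover to MDS2
master_after_failover = group.master
group.mark_offline("MDS2")          # full outage, MDS2 was last online
group.mark_online("MDS1")           # stale member returns first
serving_with_stale_member = group.serving
group.mark_online("MDS2")           # authoritative member returns
```

Refusing service while only the stale member is online is what protects the updates that only the last online member had received.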
In an embodiment, the step of performing data processing based on the master node forwarding data to other metadata nodes in the redundancy group may include: acquiring an updating cache operation and a snapshot file; sending the snapshot file to other metadata nodes in the redundancy group based on the main node, so that the other metadata nodes in the redundancy group load the snapshot file; and sending the updating cache operation to other metadata nodes in the redundancy group based on the main node, so that the updating cache operation is synchronized to other metadata nodes in the redundancy group.
When data synchronization starts, an update operation buffer queue is created, and all update operations initiated by any service after synchronization starts are cached in this queue. In addition, the data instance generates a snapshot of the current moment. Once the snapshot is generated, the snapshot file is sent directly to the slave node, and the slave node is responsible for loading the snapshot locally. After the slave node finishes loading the snapshot, the master node suspends service and synchronizes the update operations in the buffer queue to the slave node. When this synchronization is finished, the queue is deleted and service resumes, completing the full data synchronization of the slave node.
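The snapshot-plus-queue procedure can be sketched in memory as follows. The `Master` class, its method names, and the use of plain dicts for the data instance and the slave are assumptions; the service pause is indicated only by a comment.

```python
import copy
from collections import deque


class Master:
    """Hypothetical master node holding one data instance as a dict."""

    def __init__(self, data: dict) -> None:
        self.data = dict(data)
        self.queue = None   # update buffer queue, present only during a sync

    def apply(self, key, value) -> None:
        self.data[key] = value
        if self.queue is not None:
            # Updates arriving after the sync started are also buffered.
            self.queue.append((key, value))

    def begin_sync(self) -> dict:
        self.queue = deque()
        return copy.deepcopy(self.data)   # snapshot of the current moment

    def finish_sync(self, slave: dict) -> None:
        # A real master would suspend service here while the queue drains.
        while self.queue:
            key, value = self.queue.popleft()
            slave[key] = value
        self.queue = None   # delete the queue, then resume service


master = Master({"a": 1})
snapshot = master.begin_sync()
slave = dict(snapshot)        # slave loads the snapshot locally
master.apply("b", 2)          # update that arrives while the slave is loading
master.finish_sync(slave)
```

Buffering starts before the snapshot is taken, so every update lands in the snapshot, the queue, or both; replaying a queued update that is already in the snapshot is harmless here because the updates are plain overwrites.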
In the method for managing metadata of a distributed file system provided by the above embodiment, metadata to be stored and a data type of the metadata to be stored are obtained, and an index entry and a directory entry of the metadata to be stored are determined; then, determining a target metadata node where a parent index entry of a parent directory of metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node; and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node. The directory entries of the metadata to be stored are stored in the same metadata node, and the index entries of the data to be stored can be distributed to other metadata nodes for storage, so that the horizontal extension by taking the directory as a unit is formed, and the traversal operation is further facilitated.
Referring to fig. 6, fig. 6 is a schematic block diagram of a distributed file system metadata management apparatus according to an embodiment of the present application, where the distributed file system metadata management apparatus is configured to execute the foregoing distributed file system metadata management method. The distributed file system metadata management apparatus may be configured in a server or a terminal.
The server may be an independent server or a server cluster. The terminal can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant and a wearable device.
As shown in fig. 6, the distributed file system metadata management apparatus 200 includes: an entry determination module 201, a directory save module 202, and an index save module 203.
The entry determining module 201 is configured to obtain metadata to be stored and a data type of the metadata to be stored, and determine an index entry and a directory entry of the metadata to be stored.
The directory saving module 202 is configured to determine, from at least one metadata node, a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located, and save the directory entry in the target metadata node.
An index saving module 203, configured to determine an index saving node in at least one metadata node according to the data type of the metadata to be stored, and save the index entry in the index saving node.
In one embodiment, the index saving module includes a first determining submodule and a second determining submodule.
The first determining submodule is used for determining an index storage node based on the node allowance of at least one metadata node if the data type of the metadata to be stored is directory data; and the second determining submodule is used for determining the target metadata node as an index storage node if the data type of the metadata to be stored is file data.
In one embodiment, the metadata node includes a plurality of data instances, and the index saving module includes an instance determination submodule.
The instance determination submodule is used for performing hash calculation of a preset rule based on the data type of the metadata to be stored, determining a storage instance of the index entry from a plurality of data instances, and storing the index entry based on the storage instance.
In an embodiment, the instance determination submodule includes a first determination submodule and a second determination submodule.
The first determining submodule is used for carrying out hash calculation according to the index entry of the metadata to be stored if the data type of the metadata to be stored is directory data; and the second determining submodule is used for performing hash calculation according to the index entry of the parent directory of the metadata to be stored if the data type of the metadata to be stored is file data.
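The instance selection rule above can be illustrated with a short Python sketch. The text only specifies which value is hashed for each data type; the choice of CRC32 as the hash function, the modulo mapping, and the function name `pick_instance` are assumptions made for illustration.

```python
import zlib


def pick_instance(instance_count, inode_id, parent_inode_id, is_directory):
    """Pick the data instance that stores an index entry.

    Directory data hashes its own index entry identifier; file data hashes
    the index entry identifier of its parent directory.
    """
    key = inode_id if is_directory else parent_inode_id
    digest = zlib.crc32(str(key).encode("utf-8"))  # assumed hash function
    return digest % instance_count


# Two files in directory 7 and the directory itself map to the same instance.
a = pick_instance(8, inode_id=101, parent_inode_id=7, is_directory=False)
b = pick_instance(8, inode_id=102, parent_inode_id=7, is_directory=False)
d = pick_instance(8, inode_id=7, parent_inode_id=1, is_directory=True)
```

Under this rule, the index entries of all files in a directory land on the same data instance as the index entry of that directory, regardless of which hash function is chosen.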
In one embodiment, the distributed file system metadata management apparatus further includes a redundancy grouping module and a data processing module.
The redundancy grouping module is used for dividing at least one metadata node into a plurality of redundancy groups and determining a master node in each redundancy group; and the data processing module is used for forwarding data from the master node to the other metadata nodes in the redundancy group for data processing.
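As a rough illustration of redundancy grouping, the Python sketch below splits a list of metadata nodes into fixed-size groups and forwards data from each group's master to its other members. How the groups are formed and how the master is chosen are not specified in this passage; the fixed group size, the "first member is the master" rule, and the names `build_redundancy_groups` and `forward` are assumptions.

```python
def build_redundancy_groups(node_names, group_size):
    """Split nodes into consecutive groups; the first member acts as master (assumed)."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = []
    for start in range(0, len(node_names), group_size):
        members = node_names[start:start + group_size]
        groups.append({"master": members[0], "slaves": members[1:]})
    return groups


def forward(group, payload, deliver):
    """Forward a payload from the group's master to every other node in the group."""
    for slave in group["slaves"]:
        deliver(group["master"], slave, payload)


groups = build_redundancy_groups(["mds-0", "mds-1", "mds-2", "mds-3"], 2)
log = []
forward(groups[0], "update-op", lambda master, slave, payload: log.append((master, slave, payload)))
```

The `deliver` callback stands in for whatever transport a real system would use; here it only records each forwarded message.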
In one embodiment, the data processing module comprises a file acquisition submodule, a file forwarding submodule and an operation forwarding submodule.
The file acquisition submodule is used for acquiring cached update operations and a snapshot file; the file forwarding submodule is used for sending, by the master node, the snapshot file to the other metadata nodes in the redundancy group, so that the other metadata nodes in the redundancy group load the snapshot file; and the operation forwarding submodule is used for sending, by the master node, the cached update operations to the other metadata nodes in the redundancy group, so that the cached update operations are synchronized to the other metadata nodes in the redundancy group.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the above-described distributed file system metadata management apparatus and each module may refer to the corresponding processes in the foregoing distributed file system metadata management method embodiment, and are not described herein again.
The above-described distributed file system metadata management apparatus may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 7.
Referring to fig. 7, fig. 7 is a schematic block diagram of a computer device according to an embodiment of the present disclosure. The computer device may be a server or a terminal.
Referring to fig. 7, the computer device includes a processor, a memory, and a network interface connected through a system bus, wherein the memory may include a nonvolatile storage medium and an internal memory.
The non-volatile storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any one of the distributed file system metadata management methods.
The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.
The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when executed by the processor, the computer program causes the processor to perform any one of the distributed file system metadata management methods.
The network interface is used for network communication, such as sending assigned tasks and the like. Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
It should be understood that the processor may be a Central Processing Unit (CPU), or may be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor.
In one embodiment, the distributed file system includes at least one metadata node, and the processor is configured to execute a computer program stored in the memory to implement the following steps:
acquiring metadata to be stored and a data type of the metadata to be stored, and determining an index entry and a directory entry of the metadata to be stored;
determining a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node;
and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node.
In one embodiment, when the determining of the index saving node in the at least one metadata node according to the data type of the metadata to be stored is implemented, the processor is configured to implement:
if the data type of the metadata to be stored is directory data, determining an index storage node based on the node allowance of at least one metadata node;
and if the data type of the metadata to be stored is file data, determining the target metadata node as an index storage node.
In one embodiment, the index holding node comprises a plurality of data instances, and the processor, when implementing the holding of the index entry in the index holding node, is configured to implement:
and performing hash calculation of a preset rule based on the data type of the metadata to be stored, determining a storage instance of the index entry from a plurality of data instances, and storing the index entry based on the storage instance.
In one embodiment, when the processor performs the hash calculation of the preset rule based on the data type of the metadata to be stored, the processor is configured to perform:
if the data type of the metadata to be stored is directory data, performing hash calculation according to the index entries of the metadata to be stored;
and if the data type of the metadata to be stored is file data, performing hash calculation according to the index entry of the parent directory of the metadata to be stored.
In one embodiment, the processor is further configured to implement:
and acquiring the extended attribute entries of the data to be stored, and storing the extended attribute entries in the index storage node.
In one embodiment, the processor is configured to implement:
dividing at least one metadata node into a plurality of redundancy groups, and determining a master node in each redundancy group;
and forwarding data from the master node to the other metadata nodes in the redundancy group for data processing.
In one embodiment, the processor, when implementing the data processing based on the master node forwarding data to other metadata nodes in the redundancy group, is configured to implement:
acquiring cached update operations and a snapshot file;
sending, by the master node, the snapshot file to the other metadata nodes in the redundancy group, so that the other metadata nodes in the redundancy group load the snapshot file;
and sending, by the master node, the cached update operations to the other metadata nodes in the redundancy group, so that the cached update operations are synchronized to the other metadata nodes in the redundancy group.
An embodiment of the present application further provides a computer-readable storage medium in which a computer program is stored. The computer program includes program instructions which, when executed by a processor, implement any one of the distributed file system metadata management methods provided in the embodiments of the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for managing metadata of a distributed file system, the distributed file system including at least one metadata node, the method comprising:
acquiring metadata to be stored and a data type of the metadata to be stored, and determining an index entry and a directory entry of the metadata to be stored;
determining a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located from at least one metadata node, and storing the directory entry in the target metadata node;
and determining an index storage node in at least one metadata node according to the data type of the metadata to be stored, and storing the index entry in the index storage node.
2. The method for managing metadata in a distributed file system according to claim 1, wherein the determining an index saving node in at least one metadata node according to the data type of the metadata to be stored comprises:
if the data type of the metadata to be stored is directory data, determining an index storage node based on the node allowance of at least one metadata node;
and if the data type of the metadata to be stored is file data, determining the target metadata node as an index storage node.
3. The distributed file system metadata management method of claim 2 wherein the index-holding node includes a plurality of data instances, the holding the index entry at the index-holding node comprising:
and performing hash calculation of a preset rule based on the data type of the metadata to be stored, determining a storage instance of the index entry from a plurality of data instances, and storing the index entry based on the storage instance.
4. The method for managing metadata in a distributed file system according to claim 3, wherein the performing hash calculation of the preset rule based on the data type of the metadata to be stored includes:
if the data type of the metadata to be stored is directory data, performing hash calculation according to the index entries of the metadata to be stored;
and if the data type of the metadata to be stored is file data, performing hash calculation according to the index entry of the parent directory of the metadata to be stored.
5. The distributed file system metadata management method of claim 1, the method further comprising:
and acquiring the extended attribute entries of the data to be stored, and storing the extended attribute entries in the index storage node.
6. The distributed file system metadata management method of claim 1, the method further comprising:
dividing at least one metadata node into a plurality of redundancy groups, and determining a master node in each redundancy group;
and forwarding data from the master node to the other metadata nodes in the redundancy group for data processing.
7. The method of claim 6, wherein the data processing based on the master node forwarding data to other metadata nodes in the redundancy group comprises:
acquiring cached update operations and a snapshot file;
sending, by the master node, the snapshot file to the other metadata nodes in the redundancy group, so that the other metadata nodes in the redundancy group load the snapshot file;
and sending, by the master node, the cached update operations to the other metadata nodes in the redundancy group, so that the cached update operations are synchronized to the other metadata nodes in the redundancy group.
8. A distributed file system metadata management apparatus, wherein the distributed file system includes at least one metadata node, the apparatus comprising:
an entry determining module, configured to acquire metadata to be stored and a data type of the metadata to be stored, and to determine an index entry and a directory entry of the metadata to be stored;
a directory saving module, configured to determine, from the at least one metadata node, a target metadata node where a parent index entry of a parent directory of the metadata to be stored is located, and to save the directory entry in the target metadata node;
and an index saving module, configured to determine an index storage node in the at least one metadata node according to the data type of the metadata to be stored, and to save the index entry in the index storage node.
9. A computer device, wherein the computer device comprises a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program and, when executing the computer program, to implement the distributed file system metadata management method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to implement the distributed file system metadata management method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210224649.6A CN114610680A (en) 2022-03-07 2022-03-07 Method, device and equipment for managing metadata of distributed file system and storage medium

Publications (1)

Publication Number Publication Date
CN114610680A true CN114610680A (en) 2022-06-10

Family

ID=81860608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210224649.6A Pending CN114610680A (en) 2022-03-07 2022-03-07 Method, device and equipment for managing metadata of distributed file system and storage medium

Country Status (1)

Country Link
CN (1) CN114610680A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115827752A (en) * 2022-11-22 2023-03-21 中国机械总院集团江苏分院有限公司 Data processing method and device and computer equipment
CN116820354A (en) * 2023-08-29 2023-09-29 京东科技信息技术有限公司 Data storage method, data storage device and data storage system
CN116820354B (en) * 2023-08-29 2024-01-12 京东科技信息技术有限公司 Data storage method, data storage device and data storage system

Similar Documents

Publication Publication Date Title
US11809726B2 (en) Distributed storage method and device
US20170249246A1 (en) Deduplication and garbage collection across logical databases
CN109739815B (en) File processing method, system, device, equipment and storage medium
JP5722962B2 (en) Optimize storage performance
US9547706B2 (en) Using colocation hints to facilitate accessing a distributed data storage system
US20130218934A1 (en) Method for directory entries split and merge in distributed file system
CN112236758A (en) Cloud storage distributed file system
CN109684282B (en) Method and device for constructing metadata cache
US11151081B1 (en) Data tiering service with cold tier indexing
US11468053B2 (en) Servicing queries of a hybrid event index
CN114610680A (en) Method, device and equipment for managing metadata of distributed file system and storage medium
CN111400334B (en) Data processing method, data processing device, storage medium and electronic device
CN111158851B (en) Rapid deployment method of virtual machine
CN108540510B (en) Cloud host creation method and device and cloud service system
CN103501319A (en) Low-delay distributed storage system for small files
US11960442B2 (en) Storing a point in time coherently for a distributed storage system
JP7038864B2 (en) Search server centralized storage
CN111917834A (en) Data synchronization method and device, storage medium and computer equipment
CN111651424B (en) Data processing method, device, data node and storage medium
US11625192B2 (en) Peer storage compute sharing using memory buffer
CN107408239B (en) Architecture for managing mass data in communication application through multiple mailboxes
US10387384B1 (en) Method and system for semantic metadata compression in a two-tier storage system using copy-on-write
CN110347656B (en) Method and device for managing requests in file storage system
US10628391B1 (en) Method and system for reducing metadata overhead in a two-tier storage architecture
US20220365905A1 (en) Metadata processing method and apparatus, and a computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination