CN112579543A - Dynamic metadata management method for distributed file system and distributed file system

Info

Publication number: CN112579543A
Application number: CN202011586836.6A
Authority: CN
Inventors: 马俊杰; 苏帅; 苏玉娇; 瞿秋薏; 姜瀚; 付慧慧; 付长杰; 刘曦冉; 黄亚杰; 晋晨; 丛峰日
Original assignee: Aerospace Science And Technology Network Information Development Co ltd
Current assignee: Aerospace Science And Technology Network Information Development Co ltd
Priority date: 2020-12-29
Filing date: 2020-12-29
Publication date: 2021-03-30

Abstract

The invention relates to a dynamic metadata management method for a distributed file system and to the distributed file system, and belongs to the field of distributed computing. In the invention, a plurality of servers are selected from a metadata server cluster as front-end servers to form a front-end metadata server cluster, and the remaining metadata servers in the metadata server cluster form a non-front-end metadata server cluster; all metadata read-write requests initiated by clients are processed uniformly by the front-end metadata server cluster, and the metadata in write requests is stored only in the memory of the front-end metadata server cluster; the front-end servers are started; and the metadata read-write requests of the clients are processed. The invention can provide high-speed access to metadata, reduce the load on the distributed system, and achieve better load balancing.

Description

Dynamic metadata management method for distributed file system and distributed file system

Technical Field

The invention belongs to the technical field of distributed computers, and particularly relates to a metadata dynamic management method of a distributed file system and the distributed file system.

Background

With the continuous development of internet information technology, storage systems are becoming more and more important as the data volume increases. The distributed file system conforms to the trend of exponential increase of information by the characteristics of high fault tolerance, high concurrency and high expandability, and is valued by storage manufacturers and practitioners.

Two types of data are mainly managed in the distributed file system, one is data of a user, and the other is referred to as metadata, that is, data for managing and indexing user data. The access characteristics of the user's data are more storage intensive, while the access characteristics of the metadata are more compute intensive. Generally, therefore, a distributed file system manages and stores these two types of data independently, wherein a component storing user data is called a data server, and a component storing metadata is called a metadata server.

In order to enable the whole distributed file system to have stronger fault-tolerant capability and higher parallel access capability, the distributed file system respectively uses a plurality of nodes to construct a data server cluster and a metadata server cluster. Due to frequent access to metadata, the problem of uneven dynamic load often occurs, so that the response speed of the system becomes slow and even the system becomes unstable.

In order to solve the above problems, the present invention provides a distributed file system, which can provide high-speed access to metadata and can achieve better load balancing.

Disclosure of Invention

(I) Technical problem to be solved

The technical problem to be solved by the present invention is how to provide a dynamic metadata management method for a distributed file system and a distributed file system, so as to solve the problems of slow response speed, even system instability and the like of the existing distributed file system.

(II) Technical solution

In order to solve the above technical problem, the present invention provides a dynamic metadata management method for a distributed file system, which includes the following steps:

S1, selecting a plurality of servers in a metadata server cluster as front-end servers to form a front-end metadata server cluster, the remaining metadata servers in the metadata server cluster forming a non-front-end metadata server cluster; all metadata read-write requests initiated by clients are processed uniformly by the front-end metadata server cluster, and the metadata in write requests is stored only in the memory of the front-end metadata server cluster;

S2, starting the front-end servers;

and S3, processing the metadata read-write requests of the clients.

Further, step S2 includes:

S201, preprocessing: setting a configuration file for each front-end server;

S202, initialization of the front-end servers: according to the configuration file, each front-end server communicates with the other front-end servers so that they automatically form the front-end metadata server cluster;

the configuration file comprises two types of communication addresses and ports, one type used for communication between the front-end servers and the other used for communication with user-side clients.

Further, the method further comprises the step S203: the front-end server elects a main front-end server.

Further, the step S3 of processing the metadata read-write request of the client includes:

S301, the front-end server processes a metadata write request: a client connects to any one of the front-end servers and initiates a metadata write request, and after the front-end metadata server cluster receives the request, it is processed as follows:

if the client is connected to the main front-end server, the main front-end server performs the write operation;

and if the client is not connected to the main front-end server, the front-end server that received the request automatically forwards it to the main front-end server for writing.

Further, after receiving the write request, the main front-end server writes the metadata in the write request into the memory of the server, and then writes the metadata into the memories of other front-end servers until the number of the front-end servers which are successfully written in is greater than half of the total number of the front-end servers; the main front-end server returns a result of successful writing to the client; after the write request is completed, the front metadata server cluster records a log, and the content of the log comprises the directory, the file path, the modification content and the modification time of the metadata.

S302, the front-end server processes a metadata read request: a client connects to any one of the front-end servers and sends a metadata read request, and the front-end server that receives the request processes it as follows: through communication within the front-end metadata server cluster, the receiving front-end server determines whether all of the metadata to be read is stored in the front-end metadata server cluster; if all of it is stored there, the receiving front-end server reads the metadata directly from the front-end metadata server cluster and returns it to the client; if none of it is stored there, the receiving front-end server sends the read request to the non-front-end metadata server cluster, which retrieves the required metadata from hard disk and returns it to the receiving front-end server, which in turn returns it to the client; if only part of it is stored there, the receiving front-end server sends read requests to both the front-end metadata server cluster and the non-front-end metadata server cluster, the front-end servers return the part they hold, and the non-front-end metadata server cluster retrieves the remaining part from hard disk and returns it; the receiving front-end server then aggregates the two parts and returns the result to the client.

Further, the method further comprises: S4, the front-end metadata server cluster synchronizes the latest data to the non-front-end metadata server cluster: at a preset time every calendar day, the main front-end server analyzes the log it has recorded and retains only the latest metadata log entry for each directory and file, thereby compressing the log;

after the analysis is finished, the main front-end server starts a new synchronization thread, initiates write requests to the non-front-end metadata server cluster in sequence according to the compressed log, and synchronizes the latest metadata to the hard disk of the non-front-end metadata server cluster;

each time the main front-end server completes the synchronization of one item of metadata, that item is deleted from the memory of every server in the front-end metadata server cluster, and this continues until all the metadata has been deleted.

Further, when the front-end metadata server cluster processes read-write requests, it records the access load of each directory and each file; the access load is counted as follows: one read request counts as 1, one modification request counts as 2, one creation request counts as 3, and one deletion request counts as 2; the load factor of each file is equal to: number of read requests × 1 + number of write requests × 2 + number of creation requests × 3 + number of deletion requests × 2; the load factor of each directory is equal to: number of read requests on the directory × 1 + number of write requests on the directory × 2 + number of creation requests on the directory × 3 + number of deletion requests on the directory × 2, plus the sum of the load factors of all directories and files under the directory.

Further, the method includes step S5, repartitioning the namespace of the metadata server cluster: the main front-end server calculates the load factor of each directory and each file, and averages the load factors with the last calculated load factor stored in the main front-end server; and identifying the directories and files with the load factors exceeding a preset load threshold according to the average number of the load factors, and re-dividing the metadata of the directories and files with the load factors exceeding the load threshold by the front-end server.

The invention also provides a distributed file system comprising a metadata server cluster, wherein the metadata server cluster consists of a plurality of metadata servers and comprises a front-end metadata server cluster and a non-front-end metadata server cluster; all metadata read-write requests initiated by clients are processed uniformly by the front-end metadata server cluster, and the metadata in write requests is stored only in the memory of the front-end metadata server cluster; the system executes the dynamic metadata management method as claimed in any one of claims 1 to 9.

(III) Advantageous effects

The invention provides a dynamic metadata management method for a distributed file system and the distributed file system. A front-end metadata server cluster is constructed within the metadata server cluster; all metadata read-write requests initiated by clients are processed uniformly by the front-end metadata server cluster; the metadata in write requests is stored only in the memory of the front-end metadata server cluster; the front-end metadata server cluster periodically synchronizes the latest data to the non-front-end metadata server cluster; and the namespace of the metadata server cluster is repartitioned according to set rules. As a result, high-speed access to metadata can be provided, the load on the distributed file system is reduced, and better load balancing is achieved.

Drawings

FIG. 1 is a schematic diagram of the overall architecture of the distributed file system of the present invention.

Detailed Description

In order to make the objects, contents and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.

In order to realize the purpose of the invention, the following technical scheme is adopted for realizing the purpose:

a dynamic management method for metadata of a distributed file system comprises the following steps:

s2, starting a front server;

and S3, processing the metadata read-write request of the client.

The dynamic management method of the metadata comprises the following steps: step S2 includes:

s201, preprocessing: setting configuration files for front-end server

S202, initialization of the front server:

and according to the configuration file, each preposed server and other preposed servers are mutually communicated to automatically form a preposed metadata server cluster.

The dynamic management method of the metadata comprises the following steps: the configuration file comprises two types of communication addresses and ports, wherein one type is used for communication between the front-end servers, and the other type is used for communication with clients on the user side.

The dynamic management method of the metadata comprises the following steps: further comprising step S203: the front-end server elects a main front-end server.

The dynamic management method of the metadata comprises the following steps: the step S3 of processing the metadata read-write request of the client includes:

The dynamic management method of the metadata comprises the following steps: after receiving the write request, the main front-end server firstly writes the metadata in the write request into the memory of the server, and then writes the metadata into the memories of other front-end servers until the number of the front-end servers which are successfully written is more than half of the total number of the front-end servers; and the main front server returns a result of successful writing to the client.

The dynamic management method of the metadata comprises the following steps: after the write request is completed, the front metadata server cluster records a log, and the content of the log comprises the directory, the file path, the modification content and the modification time of the metadata.

S302, the front-end server processes a metadata read request: a client connects to any one of the front-end servers and sends a metadata read request, and the front-end server that receives the request processes it as follows: through communication within the front-end metadata server cluster, the receiving front-end server determines whether all of the metadata to be read is stored in the front-end metadata server cluster, and if all of it is stored there, the receiving front-end server reads the metadata directly from the front-end metadata server cluster and returns it to the client.

The dynamic management method of the metadata comprises the following steps: if none of the metadata to be read is stored in the front-end metadata server cluster, the receiving front-end server sends the read request to the non-front-end metadata server cluster, which retrieves the required metadata from hard disk and returns it to the receiving front-end server, which in turn returns it to the client.

The dynamic management method of the metadata comprises the following steps: if only part of the metadata to be read is stored in the front-end metadata server cluster, the receiving front-end server sends read requests to both the front-end metadata server cluster and the non-front-end metadata server cluster; the front-end servers return the part they hold, and the non-front-end metadata server cluster retrieves the remaining part from hard disk and returns it; the receiving front-end server then aggregates the two parts and returns the result to the client;

the dynamic metadata management method further includes: S4, synchronizing the latest data from the front-end metadata server cluster to the non-front-end metadata server cluster: at a preset time every calendar day, the main front-end server analyzes the log it has recorded and retains only the latest metadata log entry for each directory and file, thereby compressing the log;

after the analysis is finished, the main front-end server starts a new synchronization thread, initiates write requests to the non-front-end metadata server cluster in sequence according to the compressed log, and synchronizes the latest metadata to the hard disk of the non-front-end metadata server cluster.

The dynamic management method of the metadata comprises the following steps: each time the main front-end server completes the synchronization of one item of metadata, that item is deleted from the memory of every server in the front-end metadata server cluster, and this continues until all the metadata has been deleted.

The dynamic management method of the metadata comprises the following steps: when the preposed metadata server cluster processes the read-write request, the access load of each directory and each file is recorded.

The dynamic management method of the metadata comprises the following steps: the access load is as follows: one read request counts as 1, one modify request counts as 2, one create request counts as 3, and one delete request counts as 2.

The dynamic management method of the metadata comprises the following steps: the load factor of each file is equal to: number of read requests × 1 + number of write requests × 2 + number of creation requests × 3 + number of deletion requests × 2; the load factor of each directory is equal to: number of read requests on the directory × 1 + number of write requests on the directory × 2 + number of creation requests on the directory × 3 + number of deletion requests on the directory × 2, plus the sum of the load factors of all directories and files under the directory.

The dynamic management method of the metadata further comprises the following steps of S5, repartitioning the name space of the metadata server cluster: the main front-end server calculates the load factor of each directory and each file, and averages the load factors with the last calculated load factor stored in the main front-end server; and identifying the directories and files with the load factors exceeding a preset load threshold according to the average number of the load factors, and re-dividing the metadata of the directories and files with the load factors exceeding the load threshold by the front-end server.

The metadata dynamic management method comprises the following division modes:

1) if a plurality of subdirectories exist under a certain directory and the load factors of the subdirectories are approximate, splitting and storing the subdirectories in different metadata server cluster servers;

2) if a plurality of subdirectories exist under a certain directory and the load factors of the subdirectories are greatly different, executing the step 1 on the subdirectories with high load factors;

3) if only one subdirectory exists under a certain directory, splitting and storing each subdirectory in different metadata server cluster servers;

4) if a plurality of files exist in a directory and the load factors of the files are similar, splitting and storing the files in different metadata server cluster servers;

5) if there is only one file under a directory, the main front-end server stores the metadata of the file in the memory of the front-end metadata server cluster, and sets the maximum survival time of the metadata.

A distributed file system comprising a cluster of metadata servers, wherein: the metadata server cluster consists of a plurality of metadata servers, and comprises a preposed metadata server cluster and a non-preposed metadata server cluster; all metadata read-write requests initiated by the client are uniformly processed by the preposed metadata server cluster, and metadata in the write requests are only stored in a memory of the preposed metadata server cluster.

The distributed file system, wherein: the system performs a method for dynamic management of metadata as described in one of the above.

The distributed file system, wherein: the distributed file system also includes a cluster of data servers.

As shown in fig. 1, the distributed file system of the present invention includes a metadata server cluster, where the metadata server cluster is composed of a plurality of metadata servers, and the metadata dynamic management method includes the following steps:

S1, selecting N servers (where N is an odd number greater than 3) in the metadata server cluster as front-end servers to form a front-end metadata server cluster, the front-end servers storing metadata in memory; all metadata read-write requests initiated by clients are processed uniformly by the front-end metadata server cluster, and the metadata in write requests is stored only in the memory of the front-end metadata server cluster; the front-end servers execute the metadata read and write requests initiated by clients. The selection rule is to choose the N servers with the largest memory capacity, so as to improve the response speed to read-write requests. The servers in the metadata server cluster other than the front-end servers are called non-front-end servers, and the cluster they form is called the non-front-end metadata server cluster.
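The selection rule in S1 can be illustrated with a minimal Python sketch. The `MetadataServer` type, its field names, and the `select_front_end` function are hypothetical names introduced here for illustration only; the source specifies just the rule (N odd, greater than 3, largest memory first).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataServer:
    name: str        # hypothetical identifier, not from the source
    memory_gb: int   # installed memory capacity


def select_front_end(servers, n):
    """Split servers into (front_end, non_front_end) by memory capacity."""
    # The embodiment asks for an odd N greater than 3.
    if n <= 3 or n % 2 == 0:
        raise ValueError("N must be an odd number greater than 3")
    if n > len(servers):
        raise ValueError("N exceeds the size of the metadata server cluster")
    # Rank by memory, largest first; the top N become front-end servers.
    ranked = sorted(servers, key=lambda s: s.memory_gb, reverse=True)
    return ranked[:n], ranked[n:]
```

An odd N fits the majority rule used later for writes, since a cluster with an odd number of members always has an unambiguous majority.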

S2, starting a front server:

S201, preprocessing: setting a configuration file for each front-end server. The configuration file comprises two types of communication addresses and ports: one type is used for heartbeat detection, health checks and data synchronization between the front-end servers (hereinafter referred to as "peer communication"), and the other type is used for connecting to user-side clients and processing read-write requests (hereinafter referred to as "client communication");
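As a rough illustration of such a configuration file, the following Python sketch models the two endpoint types. The source does not define a file format, so the class names, field names and validation rules below are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass(frozen=True)
class FrontEndConfig:
    node_id: str        # hypothetical field
    peer: Endpoint      # peer communication: heartbeat, health check, data sync
    client: Endpoint    # client communication: read-write requests
    peers: tuple = ()   # peer endpoints of the other front-end servers

    def validate(self):
        # Assumption: the two roles must not share one host/port pair.
        if self.peer == self.client:
            raise ValueError("peer and client endpoints must differ")
        for ep in (self.peer, self.client, *self.peers):
            if not 0 < ep.port < 65536:
                raise ValueError(f"invalid port: {ep.port}")
        return self
```

Keeping the two endpoint types separate means peer traffic and client traffic can be bound to different ports or network interfaces.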

S202, initialization of the front-end servers:

according to the configuration file, each front-end server communicates with the other front-end servers, and the front-end metadata server cluster is formed automatically through a distributed consensus protocol (the Raft protocol);

S203, the front-end servers elect a main front-end server using the Raft protocol;

S3, the front-end servers process metadata read-write requests:

S301, the front-end server processes a metadata write request:

a client connects to the client communication address and port of any one of the front-end servers and initiates a metadata write request, and after the front-end metadata server cluster receives the request, it is processed as follows:

if the client is not connected to the main front-end server, the front-end server that received the request automatically forwards it to the main front-end server, which performs the actual write operation;

after receiving a write request, the main front-end server first writes the metadata in the request into its own memory and then, through peer communication, writes the metadata into the memory of the other front-end servers until the number of front-end servers written successfully exceeds half of the total number of front-end servers (that is, the main front-end server follows the majority principle of the Raft protocol); in this way multiple replicas of the data are kept, which strengthens the disaster tolerance of the distributed system and prevents data loss when a single server fails;

the main front-end server then returns a write-success result to the client, achieving distributed consistency;

after the front-end metadata server cluster completes the write request, the main front-end server records a log entry detailing the directory, file path, modification content, modification time and so on of the metadata (this log is called the "journal log");
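The write flow above (forward to the main server, replicate until a majority acknowledges, then journal) can be sketched as follows. This is a simplified single-process illustration, not a Raft implementation: the class names, the `healthy` flag used to simulate a failed node, and the journal field names are assumptions.

```python
import posixpath
import time


class FrontEndServer:
    def __init__(self, name, is_main=False, healthy=True):
        self.name = name
        self.is_main = is_main
        self.healthy = healthy   # hypothetical flag simulating a failed node
        self.memory = {}         # in-memory metadata store: path -> metadata
        self.journal = []        # journal log, kept by the main server

    def store(self, path, metadata):
        """Write into local memory; returns False if this node is down."""
        if not self.healthy:
            return False
        self.memory[path] = metadata
        return True


class FrontEndCluster:
    def __init__(self, servers):
        self.servers = servers
        self.main = next(s for s in servers if s.is_main)

    def write(self, contact, path, metadata):
        """Handle a write request that arrived at front-end server `contact`."""
        # A non-main server forwards the request to the main server.
        main = contact if contact.is_main else self.main
        majority = len(self.servers) // 2
        acks = 1 if main.store(path, metadata) else 0
        for peer in self.servers:
            if acks > majority:
                break            # more than half have the data: stop here
            if peer is not main and peer.store(path, metadata):
                acks += 1
        if acks <= majority:
            return False         # no majority, so the write is not acknowledged
        main.journal.append({
            "directory": posixpath.dirname(path),
            "path": path,
            "content": metadata,
            "mtime": time.time(),
        })
        return True
```

In a real Raft deployment the remaining followers would catch up asynchronously and uncommitted entries would not be visible to readers; the sketch leaves both out for brevity.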

S302, the front-end server processes a metadata read request:

a client may connect to the client communication address and port of any one of the front-end servers and send a metadata read request. On receiving a read request from a client, the front-end server processes it as follows:

the read request is handled by whichever front-end server receives it, and does not need to be forwarded to the main front-end server;

through peer communication, the front-end server that received the read request determines whether all of the metadata to be read is stored in the memory of the front-end metadata server cluster; if all of it is stored there, the front-end server reads the metadata directly from the front-end metadata server cluster and returns it to the client;

if none of the metadata to be read is stored in the memory of the front-end metadata server cluster, the front-end server that received the read request sends the request to the non-front-end metadata server cluster; the non-front-end metadata server cluster retrieves the required metadata from hard disk and returns it to the receiving front-end server, which stores the metadata in its memory and returns it to the client;

if only part of the metadata to be read is stored in the front-end metadata server cluster, the receiving front-end server first sends a read request to the front-end metadata server cluster and then sends a read request to the non-front-end metadata server cluster; the front-end servers return the part of the metadata they hold to the receiving front-end server, and the non-front-end metadata server cluster retrieves the remaining part from hard disk and returns it to the receiving front-end server; finally, the receiving front-end server aggregates the two parts and returns the aggregated metadata to the client;
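The three read cases (all in memory, none in memory, partly in memory) can be expressed in one function, shown below as a hedged Python sketch. The function name and parameters are hypothetical; `load_from_disk` stands in for a request to the non-front-end metadata server cluster, and plain dictionaries stand in for server memory.

```python
def read_metadata(paths, receiver_memory, peer_memories, load_from_disk):
    """Resolve `paths`, preferring front-end memory over the non-front-end cluster.

    receiver_memory: dict of the front-end server that received the request
    peer_memories:   list of dicts, one per other front-end server
    load_from_disk:  callable(list of paths) -> dict (assumed interface)
    """
    found = {}
    for path in paths:
        # Peer communication: look in the receiver first, then the other servers.
        for memory in (receiver_memory, *peer_memories):
            if path in memory:
                found[path] = memory[path]
                break
    missing = [p for p in paths if p not in found]
    if missing:
        from_disk = load_from_disk(missing)
        if not found:
            # The source describes caching only for the case where nothing
            # was in memory, so the sketch caches only in that case.
            receiver_memory.update(from_disk)
        found.update(from_disk)
    # Aggregate both parts in the order the client asked for them.
    return {p: found[p] for p in paths if p in found}
```

Because any front-end server can answer a read, read traffic is spread across the front-end cluster rather than concentrated on the main server.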

S4, synchronizing the latest data from the front-end metadata server cluster to the non-front-end metadata server cluster:

the front-end metadata server cluster holds the latest metadata of the current calendar day, and this metadata needs to be synchronized to the non-front-end metadata server cluster every calendar day. On the one hand, this writes the latest metadata to the disks of the non-front-end metadata server cluster, so that the metadata is stored more safely; on the other hand, it frees memory resources for processing the metadata write requests of the next calendar day. Data synchronization is processed as follows:

an independent analysis thread is provided in the main front-end server. The thread is started at a specific time every calendar day (for example, 24:00) and analyzes the journal log recorded by the main front-end server in reverse chronological order, so that only the latest metadata log entry is retained for each directory and file, thereby compressing the journal log;
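The compression step can be sketched in a few lines of Python. The sketch assumes the journal is an append-only list in chronological order and that each entry is a dictionary with a `path` key; both are illustrative assumptions rather than details given in the source.

```python
def compress_journal(journal):
    """Keep only the newest entry per path from an append-ordered journal."""
    latest = {}
    # Walk newest-first so the first entry seen for a path is the one to keep.
    for entry in reversed(journal):
        latest.setdefault(entry["path"], entry)
    # Restore chronological order so replay keeps the relative order of paths.
    return list(reversed(list(latest.values())))
```

For a file modified many times in one day, only a single write then needs to be replayed against the non-front-end cluster.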

after the analysis is finished, the main front-end server starts a new synchronization thread and, following the compressed journal log, initiates write requests to the non-front-end metadata server cluster in sequence, so that the non-front-end metadata server cluster writes the latest metadata to hard disk;

each time the main front-end server completes the synchronization of one item of metadata, that item is deleted from the memory of every server in the front-end metadata server cluster so as to reclaim memory, and this continues until all the metadata has been deleted;
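A minimal sketch of the replay-then-evict loop follows. The function name and the `write_to_disk` callable (standing in for a write request to the non-front-end cluster) are assumptions, as is the choice to stop at the first failed write.

```python
def sync_and_evict(compressed_journal, front_end_memories, write_to_disk):
    """Replay compressed journal entries, freeing memory after each success.

    write_to_disk: callable(path, content) -> bool (assumed interface)
    Returns the number of entries synchronized.
    """
    synced = 0
    for entry in compressed_journal:
        if not write_to_disk(entry["path"], entry["content"]):
            break   # assumption: stop here; unsynced metadata stays in memory
        # Delete the item from every front-end server only after it is on disk.
        for memory in front_end_memories:
            memory.pop(entry["path"], None)
        synced += 1
    return synced
```

To mirror the separate synchronization thread described above, the function could be run through `threading.Thread(target=sync_and_evict, args=(...))`. Evicting each item only after its write succeeds means the item always exists either in memory or on disk.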

s5, repartitioning the name space of the metadata server cluster:

when the preposed metadata server cluster processes the read-write request, the access load of each directory and each file is recorded:

since the front metadata server cluster processes all metadata read-write requests, it records the access load of each directory and file in the memory of the front server executing the read and write requests (including modification, creation and deletion requests). The load factor calculation formula is as follows:

One read request counts as: 1

One modification request counts as: 2

One creation request counts as: 3

One deletion request counts as: 2

The load factor of each file is equal to: number of read requests × 1 + number of write requests × 2 + number of creation requests × 3 + number of deletion requests × 2

The load factor of each directory is equal to: number of read requests on the directory × 1 + number of write requests on the directory × 2 + number of creation requests on the directory × 3 + number of deletion requests on the directory × 2, plus the sum of the load factors of all directories and files in the directory.
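The load factor formulas can be checked with a short recursive Python sketch. The `Node` type and its fields are hypothetical; the weights come from the text, with "write" treated as the same request type as "modification", as the source uses the two terms interchangeably.

```python
from dataclasses import dataclass, field

# Weights from the text: read 1, write (modification) 2, create 3, delete 2.
WEIGHTS = {"read": 1, "write": 2, "create": 3, "delete": 2}


@dataclass
class Node:
    name: str
    counts: dict = field(default_factory=dict)     # request type -> count
    children: list = field(default_factory=list)   # empty for files


def own_load(node):
    """Weighted request count for the node itself."""
    return sum(WEIGHTS[kind] * n for kind, n in node.counts.items())


def load_factor(node):
    """Own load plus the load factors of everything beneath the node."""
    return own_load(node) + sum(load_factor(child) for child in node.children)
```

For example, a file with 10 reads, 2 writes and 1 creation has a load factor of 10 × 1 + 2 × 2 + 1 × 3 = 17, and a directory's factor adds its own weighted requests to the factors of its children.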

After the step S4 is completed, the main front-end server calculates the load factor of each directory and file according to the above access load factor calculation formula, and averages the load factor with the last calculated load factor stored in the main front-end server. And identifying the directories and files with the load factors exceeding a preset load threshold according to the average number of the load factors, and re-dividing the metadata of the directories and files with the load factors exceeding the load threshold by the front-end server, wherein the division mode is as follows:

1. if a plurality of subdirectories exist under a certain directory and the load factors of the subdirectories are approximate, splitting and storing the subdirectories in different metadata server cluster servers;

2. if a plurality of subdirectories exist under a certain directory and the load factors of the subdirectories are greatly different (for example, the difference is more than 10 times), executing the step 1 on the subdirectories with high load factors;

3. if only one subdirectory exists under a certain directory, splitting and storing each subdirectory in different metadata server cluster servers;

4. if a plurality of files exist in a directory and the load factors of the files are similar, splitting and storing the files in different metadata server cluster servers;

5. if only one file exists in a certain directory, the main front-end server stores the metadata of the file in the memory of the front-end metadata server cluster and sets a maximum time-to-live for the metadata, to prevent it from permanently occupying the memory resources of the front-end metadata server cluster.
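The averaging and threshold step, together with rules 1 and 2, can be sketched as below. Several points are assumptions made for illustration: a path with no previous value is averaged with itself, load factors count as "similar" when the highest is at most `ratio` times the lowest (following the "more than 10 times" example), and the high-load subdirectories under rule 2 are those within `ratio` of the highest.

```python
def hot_entries(current, previous, threshold):
    """Average each load factor with its previous value; keep those above threshold."""
    # Assumption: a path with no previous value is averaged with itself.
    averaged = {p: (lf + previous.get(p, lf)) / 2 for p, lf in current.items()}
    return {p: avg for p, avg in averaged.items() if avg > threshold}


def subdirs_to_split(subdir_loads, ratio=10):
    """Rules 1 and 2: choose which subdirectories to spread across servers."""
    if len(subdir_loads) < 2:
        return []   # rules 1 and 2 apply only when several subdirectories exist
    highest = max(subdir_loads.values())
    lowest = min(subdir_loads.values())
    if lowest * ratio >= highest:
        # Rule 1: load factors are similar, so split all subdirectories.
        return sorted(subdir_loads)
    # Rule 2: loads differ greatly, so split only the high-load subdirectories.
    return sorted(d for d, lf in subdir_loads.items() if lf * ratio >= highest)
```

Averaging with the previous value damps one-day spikes, so a directory is repartitioned only when its load stays high across consecutive calculations.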

The invention can provide high-speed access to the metadata, reduce the load on the distributed system and realize better load balance.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims

1. A dynamic management method for metadata of a distributed file system is characterized by comprising the following steps:

S2, starting a front-end server;

and S3, processing the metadata read-write request of the client.

2. The method for dynamically managing metadata according to claim 1, wherein step S2 includes:

S201, preprocessing: setting a configuration file for the front-end server;

3. The dynamic metadata management method according to claim 1 or 2, further comprising step S203: the front-end server elects a main front-end server.

4. The dynamic metadata management method according to claim 3, wherein the step S3 of processing the read-write metadata request of the client includes:

5. The dynamic metadata management method according to claim 4, wherein after receiving the write request, the main front-end server writes the metadata in the write request into its own memory, and then writes the metadata into the memories of the other front-end servers until the number of front-end servers successfully written is greater than half of the total number of front-end servers; the main front-end server then returns a result of successful writing to the client; after the write request is completed, the preposed metadata server cluster records a log, and the content of the log comprises the directory, the file path, the modification content and the modification time of the metadata.

6. The dynamic metadata management method according to claim 3, wherein the step S3 of processing the read-write metadata request of the client includes:

7. The dynamic metadata management method according to claim 3, further comprising: S4, the preposed metadata server cluster synchronizes the latest data to the non-preposed metadata server cluster: the main front-end server analyzes the log recorded by the front-end servers at a preset time each calendar day, and retains only the latest metadata log for each directory and file, thereby compressing the log;

8. The dynamic management method of metadata according to claim 3, characterized in that:

when the preposed metadata server cluster processes read-write requests, the access load of each directory and each file is recorded; the access load is counted as follows: each read request is counted as 1, each modification request is counted as 2, each creation request is counted as 3, and each deletion request is counted as 2; the load factor of each file is equal to: number of read requests × 1 + number of write requests × 2 + number of creation requests × 3 + number of deletion requests × 2; the load factor of each directory is equal to: number of read requests × 1 + number of write requests × 2 + number of creation requests × 3 + number of deletion requests × 2, plus the sum of the load factors of all subdirectories and files under the directory.

9. The dynamic metadata management method according to claim 8, further comprising step S5, re-dividing the namespace of the metadata server cluster: the main front-end server calculates the load factor of each directory and each file, and averages it with the previously calculated load factor stored in the main front-end server; the directories and files whose averaged load factor exceeds a preset load threshold are identified, and the front-end server re-divides the metadata of those directories and files.

10. A distributed file system comprises a metadata server cluster, and is characterized in that the metadata server cluster consists of a plurality of metadata servers, and comprises a preposed metadata server cluster and a non-preposed metadata server cluster; all metadata read-write requests initiated by the client are uniformly processed by the preposed metadata server cluster, and metadata in the write requests are only stored in a memory of the preposed metadata server cluster, so that the system executes the metadata dynamic management method as claimed in any one of claims 1 to 9.